YIELD-RELATED POLYNUCLEOTIDES AND POLYPEPTIDES IN PLANTS

This application claims the benefit of US Provisional Application No. 
60/310,847, filed August 9, 2001, US Provisional Application No. 60/336,049, filed 
December 5, 2001, US Provisional Application No. 60/338,692, filed December 11,
2001, and US Non-provisional Application No. 10/171,468, filed June 14, 2002, the 
entire contents of which are hereby incorporated by reference. 



FIELD OF THE INVENTION 

This invention relates to the field of plant biology. More particularly, the 
present invention pertains to compositions and methods for phenotypically modifying 
a plant. 

INTRODUCTION 

A plant's traits, such as its biochemical, developmental, or phenotypic 
characteristics, may be controlled through a number of cellular processes. One 
important way to manipulate that control is through transcription factors - proteins 
that influence the expression of a particular gene or sets of genes. Transformed and 
transgenic plants that comprise cells having altered levels of at least one selected 
transcription factor, for example, possess advantageous or desirable traits. Strategies 
for manipulating traits by altering a plant cell's transcription factor content can 
therefore result in plants and crops with commercially valuable properties. Applicants 
have identified polynucleotides encoding transcription factors, developed numerous 
transgenic plants using these polynucleotides, and have analyzed the plants for a 
variety of important traits. In so doing, applicants have identified important 
polynucleotide and polypeptide sequences for producing commercially valuable 
plants and crops as well as the methods for making them and using them. Other 
aspects and embodiments of the invention are described below and can be derived 
from the teachings of this disclosure as a whole. 

BACKGROUND OF THE INVENTION 

Transcription factors (TFs) can modulate gene expression, either increasing or 
decreasing (inducing or repressing) the rate of transcription. This modulation results 
in differential levels of gene expression at various developmental stages, in different 
tissues and cell types, and in response to different exogenous (e.g., environmental)
and endogenous stimuli throughout the life cycle of the organism. 

Because transcription factors are key controlling elements of biological 
pathways, altering the expression levels of one or more transcription factors can 
change entire biological pathways in an organism. For example, manipulation of the 
levels of selected transcription factors may result in increased expression of 
economically useful proteins or metabolic chemicals in plants, or may improve other
agriculturally relevant characteristics. Conversely, blocked or reduced expression of a 
transcription factor may reduce biosynthesis of unwanted compounds or remove an 
undesirable trait. Therefore, manipulating transcription factor levels in a plant offers 
tremendous potential in agricultural biotechnology for modifying a plant's traits. 

The present invention provides novel transcription factors useful for
modifying a plant's phenotype in desirable ways. 

SUMMARY OF THE INVENTION 

In a first aspect, the invention relates to a recombinant polynucleotide 
comprising a nucleotide sequence selected from the group consisting of: (a) a 
nucleotide sequence encoding a polypeptide comprising a polypeptide sequence 
selected from those of the Sequence Listing, SEQ ID NOs:2 to 2N, where N = 2-561, 
or those listed in Table 4, or a complementary nucleotide sequence thereof; (b) a 
nucleotide sequence encoding a polypeptide comprising a variant of a polypeptide of 
(a) having one or more, or between 1 and about 5, or between 1 and about 10, or 
between 1 and about 30, conservative amino acid substitutions; (c) a nucleotide 
sequence comprising a sequence selected from those of SEQ ID NOs:1 to (2N - 1),
where N = 2-561, or those included in Table 4, or a complementary nucleotide 
sequence thereof; (d) a nucleotide sequence comprising silent substitutions in a 
nucleotide sequence of (c); (e) a nucleotide sequence which hybridizes under stringent 
conditions over substantially the entire length of a nucleotide sequence of one or more 
of: (a), (b), (c), or (d); (f) a nucleotide sequence comprising at least 10 or 15, or at 
least about 20, or at least about 30 consecutive nucleotides of a sequence of any of 
(a)-(e), or at least 10 or 15, or at least about 20, or at least about 30 consecutive 
nucleotides outside of a region encoding a conserved domain of any of (a)-(e); (g) a 
nucleotide sequence comprising a subsequence or fragment of any of (a)-(f), which
subsequence or fragment encodes a polypeptide having a biological activity that
modifies a plant's characteristic, functions as a transcription factor, or alters the level
of transcription of a gene or transgene in a cell; (h) a nucleotide sequence having at 
least 31% sequence identity to a nucleotide sequence of any of (a)-(g); (i) a 
nucleotide sequence having at least 60%, or at least 70%, or at least 80%, or at least
90%, or at least 95% sequence identity to a nucleotide sequence of any of (a)-(g) or a
10 or 15 nucleotide, or at least about 20, or at least about 30 nucleotide region of a
sequence of (a)-(g) that is outside of a region encoding a conserved domain; (j) a
nucleotide sequence that encodes a polypeptide having at least 31% sequence identity
to a polypeptide listed in Table 4, or the Sequence Listing; (k) a nucleotide sequence
which encodes a polypeptide having at least 60%, or at least 70%, or at least 80%, or
at least 90%, or at least 95% sequence identity to a polypeptide listed in Table 4, or
the Sequence Listing; and (l) a nucleotide sequence that encodes a conserved domain
of a polypeptide having at least 85%, or at least 90%, or at least 95%, or at least 98% 
sequence identity to a conserved domain of a polypeptide listed in Table 4, or the 
Sequence Listing. The recombinant polynucleotide may further comprise a 
constitutive, inducible, or tissue-specific promoter operably linked to the nucleotide 
sequence. The invention also relates to compositions comprising at least two of the 
above-described polynucleotides. 
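
The SEQ ID NO numbering used in items (a) and (c) above can be sketched as follows. This is a minimal illustration only: the function names are hypothetical, and the pairing of each odd-numbered polynucleotide SEQ ID NO (2N - 1) with the even-numbered polypeptide SEQ ID NO (2N) that follows it is an assumption that this passage does not state explicitly.

```python
# Sketch of the SEQ ID NO numbering in items (a) and (c).
# Assumption (not stated in the text): polynucleotide 2N - 1 encodes polypeptide 2N.

MAX_N = 561  # upper bound of N given in the text


def polypeptide_seq_ids(max_n=MAX_N):
    """Even-numbered SEQ ID NOs 2, 4, ..., 2 * max_n."""
    return list(range(2, 2 * max_n + 1, 2))


def polynucleotide_seq_ids(max_n=MAX_N):
    """Odd-numbered SEQ ID NOs 1, 3, ..., 2 * max_n - 1."""
    return list(range(1, 2 * max_n, 2))


def assumed_polypeptide_for(polynucleotide_id):
    """Return the polypeptide SEQ ID NO assumed to pair with an odd polynucleotide ID."""
    if polynucleotide_id < 1 or polynucleotide_id % 2 == 0:
        raise ValueError("polynucleotide SEQ ID NOs are assumed to be odd and positive")
    return polynucleotide_id + 1
```

Under these assumptions the two lists have the same length, one entry per value of N.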

In a second aspect, the invention comprises an isolated or recombinant 
polypeptide comprising a subsequence of at least about 10, or at least about 15, or at 
least about 20, or at least about 30 contiguous amino acids encoded by the 
recombinant or isolated polynucleotide described above, or comprising a subsequence 
of at least about 8, or at least about 12, or at least about 15, or at least about 20, or at 
least about 30 contiguous amino acids outside a conserved domain. 

In a third aspect, the invention comprises an isolated or recombinant 
polynucleotide that encodes a polypeptide that is a paralog of the isolated polypeptide 
described above. In one aspect, the invention is a paralog which, when expressed in
Arabidopsis, modifies a trait of the Arabidopsis plant.

In a fourth aspect, the invention comprises an isolated or recombinant
polynucleotide that encodes a polypeptide that is an ortholog of the isolated 
polypeptide described above. In one aspect, the invention is an ortholog which, when 
expressed in Arabidopsis, modifies a trait of the Arabidopsis plant. 

In a fifth aspect, the invention comprises an isolated polypeptide that is a 
paralog of the isolated polypeptide described above. In one aspect, the invention is an 
paralog which, when expressed in Arabidopsis, modifies a trait of the Arabidopsis 
plant. 

In a sixth aspect, the invention comprises an isolated polypeptide that is an 
ortholog of the isolated polypeptide described above. In one aspect, the invention is 
an ortholog which, when expressed in Arabidopsis, modifies a trait of the
Arabidopsis plant. 


The present invention also encompasses transcription factor variants. A 
preferred transcription factor variant is one having at least 40% amino acid sequence 
identity, a more preferred transcription factor variant is one having at least 50% amino 
acid sequence identity and a most preferred transcription factor variant is one having 
at least 65% amino acid sequence identity to the transcription factor amino acid 
sequence SEQ ID NOs:2 to 2N, where N = 2-561, and which contains at least one 
functional or structural characteristic of the transcription factor amino acid sequence. 
Sequences having lesser degrees of identity but comparable biological activity are 
considered to be equivalents. 
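
The three identity thresholds in the preceding paragraph can be illustrated with a short sketch. The function name and the returned labels are hypothetical. The sketch covers only the percent-identity criterion; the text additionally requires at least one functional or structural characteristic of the transcription factor, and treats lower-identity sequences with comparable biological activity as equivalents, neither of which is modeled here.

```python
# Sketch: map an amino acid percent identity onto the tiers named in the text.
# Only the identity criterion is modeled; functional criteria are out of scope.

def variant_preference(percent_identity):
    """Classify a variant by the 40% / 50% / 65% identity thresholds."""
    if not 0.0 <= percent_identity <= 100.0:
        raise ValueError("percent identity must lie between 0 and 100")
    if percent_identity >= 65.0:
        return "most preferred"
    if percent_identity >= 50.0:
        return "more preferred"
    if percent_identity >= 40.0:
        return "preferred"
    return "below stated thresholds"
```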

In another aspect, the invention is a transgenic plant comprising one or more 
of the above-described isolated or recombinant polynucleotides. In yet another 
aspect, the invention is a plant with altered expression levels of a polynucleotide 
described above or a plant with altered expression or activity levels of an above- 
described polypeptide. Further, the invention is a plant lacking a nucleotide sequence 
encoding a polypeptide described above or substantially lacking a polypeptide 
described above. The plant may be any plant, including, but not limited to, 
Arabidopsis, mustard, soybean, wheat, corn, potato, cotton, rice, oilseed rape, 
sunflower, alfalfa, sugarcane, turf, banana, blackberry, blueberry, strawberry, 
raspberry, cantaloupe, carrot, cauliflower, coffee, cucumber, eggplant, grapes,
honeydew, lettuce, mango, melon, onion, papaya, peas, peppers, pineapple, pumpkin,
spinach, squash, sweet corn, tobacco, tomato, watermelon, rosaceous fruits, vegetable
brassicas, and mint or other labiates. In yet another aspect, the invention is an
isolated plant material of a plant, including, but not limited to, plant tissue, fruit, seed, 
plant cell, embryo, protoplast, pollen, and the like. In yet another aspect, the 
invention is a transgenic plant tissue culture of regenerable cells, including, but not 
limited to, embryos, meristematic cells, microspores, protoplast, pollen, and the like. 



In yet another aspect the invention is a transgenic plant comprising one or
more of the above described polynucleotides wherein the encoded polypeptide is
expressed and regulates transcription of a gene.



In a further aspect the invention provides a method of using the polynucleotide 
composition to breed a progeny plant from a transgenic plant including crossing 
plants, producing seeds from transgenic plants, and methods of breeding using 
transgenic plants, the method comprising transforming a plant with the polynucleotide
composition to create a transgenic plant, crossing the transgenic plant with another 
plant, selecting seed, and growing the progeny plant from the seed. 

In a further aspect, the invention provides a progeny plant derived from a 
parental plant wherein said progeny plant exhibits at least three fold greater 
messenger RNA levels than said parental plant, wherein the messenger RNA encodes 
a DNA-binding protein which is capable of binding to a DNA regulatory sequence 
and inducing expression of a plant trait gene, wherein the progeny plant is 
characterized by a change in the plant trait compared to said parental plant. In yet a 
further aspect, the progeny plant exhibits at least ten fold greater messenger RNA 
levels compared to said parental plant. In yet a further aspect, the progeny plant 
exhibits at least fifty fold greater messenger RNA levels compared to said parental 
plant. 
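
The fold-change comparison described above can be sketched as a simple ratio. This is an illustration under stated assumptions: the function names are hypothetical, and the inputs are taken to be messenger RNA levels for the same transcript measured in the same arbitrary units in the progeny and parental plants.

```python
# Sketch: compare progeny and parental mRNA levels against the 3x, 10x and 50x tiers.
# Inputs are assumed to be in the same (arbitrary) units for both plants.

FOLD_TIERS = (3, 10, 50)  # thresholds named in the text


def mrna_fold_change(progeny_level, parental_level):
    """Ratio of progeny to parental messenger RNA level."""
    if parental_level <= 0:
        raise ValueError("parental level must be positive")
    return progeny_level / parental_level


def fold_tiers_met(fold_change):
    """Return the thresholds that the given fold change meets or exceeds."""
    return [tier for tier in FOLD_TIERS if fold_change >= tier]
```

For example, a progeny level of 120 against a parental level of 10 is a 12-fold increase, which meets the three-fold and ten-fold tiers but not the fifty-fold tier.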



In a further aspect, the invention relates to a cloning or expression vector 
comprising the isolated or recombinant polynucleotide described above or cells 
comprising the cloning or expression vector.

In yet a further aspect, the invention relates to a composition produced by
incubating a polynucleotide of the invention with a nuclease, a restriction enzyme, a 
polymerase; a polymerase and a primer; a cloning vector, or with a cell. 

Furthermore, the invention relates to a method for producing a plant having a 
modified trait. The method comprises altering the expression of an isolated or 
recombinant polynucleotide of the invention or altering the expression or activity of a 
polypeptide of the invention in a plant to produce a modified plant, and selecting the 
modified plant for a modified trait. In one aspect, the plant is a monocot plant. In 
another aspect, the plant is a dicot plant. In another aspect the recombinant 
polynucleotide is from a dicot plant and the plant is a monocot plant. In yet another 
aspect the recombinant polynucleotide is from a monocot plant and the plant is a dicot 
plant. In yet another aspect the recombinant polynucleotide is from a monocot plant 
and the plant is a monocot plant. In yet another aspect the recombinant 
polynucleotide is from a dicot plant and the plant is a dicot plant. 

In another aspect, the invention is a transgenic plant comprising an isolated or 
recombinant polynucleotide encoding a polypeptide wherein the polypeptide is 
selected from the group consisting of SEQ ID NOs: 2 - 2N, where N = 2-561. In yet
another aspect, the invention is a plant with altered expression levels of a polypeptide 
described above or a plant with altered expression or activity levels of an above- 
described polypeptide. Further, the invention is a plant lacking a polynucleotide 
sequence encoding a polypeptide described above or substantially lacking a 
polypeptide described above. The plant may be any plant, including, but not limited 
to, Arabidopsis, mustard, soybean, wheat, corn, potato, cotton, rice, oilseed rape, 
sunflower, alfalfa, sugarcane, turf, banana, blackberry, blueberry, strawberry, 
raspberry, cantaloupe, carrot, cauliflower, coffee, cucumber, eggplant, grapes, 
honeydew, lettuce, mango, melon, onion, papaya, peas, peppers, pineapple, pumpkin, 
spinach, squash, sweet corn, tobacco, tomato, watermelon, rosaceous fruits, vegetable
brassicas, and mint or other labiates. In yet another aspect, the invention is an
isolated plant material of a plant, including, but not limited to, plant tissue, fruit, seed, 
plant cell, embryo, protoplast, pollen, and the like. In yet another aspect, the 
invention is a transgenic plant tissue culture of regenerable cells, including, but not
limited to, embryos, meristematic cells, microspores, protoplast, pollen, and the like. 

In another aspect, the invention relates to a method of identifying a factor that 
is modulated by or interacts with a polypeptide encoded by a polynucleotide of the 
invention. The method comprises expressing a polypeptide encoded by the 
polynucleotide in a plant; and identifying at least one factor that is modulated by or 
interacts with the polypeptide. In one embodiment the method for identifying 
modulating or interacting factors is by detecting binding by the polypeptide to a 
promoter sequence, or by detecting interactions between an additional protein and the 
polypeptide in a yeast two hybrid system, or by detecting expression of a factor by 
hybridization to a microarray, subtractive hybridization, or differential display.

In yet another aspect, the invention is a method of identifying a molecule that 
modulates activity or expression of a polynucleotide or polypeptide of interest. The 
method comprises placing the molecule in contact with a plant comprising the 
polynucleotide or polypeptide encoded by the polynucleotide of the invention and 
monitoring one or more of the expression level of the polynucleotide in the plant, the 
expression level of the polypeptide in the plant, and modulation of an activity of the 
polypeptide in the plant. 

In yet another aspect, the invention relates to an integrated system, computer 
or computer readable medium comprising one or more character strings 
corresponding to a polynucleotide of the invention, or to a polypeptide encoded by the 
polynucleotide. The integrated system, computer or computer readable medium may 
comprise a link between one or more sequence strings to a modified plant trait. 
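
One way to picture character strings linked to modified plant traits is a small in-memory store. The sketch below is purely illustrative: the class name, method names, and the example identifiers used with it are hypothetical and do not describe any particular system referred to in this disclosure.

```python
# Sketch: store sequence character strings and link them to trait labels.
# All names here are hypothetical and chosen only for illustration.
from collections import defaultdict


class SequenceTraitStore:
    def __init__(self):
        self._sequences = {}              # name -> sequence character string
        self._traits = defaultdict(set)   # name -> set of linked trait labels

    def add_sequence(self, name, residues):
        """Register a polynucleotide or polypeptide character string."""
        self._sequences[name] = residues.upper()

    def link_trait(self, name, trait):
        """Link a registered sequence to a modified plant trait."""
        if name not in self._sequences:
            raise KeyError(name)
        self._traits[name].add(trait)

    def traits_for(self, name):
        """Traits linked to one sequence, sorted for stable output."""
        return sorted(self._traits.get(name, ()))

    def sequences_with(self, trait):
        """Names of all sequences linked to the given trait."""
        return sorted(n for n, linked in self._traits.items() if trait in linked)
```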

In yet another aspect, the invention is a method for identifying a sequence 
similar or homologous to one or more polynucleotides of the invention, or one or 
more polypeptides encoded by the polynucleotides. The method comprises providing 
a sequence database, and querying the sequence database with one or more target 
sequences corresponding to the one or more polynucleotides or to the one or more 
polypeptides to identify one or more sequence members of the database that display 
sequence similarity or homology to one or more of the one or more target sequences. 






WO 03/013227 



PCT/US02/25805 



The method may further comprise linking one or more of the
polynucleotides of the invention, or encoded polypeptides, to a modified plant 
phenotype. 
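
The database-querying method described above can be sketched with a toy similarity search. This is a hedged illustration, not the method used by applicants: shared k-mer (Jaccard) similarity stands in for a real alignment tool, the function names and the default threshold are hypothetical, and the sequences used with it would be placeholders.

```python
# Sketch: query an in-memory sequence database with one or more target sequences.
# k-mer Jaccard similarity is a simple stand-in for an alignment-based search.

def kmers(sequence, k=3):
    """Set of all overlapping substrings of length k."""
    sequence = sequence.upper()
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def kmer_similarity(a, b, k=3):
    """Jaccard similarity of the k-mer sets of two sequences (0.0 to 1.0)."""
    ka, kb = kmers(a, k), kmers(b, k)
    if not ka or not kb:
        return 0.0
    return len(ka & kb) / len(ka | kb)


def query_database(database, targets, min_similarity=0.5, k=3):
    """Return (entry name, score) pairs scoring at or above the threshold."""
    hits = []
    for target in targets:
        for name, sequence in database.items():
            score = kmer_similarity(target, sequence, k)
            if score >= min_similarity:
                hits.append((name, score))
    # Highest-scoring database members first.
    return sorted(hits, key=lambda hit: hit[1], reverse=True)
```

A hit list produced this way could then be joined to trait annotations to link sequences to a modified plant phenotype, as the final sentence above contemplates.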

BRIEF DESCRIPTION OF THE SEQUENCE LISTING, 
TABLES, AND FIGURE 

The Sequence Listing provides exemplary polynucleotide and polypeptide 
sequences of the invention. The traits associated with the use of the sequences are 
included in the Examples. 

Diskette 1 is a read-only memory computer-readable diskette and contains a
copy of the Sequence Listing in ASCII text format. The Sequence Listing is named 
"SEQLIST5 1 444200204 1 " and is 929 kilobytes in size. The copy of the Sequence 
Listing on the diskette is hereby incorporated by reference in its entirety. 

Table 4 shows the polynucleotides and polypeptides identified by SEQ ID 
NO; Mendel Gene ID No.; conserved domain of the polypeptide; and if the 
polynucleotide was tested in a transgenic assay. The first column shows the 
polynucleotide SEQ ID NO; the second column shows the Mendel Gene ID No., GID; 
the third column shows the trait(s) resulting from the knock out or overexpression of 
the polynucleotide in the transgenic plant; the fourth column shows the category of 
the trait; the fifth column shows the transcription factor family to which the 
polynucleotide belongs; the sixth column ("Comment"), includes specific effects and 
utilities conferred by the polynucleotide of the first column; the seventh column 
shows the SEQ ID NO of the polypeptide encoded by the polynucleotide; and the 
eighth column shows the amino acid residue positions of the conserved domain in 
amino acid (AA) co-ordinates. 
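
The eight-column layout of Table 4 described above can be pictured as a record type. This is a sketch under assumptions: the class and field names are hypothetical, and the conserved-domain coordinates are assumed to be written as a single "start-end" range, which the description above does not specify.

```python
# Sketch: one row of Table 4 as a record, following the eight columns described.
# Field names are hypothetical; "start-end" coordinate format is an assumption.
from dataclasses import dataclass


@dataclass(frozen=True)
class Table4Row:
    polynucleotide_seq_id: int   # column 1: polynucleotide SEQ ID NO
    gid: str                     # column 2: Mendel Gene ID No. (GID)
    traits: str                  # column 3: trait(s) from knock out or overexpression
    trait_category: str          # column 4: category of the trait
    tf_family: str               # column 5: transcription factor family
    comment: str                 # column 6: specific effects and utilities
    polypeptide_seq_id: int      # column 7: SEQ ID NO of the encoded polypeptide
    conserved_domain_aa: str     # column 8: conserved domain in AA coordinates


def domain_bounds(row):
    """Parse an assumed 'start-end' coordinate string into two integers."""
    start, end = row.conserved_domain_aa.split("-")
    return int(start), int(end)


def domain_length(row):
    """Number of residues in the conserved domain, inclusive of both ends."""
    start, end = domain_bounds(row)
    return end - start + 1
```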

Table 5 lists a summary of orthologous and homologous sequences identified 
using BLAST (tblastx program). The first column shows the polynucleotide sequence 
identifier (SEQ ID NO), the second column shows the corresponding cDNA identifier 
(Gene ID), the third column shows the orthologous or homologous polynucleotide 
GenBank Accession Number (Test Sequence ID), the fourth column shows the 
calculated probability value that the sequence identity is due to chance (Smallest Sum
Probability), the fifth column shows the plant species from which the test sequence 
was isolated (Test Sequence Species), and the sixth column shows the orthologous or 
homologous test sequence GenBank annotation (Test Sequence GenBank 
Annotation). 
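
A reader working with Table 5 might want to keep only the hits least likely to be due to chance. The sketch below shows one way to do that; the record and function names are hypothetical, and the cutoff value is an arbitrary illustration rather than a threshold taken from this disclosure.

```python
# Sketch: one row of Table 5 as a record, plus a filter on Smallest Sum Probability.
# The default cutoff is an arbitrary example value, not one given in the text.
from dataclasses import dataclass


@dataclass(frozen=True)
class Table5Row:
    seq_id: int                       # column 1: polynucleotide SEQ ID NO
    gene_id: str                      # column 2: cDNA identifier (Gene ID)
    test_sequence_id: str             # column 3: GenBank Accession Number
    smallest_sum_probability: float   # column 4: probability identity is due to chance
    test_sequence_species: str        # column 5: species of the test sequence
    genbank_annotation: str           # column 6: GenBank annotation


def confident_hits(rows, max_probability=1e-10):
    """Keep rows at or below the cutoff, most significant first."""
    kept = [r for r in rows if r.smallest_sum_probability <= max_probability]
    return sorted(kept, key=lambda r: r.smallest_sum_probability)
```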



Figure 1 shows a phylogenic tree of related plant families adapted from Daly 
et al. (2001 Plant Physiology 127:1328-1333). 

Detailed Description of Exemplary Embodiments 

In an important aspect, the present invention relates to polynucleotides and 
polypeptides, e.g. for modifying phenotypes of plants. Throughout this disclosure, 
various information sources are referred to and/or are specifically incorporated. The 
information sources include scientific journal articles, patent documents, textbooks, 
and World Wide Web browser-inactive page addresses, for example. While the 
reference to these information sources clearly indicates that they can be used by one 
of skill in the art, applicants specifically incorporate each and every one of the 
information sources cited herein, in their entirety, whether or not a specific mention of 
"incorporation by reference" is noted. The contents and teachings of each and every 
one of the information sources can be relied on and used to make and use 
embodiments of the invention. 



It must be noted that as used herein and in the appended claims, the singular 
forms "a," "an," and "the" include plural reference unless the context clearly dictates 
otherwise. Thus, for example, a reference to "a plant" includes a plurality of such 
plants, and a reference to "a stress" is a reference to one or more stresses and 
equivalents thereof known to those skilled in the art, and so forth. 

The polynucleotide sequences of the invention encode polypeptides that are 
members of well-known transcription factor families, including plant transcription 
factor families, as disclosed in Table 4. Generally, the transcription factors encoded 
by the present sequences are involved in cell differentiation and proliferation and the 
regulation of growth. Accordingly, one skilled in the art would recognize that by 
expressing the present sequences in a plant, one may change the expression of 
autologous genes or induce the expression of introduced genes. By affecting the
expression of similar autologous sequences in a plant that have the biological activity 
of the present sequences, or by introducing the present sequences into a plant, one 
may alter a plant's phenotype to one with improved traits. The sequences of the 
invention may also be used to transform a plant and introduce desirable traits not 
found in the wild-type cultivar or strain. Plants may then be selected for those that 
produce the most desirable degree of over- or underexpression of target genes of 
interest and coincident trait improvement. 

The sequences of the present invention may be from any species, particularly 
plant species, in a naturally occurring form or from any source whether natural, 
synthetic, semi-synthetic or recombinant. The sequences of the invention may also 
include fragments of the present amino acid sequences. In this context, a "fragment" 
refers to a fragment of a polypeptide sequence which is at least 5 to about 15 amino 
acids in length, most preferably at least 14 amino acids, and which retains some
biological activity of a transcription factor. Where "amino acid sequence" is recited to 
refer to an amino acid sequence of a naturally occurring protein molecule, "amino 
acid sequence" and like terms are not meant to limit the amino acid sequence to the 
complete native amino acid sequence associated with the recited protein molecule. 

As one of ordinary skill in the art recognizes, transcription factors can be 
identified by the presence of a region or domain of structural similarity or identity to a 
specific consensus sequence or the presence of a specific consensus DNA-binding site 
or DNA-binding site motif (see, for example, Riechmann et al., (2000) Science 290: 
2105-2110). The plant transcription factors may belong to one of the following
transcription factor families: the AP2 (APETALA2) domain transcription factor 
family (Riechmann and Meyerowitz (1998) Biol Chem. 379:633-646); the MYB 
transcription factor family (Martin and Paz-Ares, (1997) Trends Genet 13:67-73); the 
MADS domain transcription factor family (Riechmann and Meyerowitz (1997) Biol 
Chem. 378:1079-1101); the WRKY protein family (Ishiguro and Nakamura (1994)
Mol Gen. Genet 244:563-571); the ankyrin-repeat protein family (Zhang et al. 
(1992) Plant Cell 4:1575-1588); the zinc finger protein (Z) family (Klug and Schwabe 
(1995) FASEB J. 9:597-604); the homeobox (HB) protein family (Buerglin in
Guidebook to the Homeobox Genes, Duboule (ed.) (1994) Oxford University Press);
the CAAT-element binding proteins (Forsburg and Guarente (1989) Genes Dev.
3:1166-1178); the squamosa promoter binding proteins (SPB) (Klein et al. (1996)
Mol. Gen. Genet. 250:7-16); the NAM protein family (Souer et al. (1996) Cell
85:159-170); the IAA/AUX proteins (Rouse et al. (1998) Science 279:1371-1373); 
the HLH/MYC protein family (Littlewood et al. (1994) Prot. Profile 1:639-709); the 
DNA-binding protein (DBP) family (Tucker et al. (1994) EMBO J. 13:2994-3002);
the bZIP family of transcription factors (Foster et al. (1994) FASEB J. 8:192-200); the 
Box P-binding protein (the BPF-1) family (da Costa e Silva et al. (1993) Plant J. 
4:125-135); the high mobility group (HMG) family (Bustin and Reeves (1996) Prog.
Nucl. Acids Res. Mol. Biol. 54:35-100); the scarecrow (SCR) family (Di Laurenzio et 
al. (1996) Cell 86:423-433); the GF14 family (Wu et al. (1997) Plant Physiol. 
114:1421-1431); the polycomb (PCOMB) family (Kennison (1995) Annu. Rev. Genet.
29:289-303); the teosinte branched (TEO) family (Luo et al. (1996) Nature
383:794-799); the ABI3 family (Giraudat et al. (1992) Plant Cell 4:1251-1261); the triple helix
(TH) family (Dehesh et al. (1990) Science 250:1397-1399); the EIL family (Chao et 
al. (1997) Cell 89:1133-44); the AT-HOOK family (Reeves and Nissen (1990) J. Biol.
Chem. 265:8573-8582); the S1FA family (Zhou et al. (1995) Nucleic Acids Res. 
23:1165-1169); the bZIPT2 family (Lu and Ferl (1995) Plant Physiol. 109:723); the
YABBY family (Bowman et al. (1999) Development 126:2387-96); the PAZ family 
(Bohmert et al. (1998) EMBO J. 17:170-80); a family of miscellaneous (MISC)
transcription factors including the DPBF family (Kim et al. (1997) Plant J. 11:1237- 
1251) and the SPF1 family (Ishiguro and Nakamura (1994) Mol. Gen. Genet.
244:563-571); the golden (GLD) family (Hall et al. (1998) Plant Cell 10:925-936), 
the TUBBY family (Boggin et al. (1999) Science 286:2119-2125), the heat shock
family (Wu C (1995) Annu Rev Cell Dev Biol 11:441-469), the ENBP family
(Christiansen et al (1996) Plant Mol Biol 32:809-821), the RING-zinc family (Jensen
et al. (1998) FEBS Letters 436:283-287), the PDBP family (Janik et al Virology.
(1989) 168:320-329), the PCF family (Cubas P, et al. Plant J. (1999) 18:215-22), the
SRS (SHI-related) family (Fridborg et al Plant Cell (1999) 11:1019-1032), the CPP
(cysteine-rich polycomb-like) family (Cvitanich et al Proc. Natl. Acad. Sci. USA. 
(2000) 97:8163-8168), the ARF (auxin response factor) family (Ulmasov, et al. 
(1999) Proc. Natl. Acad. Sci. USA 96: 5844-5849), the SWI/SNF family 
(Collingwood et al J. Mol. End. 23:255-275), the ACBF family (Seguin et al (1997) 
Plant Mol Biol. 35:281-291), PCGL (CG-1 like) family (da Costa e Silva et al. 
(1994) Plant Mol. Biol. 25:921-924), the ARID family (Vazquez et al. (1999)
Development 126: 733-42), the Jumonji family (Balciunas et al. (2000) Trends
Biochem. Sci. 25: 274-276), the bZIP-NIN family (Schauser et al (1999) Nature 402:
191-195), the E2F family (Kaelin et al (1992) Cell 70: 351-364) and the GRF-like
family (Knaap et al (2000) Plant Physiol 122: 695-704). As indicated by any part of 
the list above and as known in the art, transcription factors have been sometimes 
categorized by class, family, and sub-family according to their structural content and 
consensus DNA-binding site motif, for example. Many of the classes and many of the 
families and sub-families are listed here. However, the inclusion of one sub-family 
and not another, or the inclusion of one family and not another, does not mean that the 
invention does not encompass polynucleotides or polypeptides of a certain family or 
sub-family. The list provided here is merely an example of the types of transcription 
factors and the knowledge available concerning the consensus sequences and 
consensus DNA-binding site motifs that help define them as known to those of skill in 
the art (each of the references noted above is specifically incorporated herein by
reference). A transcription factor may include, but is not limited to, any polypeptide 
that can activate or repress transcription of a single gene or a number of genes. This 
polypeptide group includes, but is not limited to, DNA-binding proteins, DNA- 
binding protein binding proteins, protein kinases, protein phosphatases, GTP-binding 
proteins, and receptors, and the like. 

In addition to methods for modifying a plant phenotype by employing one or 
more polynucleotides and polypeptides of the invention described herein, the 
polynucleotides and polypeptides of the invention have a variety of additional uses. 
These uses include their use in the recombinant production (i.e., expression) of 
proteins; as regulators of plant gene expression, as diagnostic probes for the presence 
of complementary or partially complementary nucleic acids (including for detection 
of natural coding nucleic acids); as substrates for further reactions, e.g., mutation 
reactions, PCR reactions, or the like; as substrates for cloning e.g., including digestion 
or ligation reactions; and for identifying exogenous or endogenous modulators of the 
transcription factors.

A "polynucleotide" is a nucleic acid sequence comprising a
plurality of polymerized nucleotides, e.g., at least about 15 consecutive polymerized
nucleotides, optionally at least about 30 consecutive nucleotides, or at least about 50
consecutive nucleotides. In many instances, a polynucleotide comprises a nucleotide
sequence encoding a polypeptide (or protein) or a domain or fragment thereof.
Additionally, the polynucleotide may comprise a promoter, an intron, an enhancer 
region, a polyadenylation site, a translation initiation site, 5' or 3' untranslated 
regions, a reporter gene, a selectable marker, or the like. The polynucleotide can be 
single stranded or double stranded DNA or RNA. The polynucleotide optionally 
comprises modified bases or a modified backbone. The polynucleotide can be, e.g., 
genomic DNA or RNA, a transcript (such as an mRNA), a cDNA, a PCR product, a 
cloned DNA, a synthetic DNA or RNA, or the like. The polynucleotide can comprise 
a sequence in either sense or antisense orientations. 



A "recombinant polynucleotide" is a polynucleotide that is not in its native 
state, e.g., the polynucleotide comprises a nucleotide sequence not found in nature, or 
the polynucleotide is in a context other than that in which it is naturally found, e.g., 
separated from nucleotide sequences with which it typically is in proximity in nature, 
or adjacent (or contiguous with) nucleotide sequences with which it typically is not in 
proximity. For example, the sequence at issue can be cloned into a vector, or 
otherwise recombined with one or more additional nucleic acids.



An "isolated polynucleotide" is a polynucleotide whether naturally occurring 
or recombinant, that is present outside the cell in which it is typically found in nature, 
whether purified or not. Optionally, an isolated polynucleotide is subject to one or 
more enrichment or purification procedures, e.g., cell lysis, extraction, centrifugation, 
precipitation, or the like. 



A "polypeptide" is an amino acid sequence comprising a plurality of 
consecutive polymerized amino acid residues e.g., at least about 15 consecutive 
polymerized amino acid residues, optionally at least about 30 consecutive 
polymerized amino acid residues, or at least about 50 consecutive polymerized amino
acid residues. In many instances, a polypeptide comprises a polymerized amino acid
residue sequence that is a transcription factor or a domain or portion or fragment
thereof. Additionally, the polypeptide may comprise 1) a localization domain, 2) an
activation domain, 3) a repression domain, 4) an oligomerization domain or 5) a
DNA-binding domain, or the like. The polypeptide optionally comprises modified
amino acid residues, naturally occurring amino acid residues not encoded by a codon,
or non-naturally occurring amino acid residues.



A "recombinant polypeptide" is a polypeptide produced by translation of a 
recombinant polynucleotide. A "synthetic polypeptide" is a polypeptide created by 
consecutive polymerization of isolated amino acid residues using methods well 
known in the art. An "isolated polypeptide," whether a naturally occurring or a 
recombinant polypeptide, is more enriched in (or out of) a cell than the polypeptide in 
its natural state in a wild type cell, e.g., more than about 5% enriched, more than 
about 10% enriched, or more than about 20%, or more than about 50%, or more, 
enriched, i.e., alternatively denoted: 105%, 110%, 120%, 150% or more, enriched 
relative to wild type standardized at 100%. Such an enrichment is not the result of a 
natural response of a wild type plant. Alternatively, or additionally, the isolated 
polypeptide is separated from other cellular components with which it is typically 
associated, e.g., by any of the various protein purification methods herein. 

"Identity" or "similarity" refers to sequence similarity between two 
polynucleotide sequences or between two polypeptide sequences, with identity being 
a more strict comparison. The phrases "percent identity" and "% identity" refer to the 
percentage of sequence similarity found in a comparison of two or more 
polynucleotide sequences or two or more polypeptide sequences. Identity or 
similarity can be determined by comparing a position in each sequence that may be 
aligned for purposes of comparison. When a position in the compared sequence is 
occupied by the same nucleotide base or amino acid, then the molecules are identical 
at that position. A degree of similarity or identity between polynucleotide sequences 
is a function of the number of identical or matching nucleotides at positions shared by 
the polynucleotide sequences. A degree of identity of polypeptide sequences is a 
function of the number of identical amino acids at positions shared by the polypeptide 
sequences. A degree of homology or similarity of polypeptide sequences is a function 
of the number of similar (i.e., structurally related) amino acids at positions shared by the 
polypeptide sequences. 
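
By way of illustration only, the position-by-position comparison described above can be sketched in Python. The function name percent_identity and the choice to count only positions occupied in both aligned sequences (i.e., to skip gap characters) are assumptions made for this sketch; alignment programs differ in how they choose the denominator.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity over the positions shared by two pre-aligned sequences.

    Assumption for this sketch: a position counts as "shared" only when
    neither sequence has a gap character there.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only the columns in which both sequences carry a residue.
    shared = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != gap and b != gap
    ]
    if not shared:
        return 0.0
    # A position is identical when the same base or amino acid occupies it.
    matches = sum(1 for a, b in shared if a == b)
    return 100.0 * matches / len(shared)
```

The same function applies to nucleotide or amino acid sequences, since it compares characters without interpreting them.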

"Altered" nucleic acid sequences encoding polypeptide include those 
sequences with deletions, insertions, or substitutions of different nucleotides, resulting 
in a polynucleotide encoding a polypeptide with at least one functional characteristic 
of the polypeptide. Included within this definition are polymorphisms that may or 
may not be readily detectable using a particular oligonucleotide probe of the 
polynucleotide encoding polypeptide, and improper or unexpected hybridization to 
allelic variants, with a locus other than the normal chromosomal locus for the 
polynucleotide sequence encoding polypeptide. The encoded polypeptide protein 
may also be "altered", and may contain deletions, insertions, or substitutions of amino 
acid residues that produce a silent change and result in a functionally equivalent 
polypeptide. Deliberate amino acid substitutions may be made on the basis of 
similarity in residue side chain chemistry, including, but not limited to, polarity, 
charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic nature of 
the residues, as long as the biological activity of polypeptide is retained. For 
example, negatively charged amino acids may include aspartic acid and glutamic acid, 
positively charged amino acids may include lysine and arginine, and amino acids with 
uncharged polar head groups having similar hydrophilicity values may include 
leucine, isoleucine, and valine; glycine and alanine; asparagine and glutamine; serine 
and threonine; and phenylalanine and tyrosine. Alignments between different 
polypeptide sequences may be used to calculate "percentage sequence similarity". 
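
As a minimal sketch, assuming that only the residue groupings named in the preceding paragraph are treated as conservative, a "percentage sequence similarity" over a pre-computed alignment could be calculated as follows. The names CONSERVATIVE_GROUPS, is_similar and percent_similarity are hypothetical, and practical tools typically use a scoring matrix rather than fixed groups.

```python
# Residue groups taken from the text above (one-letter codes):
# Asp/Glu, Lys/Arg, Leu/Ile/Val, Gly/Ala, Asn/Gln, Ser/Thr, Phe/Tyr.
CONSERVATIVE_GROUPS = [
    set("DE"), set("KR"), set("LIV"), set("GA"),
    set("NQ"), set("ST"), set("FY"),
]


def is_similar(res_a: str, res_b: str) -> bool:
    """True if the residues are identical or fall in the same group."""
    a, b = res_a.upper(), res_b.upper()
    if a == b:
        return True
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


def percent_similarity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent of shared (ungapped) aligned positions holding similar residues."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    shared = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not shared:
        return 0.0
    similar = sum(1 for a, b in shared if is_similar(a, b))
    return 100.0 * similar / len(shared)
```

For example, the aligned peptides DKLG and ERIW differ at every position, yet three of the four substitutions fall within a listed group.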

The term "plant" includes whole plants, shoot vegetative organs/structures 
(e.g., leaves, stems and tubers), roots, flowers and floral organs/structures (e.g., 
bracts, sepals, petals, stamens, carpels, anthers and ovules), seed (including embryo, 
endosperm, and seed coat) and fruit (the mature ovary), plant tissue (e.g., vascular 
tissue, ground tissue, and the like) and cells (e.g., guard cells, egg cells, and the like), 
and progeny of same. The class of plants that can be used in the method of the 
invention is generally as broad as the class of higher and lower plants amenable to 
transformation techniques, including angiosperms (monocotyledonous and 
dicotyledonous plants), gymnosperms, ferns, horsetails, psilophytes, lycophytes, 
bryophytes, and multicellular algae. (See for example, Figure 1, adapted from Daly et 
al. 2001 Plant Physiology 127:1328-1333; and see also Tudge, C., The Variety of 
Life, Oxford University Press, New York, 2000, pp. 547-606.) 

A "transgenic plant" refers to a plant that contains genetic material not found 
in a wild type plant of the same species, variety or cultivar. The genetic material may 
include a transgene, an insertional mutagenesis event (such as by transposon or T- 
DNA insertional mutagenesis), an activation tagging sequence, a mutated sequence, a 
homologous recombination event or a sequence modified by chimeraplasty. 
Typically, the foreign genetic material has been introduced into the plant by human 
manipulation, but any method can be used as one of skill in the art recognizes. 

A transgenic plant may contain an expression vector or cassette. The 
expression cassette typically comprises a polypeptide-encoding sequence operably 
linked to (i.e., under regulatory control of) appropriate inducible or constitutive 
regulatory sequences that allow for the expression of the polypeptide. The expression 
cassette can be introduced into a plant by transformation or by breeding after 
transformation of a parent plant. A plant refers to a whole plant as well as to a plant 
part, such as seed, fruit, leaf, or root, plant tissue, plant cells or any other plant 
material, e.g., a plant explant, as well as to progeny thereof, and to in vitro systems 
that mimic biochemical or cellular components or processes in a cell. 

"Ectopic expression or altered expression" in reference to a polynucleotide 
indicates that the pattern of expression in, e.g., a transgenic plant or plant tissue, is 
different from the expression pattern in a wild type plant or a reference plant of the 
same species. The pattern of expression may also be compared with a reference 
expression pattern in a wild type plant of the same species. For example, the 
polynucleotide or polypeptide is expressed in a cell or tissue type other than a cell or 
tissue type in which the sequence is expressed in the wild type plant, or by expression 
at a time other than at the time the sequence is expressed in the wild type plant, or by 
a response to different inducible agents, such as hormones or environmental signals, 
or at different expression levels (either higher or lower) compared with those found in 
a wild type plant. The term also refers to altered expression patterns that are produced 
by lowering the levels of expression to below the detection level or completely 
abolishing expression. The resulting expression pattern can be transient or stable, 
constitutive or inducible. In reference to a polypeptide, the term "ectopic expression 
or altered expression" further may relate to altered activity levels resulting from the 
interactions of the polypeptides with exogenous or endogenous modulators or from 
interactions with factors or as a result of the chemical modification of the 
polypeptides. 

A "fragment" or "domain," with respect to a polypeptide, refers to a 
subsequence of the polypeptide. In some cases, the fragment or domain is a 
subsequence of the polypeptide which performs at least one biological function of the 
intact polypeptide in substantially the same manner, or to a similar extent, as does the 
intact polypeptide. For example, a polypeptide fragment can comprise a recognizable 
structural motif or functional domain such as a DNA-binding site or domain that 
binds to a DNA promoter region, an activation domain, or a domain for protein- 
protein interactions. Fragments can vary in size from as few as 6 amino acids to the 
full length of the intact polypeptide, but are preferably at least about 30 amino acids in 
length and more preferably at least about 60 amino acids in length. In reference to a 
polynucleotide sequence, "a fragment" refers to any subsequence of a polynucleotide, 
typically, of at least about 15 consecutive nucleotides, preferably at least about 30 
nucleotides, more preferably at least about 50 nucleotides, of any of the sequences 
provided herein. 

The invention also encompasses production of DNA sequences that encode 
transcription factors and transcription factor derivatives, or fragments thereof, entirely 
by synthetic chemistry. After production, the synthetic sequence may be inserted into 
any of the many available expression vectors and cell systems using reagents well 
known in the art. Moreover, synthetic chemistry may be used to introduce mutations 
into a sequence encoding transcription factors or any fragment thereof. 

A "conserved domain", with respect to a polypeptide, refers to a domain 
within a transcription factor family which exhibits a higher degree of sequence 
homology, such as at least 65% sequence identity including conservative 
substitutions, and preferably at least 80% sequence identity, and more preferably at 
least 85%, or at least about 86%, or at least about 87%, or at least about 88%, or at 
least about 90%, or at least about 95%, or at least about 98% amino acid residue 
sequence identity of a polypeptide of consecutive amino acid residues. A fragment or 
domain can be referred to as outside a consensus sequence or outside a consensus 
DNA-binding site that is known to exist or that exists for a particular transcription 
factor class, family, or sub-family. In this case, the fragment or domain will not 
include the exact amino acids of a consensus sequence or consensus DNA-binding 
site of a transcription factor class, family or sub-family, or the exact amino acids of a 
particular transcription factor consensus sequence or consensus DNA-binding site. 
Furthermore, a particular fragment, region, or domain of a polypeptide, or a 
polynucleotide encoding a polypeptide, can be "outside a conserved domain" if all the 
amino acids of the fragment, region, or domain fall outside of a defined conserved 
domain(s) for a polypeptide or protein. The conserved domains for each of the 
polypeptides of SEQ ID NOs:2 - 2N, where N = 2-561, are listed in Table 4 as 
described in Example VII. Also, many of the polypeptides of Table 4 have conserved 
domains specifically indicated by start and stop sites. A comparison of the regions of 
the polypeptides in SEQ ID NOs:2 - 2N, where N = 2-561, or of those in Table 4, 
allows one of skill in the art to identify conserved domain(s) for any of the 
polypeptides listed or referred to in this disclosure, including those in Table 4. 

A "trait" refers to a physiological, morphological, biochemical, or physical 
characteristic of a plant or particular plant material or cell. In some instances, this 
characteristic is visible to the human eye, such as seed or plant size, or can be 
measured by biochemical techniques, such as detecting the protein, starch, or oil 
content of seed or leaves, or by observation of a metabolic or physiological process, 
e.g. by measuring uptake of carbon dioxide, or by the observation of the expression 
level of a gene or genes, e.g., by employing Northern analysis, RT-PCR, microarray 
gene expression assays, or reporter gene expression systems, or by agricultural 
observations such as stress tolerance, yield, or pathogen tolerance. However, any 
technique can be used to measure the amount of, comparative level of, or difference 
in any selected chemical compound or macromolecule in the transgenic plants. 

"Trait modification" refers to a detectable difference in a characteristic in a 
plant ectopically expressing a polynucleotide or polypeptide of the present invention 
relative to a plant not doing so, such as a wild type plant. In some cases, the trait 
modification can be evaluated quantitatively. For example, the trait modification can 
entail at least about a 2% increase or decrease in an observed trait (difference), at least 
a 5% difference, at least about a 10% difference, at least about a 20% difference, at 
least about a 30%, at least about a 50%, at least about a 70%, or at least about a 100%, 
or an even greater difference compared with a wild type plant. It is known that there 
can be a natural variation in the modified trait. Therefore, the trait modification 
observed entails a change of the normal distribution of the trait in the plants compared 
with the distribution observed in wild type plants. 

I. Traits Which May Be Modified 

Trait modifications of particular interest include those to seed (such as embryo 
or endosperm), fruit, root, flower, leaf, stem, shoot, seedling or the like, including: 
enhanced tolerance to environmental conditions including freezing, chilling, heat, 
drought, water saturation, radiation and ozone; improved tolerance to microbial, 
fungal or viral diseases; improved tolerance to pest infestations, including nematodes, 
mollicutes, parasitic higher plants or the like; decreased herbicide sensitivity; 
improved tolerance of heavy metals or enhanced ability to take up heavy metals; 
improved growth under poor photoconditions (e.g., low light and/or short day length), 
or changes in expression levels of genes of interest. Other phenotypes that can be 
modified relate to the production of plant metabolites, such as variations in the 
production of taxol, tocopherol, tocotrienol, sterols, phytosterols, vitamins, wax 
monomers, anti-oxidants, amino acids, lignins, cellulose, tannins, prenyllipids (such 
as chlorophylls and carotenoids), glucosinolates, and terpenoids, enhanced or 
compositionally altered protein or oil production (especially in seeds), or modified 
sugar (insoluble or soluble) and/or starch composition. Physical plant characteristics 
that can be modified include cell development (such as the number of trichomes), fruit 
and seed size and number, yields of plant parts such as stems, leaves, inflorescences, 
and roots, the stability of the seeds during storage, characteristics of the seed pod 
(e.g., susceptibility to shattering), root hair length and quantity, internode distances, or 
the quality of seed coat. Plant growth characteristics that can be modified include 
growth rate, germination rate of seeds, vigor of plants and seedlings, leaf and flower 
senescence, male sterility, apomixis, flowering time, flower abscission, rate of 
nitrogen uptake, osmotic sensitivity to soluble sugar concentrations, biomass or 
transpiration characteristics, as well as plant architecture characteristics such as apical 
dominance, branching patterns, number of organs, organ identity, organ shape or size. 

II. Transcription Factors Modify Expression Of Endogenous Genes 

The expression of genes that encode transcription factors that modify expression 
of endogenous genes, polynucleotides, and proteins is well known in the art. In 
addition, transgenic plants comprising isolated polynucleotides encoding transcription 
factors may also modify expression of endogenous genes, polynucleotides, and 
proteins. Examples include Peng et al. (1997, Genes and Development 11:3194- 
3205) and Peng et al. (1999, Nature, 400:256-261). In addition, many others have 
demonstrated that an Arabidopsis transcription factor expressed in an exogenous plant 
species elicits the same or very similar phenotypic response. See, for example, Fu et 
al. (2001, Plant Cell 13:1791-1802); Nandi et al. (2000, Curr. Biol. 10:215-218); 
Coupland (1995, Nature 377:482-483); and Weigel and Nilsson (1995, Nature 
377:482-500). 

In another example, Mandel et al. (1992, Cell 71:133-143) and Suzuki et al. 
(2001, Plant J. 28:409-418) teach that a transcription factor expressed in another plant 
species elicits the same or very similar phenotypic response as that of the endogenous 
sequence, as often predicted in earlier studies of Arabidopsis transcription factors in 
Arabidopsis (see Mandel et al., 1992, supra; Suzuki et al., 2001, supra). 

Other examples include Muller et al. (2001, Plant J. 28:169-179); Kim et al. 
(2001, Plant J. 25:247-259); Kyozuka and Shimamoto (2002, Plant Cell Physiol. 
43:130-135); Boss and Thomas (2002, Nature, 416:847-850); He et al. (2000, 
Transgenic Res., 9:223-227); and Robson et al. (2001, Plant J. 28:619-631). 

In yet another example, Gilmour et al. (1998, Plant J. 16:433-442) teach an 
Arabidopsis AP2 transcription factor, CBF1, which, when overexpressed in transgenic 
plants, increases plant freezing tolerance. Jaglo et al. (2001, Plant Physiol. 127:910- 
917) further identified sequences in Brassica napus which encode CBF-like genes and 
showed that transcripts for these genes accumulated rapidly in response to low temperature. 
Transcripts encoding CBF-like proteins were also found to accumulate rapidly in 
response to low temperature in wheat, as well as in tomato. An alignment of the CBF 
proteins from Arabidopsis, B. napus, wheat, rye, and tomato revealed the presence of 
conserved amino acid sequences, PKK/RPAGRxKFxETRHP and DSAWR, that 
bracket the AP2/EREBP DNA binding domains of the proteins and distinguish them 
from other members of the AP2/EREBP protein family. (See Jaglo et al., supra.) 
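
The two conserved sequences quoted above lend themselves to a simple pattern search. The following Python sketch assumes that "K/R" denotes either lysine or arginine and that "x" denotes any single residue; the names CBF_N_MOTIF, CBF_C_MOTIF and find_bracketed_region are hypothetical and are not taken from Jaglo et al.

```python
import re

# PKK/RPAGRxKFxETRHP: "K/R" read as K or R, "x" read as any one residue
# (both readings are assumptions made for this sketch).
CBF_N_MOTIF = re.compile(r"PK[KR]PAGR[A-Z]KF[A-Z]ETRHP")
CBF_C_MOTIF = re.compile(r"DSAWR")


def find_bracketed_region(protein: str):
    """Return the residues lying between the two motifs, or None.

    The upstream motif must occur first; the downstream motif is searched
    for only after the end of the upstream match.
    """
    seq = protein.upper()
    n_hit = CBF_N_MOTIF.search(seq)
    if n_hit is None:
        return None
    c_hit = CBF_C_MOTIF.search(seq, n_hit.end())
    if c_hit is None:
        return None
    return seq[n_hit.end():c_hit.start()]
```

A sequence that carries both motifs in the expected order returns the intervening residues, which under the description above would correspond to the bracketed DNA binding domain; a sequence lacking either motif returns None.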

III. Polypeptides and Polynucleotides of the Invention 

The present invention provides, among other things, transcription factors 
(TFs), and transcription factor homologue polypeptides, and isolated or recombinant 
polynucleotides encoding the polypeptides, or novel variant polypeptides or 
polynucleotides encoding novel variants of transcription factors derived from the 
specific sequences provided here. These polypeptides and polynucleotides may be 
employed to modify a plant's characteristic. 

Exemplary polynucleotides encoding the polypeptides of the invention were 
identified in the Arabidopsis thaliana GenBank database using publicly available 
sequence analysis programs and parameters. Sequences initially identified were then 
further characterized to identify sequences comprising specified sequence strings 
corresponding to sequence motifs present in families of known transcription factors. 
In addition, further exemplary polynucleotides encoding the polypeptides of the 
invention were identified in the plant GenBank database using publicly available 
sequence analysis programs and parameters. Sequences initially identified were then 
further characterized to identify sequences comprising specified sequence strings 
corresponding to sequence motifs present in families of known transcription factors. 
Polynucleotide sequences meeting such criteria were confirmed as transcription 
factors. 



Additional polynucleotides of the invention were identified by screening 
Arabidopsis thaliana and/or other plant cDNA libraries with probes corresponding to 
known transcription factors under low stringency hybridization conditions. 
Additional sequences, including full length coding sequences were subsequently 
recovered by the rapid amplification of cDNA ends (RACE) procedure, using a 
commercially available kit according to the manufacturer's instructions. Where 
necessary, multiple rounds of RACE were performed to isolate 5' and 3' ends. The full 
length cDNA was then recovered by a routine end-to-end polymerase chain reaction 
(PCR) using primers specific to the isolated 5' and 3' ends. Exemplary sequences are 
provided in the Sequence Listing. 

The polynucleotides of the invention can be or were ectopically expressed in 
overexpressor or knockout plants and the changes in the characteristic(s) or trait(s) of 
the plants observed. Therefore, the polynucleotides and polypeptides can be 
employed to improve the characteristics of plants. 



The polynucleotides of the invention can be or were ectopically expressed in 
overexpressor plant cells and the changes in the expression levels of a number of 
genes, polynucleotides, and/or proteins of the plant cells observed. Therefore, the 
polynucleotides and polypeptides can be employed to change expression levels of 
genes, polynucleotides, and/or proteins of plants. 

IV. Producing Polypeptides 

The polynucleotides of the invention include sequences that encode 
transcription factors and transcription factor homologue polypeptides and sequences 
complementary thereto, as well as unique fragments of coding sequence, or sequence 
complementary thereto. Such polynucleotides can be, e.g., DNA or RNA, e.g., 
mRNA, cRNA, synthetic RNA, genomic DNA, cDNA, synthetic DNA, 
oligonucleotides, etc. The polynucleotides are either double-stranded or single- 
stranded, and include either or both of sense (i.e., coding) sequences and antisense (i.e., 
non-coding, complementary) sequences. The polynucleotides include the coding 
sequence of a transcription factor, or transcription factor homologue polypeptide, in 
isolation, in combination with additional coding sequences (e.g., a purification tag, a 
localization signal, as a fusion-protein, as a pre-protein, or the like), in combination 
with non-coding sequences (e.g., introns or inteins, regulatory elements such as 
promoters, enhancers, terminators, and the like), and/or in a vector or host 
environment in which the polynucleotide encoding a transcription factor or 
transcription factor homologue polypeptide is an endogenous or exogenous gene. 
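
For illustration, the relationship between a sense (coding) DNA sequence and its antisense (complementary) sequence can be sketched as a reverse-complement operation. The function name reverse_complement is an assumption of this sketch, which handles only the four unambiguous DNA bases.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    # Complement every base, then reverse so the result reads 5' to 3'.
    return dna.translate(_COMPLEMENT)[::-1]
```

Applying the function twice returns the original sequence, which is a convenient sanity check when moving between sense and antisense orientations.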

A variety of methods exist for producing the polynucleotides of the invention. 
Procedures for identifying and isolating DNA clones are well known to those of skill 
in the art, and are described in, e.g., Berger and Kimmel, Guide to Molecular Cloning 
Techniques, Methods in Enzymology, volume 152, Academic Press, Inc., San Diego, 
CA ("Berger"); Sambrook et al., Molecular Cloning - A Laboratory Manual (2nd 
Ed.), Vol. 1-3, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, 1989 
("Sambrook"); and Current Protocols in Molecular Biology, F. M. Ausubel et al., eds., 
John Wiley & Sons, Inc., (supplemented through 2000) ("Ausubel"). 



Alternatively, polynucleotides of the invention can be produced by a variety 
of in vitro amplification methods adapted to the present invention by appropriate 
selection of specific or degenerate primers. Examples of protocols sufficient to direct 
persons of skill through in vitro amplification methods, including the polymerase 
chain reaction (PCR), the ligase chain reaction (LCR), Qbeta-replicase amplification 
and other RNA polymerase mediated techniques (e.g., NASBA), e.g., for the 
production of the homologous nucleic acids of the invention, are found in Berger 
(supra), Sambrook (supra), and Ausubel (supra), as well as Mullis et al. (1987) PCR 
Protocols A Guide to Methods and Applications (Innis et al. eds) Academic Press Inc. 
San Diego, CA (1990) (Innis). Improved methods for cloning in vitro amplified 
nucleic acids are described in Wallace et al., U.S. Pat. No. 5,426,039. Improved 
methods for amplifying large nucleic acids by PCR are summarized in Cheng et al. 
(1994) Nature 369:684-685 and the references cited therein, in which PCR amplicons 
of up to 40 kb are generated. One of skill will appreciate that essentially any RNA can 
be converted into a double stranded DNA suitable for restriction digestion, PCR 
expansion and sequencing using reverse transcriptase and a polymerase. See, e.g., 
Ausubel, Sambrook and Berger, all supra. 

Alternatively, polynucleotides and oligonucleotides of the invention can be 
assembled from fragments produced by solid-phase synthesis methods. Typically, 
fragments of up to approximately 100 bases are individually synthesized and then 
enzymatically or chemically ligated to produce a desired sequence, e.g., a 
polynucleotide encoding all or part of a transcription factor. For example, chemical 
synthesis using the phosphoramidite method is described, e.g., by Beaucage et al. 
(1981) Tetrahedron Letters 22:1859-1869; and Matthes et al. (1984) EMBO J. 3:801- 
805. According to such methods, oligonucleotides are synthesized, purified, annealed 
to their complementary strand, ligated and then optionally cloned into suitable 
vectors. And if so desired, the polynucleotides and polypeptides of the invention can 
be custom ordered from any of a number of commercial suppliers. 
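
As a simplified sketch of the first step of such an assembly, a target sequence can be partitioned into consecutive fragments of no more than about 100 bases each. The function name split_for_synthesis is hypothetical, and the sketch deliberately ignores the design of complementary strands and overlaps that an actual annealing and ligation scheme would require.

```python
def split_for_synthesis(sequence: str, max_len: int = 100) -> list:
    """Partition a sequence into consecutive fragments of at most max_len bases.

    Assumption for this sketch: fragments are simple, non-overlapping slices
    of a single strand; joining them in order restores the input.
    """
    if max_len < 1:
        raise ValueError("max_len must be a positive integer")
    return [sequence[i:i + max_len] for i in range(0, len(sequence), max_len)]
```

Under these assumptions, a 240-base sequence yields three fragments of 100, 100 and 40 bases.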

V. Homologous Sequences 

Sequences homologous, i.e., that share significant sequence identity or 
similarity, to those provided in the Sequence Listing, derived from Arabidopsis 
thaliana or from other plants of choice are also an aspect of the invention. 
Homologous sequences can be derived from any plant including monocots and dicots 
and in particular agriculturally important plant species, including but not limited to, 
crops such as soybean, wheat, corn, potato, cotton, rice, rape, oilseed rape (including 
canola), sunflower, alfalfa, sugarcane and turf; or fruits and vegetables, such as 
banana, blackberry, blueberry, strawberry, raspberry, cantaloupe, carrot, 
cauliflower, coffee, cucumber, eggplant, grapes, honeydew, lettuce, mango, melon, 
onion, papaya, peas, peppers, pineapple, pumpkin, spinach, squash, sweet corn, 
tobacco, tomato, watermelon, rosaceous fruits (such as apple, peach, pear, cherry and 
plum) and vegetable brassicas (such as broccoli, cabbage, cauliflower, Brussels 
sprouts, and kohlrabi). Other crops, fruits and vegetables whose phenotype can be 
changed include barley, rye, millet, sorghum, currant, avocado, citrus fruits such as 
oranges, lemons, grapefruit and tangerines, artichoke, cherries, nuts such as the 
walnut and peanut, endive, leek, roots, such as arrowroot, beet, cassava, turnip, radish, 
yam, and sweet potato, and beans. The homologous sequences may also be derived 
from woody species, such as pine, poplar and eucalyptus, or mint or other labiates. 

Orthologs And Paralogs 

Several different methods are known by those of skill in the art for identifying 
and defining these functionally homologous sequences. Three general methods for 
defining paralogs and orthologs are described; a paralog or ortholog or homolog may 
be identified by one or more of the methods described below. 

Orthologs and paralogs are evolutionarily related genes that have similar 
sequence and similar functions. Orthologs are structurally related genes in different 
species that are derived from a speciation event. Paralogs are structurally related 
genes within a single species that are derived by a duplication event. 

Within a single plant species, gene duplication may cause two copies of a 
particular gene, giving rise to two or more genes with similar sequence and similar 
function known as paralogs. A paralog is therefore a similar gene with a similar 
function within the same species. Paralogs typically cluster together or in the same 
clade (a group of similar genes) when a gene family phylogeny is analyzed using 
programs such as CLUSTAL (Thompson et al. (1994) Nucleic Acids Res. 22:4673- 
4680; Higgins et al. (1996) Methods Enzymol. 266:383-402). Groups of similar 
genes can also be identified with pair-wise BLAST analysis (Feng and Doolittle 
(1987) J. Mol. Evol. 25:351-360). For example, a clade of very similar MADS 
domain transcription factors from Arabidopsis all share a common function in 
flowering time (Ratcliffe et al. (2001) Plant Physiol. 126:122-132), and a group of 
very similar AP2 domain transcription factors from Arabidopsis are involved in 
tolerance of plants to freezing (Gilmour et al. (1998) Plant J. 16:433-442). Analysis 
of groups of similar genes with similar function that fall within one clade can yield 
sub-sequences that are particular to the clade. These sub-sequences, known as 
consensus sequences, can be used not only to define the sequences within each clade, 
but also to define the functions of these genes; genes within a clade may contain paralogous 
or orthologous sequences that share the same function. (See also, for example, Mount, 
D.W. (2001) Bioinformatics: Sequence and Genome Analysis Cold Spring Harbor 
Laboratory Press, Cold Spring Harbor, New York page 543.) 

Speciation, the production of new species from a parental species, can also 
give rise to two or more genes with similar sequence and similar function. These 
genes, termed orthologs, often have an identical function within their host plants and 
are often interchangeable between species without losing function. Because plants 
have common ancestors, many genes in any plant species will have a corresponding 
orthologous gene in another plant species. Once a phylogenetic tree for a gene family 
of one species has been constructed using a program such as CLUSTAL (Thompson 
et al. (1994) Nucleic Acids Res. 22:4673-4680; Higgins et al. (1996) Methods 
Enzymol. 266:383-402), potential orthologous sequences can be placed into the 
phylogenetic tree and their relationships to genes from the species of interest can be 
determined. Once the ortholog pair has been identified, the function of the test 
ortholog can be determined by establishing the function of the reference ortholog. 

Transcription factors that are homologous to the listed sequences will typically 
share at least about 30% amino acid sequence identity, or at least about 30% amino 
acid sequence identity outside of a known consensus sequence or consensus DNA- 
binding site. More closely related transcription factors can share at least about 50%, 
about 60%, about 65%, about 70%, about 75% or about 80% or about 90% or about 
95% or about 98% or more sequence identity with the listed sequences, or with the 
listed sequences but excluding or outside a known consensus sequence or consensus 
DNA-binding site, or with the listed sequences excluding one or all conserved 
domains. Factors that are most closely related to the listed sequences share, e.g., at 
least about 85%, about 90% or about 95% or more sequence identity to the listed 
sequences, or to the listed sequences but excluding or outside a known consensus 
sequence or consensus DNA-binding site or outside one or all conserved domains. At 
the nucleotide level, the sequences will typically share at least about 40% nucleotide 
sequence identity, preferably at least about 50%, about 60%, about 70% or about 80% 
sequence identity, and more preferably about 85%, about 90%, about 95% or about 
97% or more sequence identity to one or more of the listed sequences, or to a listed 
sequence but excluding or outside a known consensus sequence or consensus DNA- 
binding site, or outside one or all conserved domains. The degeneracy of the genetic 
code enables major variations in the nucleotide sequence of a polynucleotide while 
maintaining the amino acid sequence of the encoded protein. Conserved domains 
within a transcription factor family may exhibit a higher degree of sequence 
homology, such as at least 65% sequence identity including conservative 
substitutions, and preferably at least 80% sequence identity, and more preferably at 
least 85%, or at least about 86%, or at least about 87%, or at least about 88%, or at 
least about 90%, or at least about 95%, or at least about 98% sequence identity. 
Transcription factors that are homologous to the listed sequences should share at least 
30%, or at least about 60%, or at least about 75%, or at least about 80%, or at least 
about 90%, or at least about 95% amino acid sequence identity over the entire length 
of the polypeptide or the homolog. In addition, transcription factors that are 
homologous to the listed sequences should share at least 30%, or at least about 60%, 
or at least about 75%, or at least about 80%, or at least about 90%, or at least about 
95% amino acid sequence similarity over the entire length of the polypeptide or the 
homolog. 

Percent identity can be determined electronically, e.g., by using the 
MEGALIGN program (DNASTAR, Inc. Madison, Wis.). The MEGALIGN program 
can create alignments between two or more sequences according to different methods, 
e.g., the clustal method. (See, e.g., Higgins, D. G. and P. M. Sharp (1988) Gene
73:237-244.) The clustal algorithm groups sequences into clusters by examining the
distances between all pairs. The clusters are aligned pairwise and then in groups.
Other alignment algorithms or programs may be used, including FASTA, BLAST, and
ENTREZ. FASTA and BLAST are available as a part of the GCG sequence
analysis package (University of Wisconsin, Madison, Wis.), and can be used with or
without default settings. ENTREZ is available through the National Center for 
Biotechnology Information. In one embodiment, the percent identity of two 
sequences can be determined by the GCG program with a gap weight of 1, e.g., each 
amino acid gap is weighted as if it were a single amino acid or nucleotide mismatch 
between the two sequences (see USPN 6,262,333). 
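
The gap-weighting rule described above can be sketched in a few lines of Python. This is a minimal illustration only, not the GCG program: it assumes the two sequences have already been aligned by some other tool, that gaps are written as "-", and the function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over an existing pairwise alignment.

    Assumption: every gap position is scored like one mismatch
    (a gap weight of 1), so it counts in the denominator only.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns where both rows hold a gap; they carry no information.
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if not (x == gap and y == gap)]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y and x != gap)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Five identical columns out of seven aligned columns.
    print(round(percent_identity("MKT-LLV", "MKTALIV"), 1))
```

Other gap weights would change only how gap columns enter the denominator.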

Other techniques for alignment are described in Methods in Enzymology, vol. 
266: Computer Methods for Macromolecular Sequence Analysis (1996), ed. Doolittle, 
Academic Press, Inc., San Diego, Calif., USA. Preferably, an alignment program that 
permits gaps in the sequence is utilized to align the sequences. The Smith-Waterman
algorithm is one type of algorithm that permits gaps in sequence alignments. See
Methods Mol. Biol. 70:173-187 (1997). Also, the GAP program using the Needleman and
Wunsch alignment method can be utilized to align sequences. An alternative search
strategy uses MPSRCH software, which runs on a MASPAR computer. MPSRCH uses a
Smith-Waterman algorithm to score sequences on a massively parallel computer.
This approach improves the ability to pick up distantly related matches, and is especially
tolerant of small gaps and nucleotide sequence errors. Nucleic acid-encoded amino 
acid sequences can be used to search both protein and DNA databases. 
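
As a rough illustration of how a gap-permitting local alignment is scored, the following Python sketch implements the Smith-Waterman recurrence with a linear gap penalty. The scoring values (match 2, mismatch -1, gap -2) and the function name are assumptions chosen for the example; they are not parameters of MPSRCH or of any program cited above.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Best local alignment score between a and b (linear gap penalty)."""
    prev = [0] * (len(b) + 1)   # previous row of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A cell never drops below zero: that is what makes it local.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("TTACGTTT", "GGACGGG"))  # shared "ACG"
```

Only the score is returned here; recovering the aligned residues would require keeping the full matrix and tracing back from the best cell.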

The percentage similarity between two polypeptide sequences, e.g., sequence 
A and sequence B, is calculated by dividing the length of sequence A, minus the 
number of gap residues in sequence A, minus the number of gap residues in sequence 
B, into the sum of the residue matches between sequence A and sequence B, times 
one hundred. Gaps of low or of no similarity between the two amino acid sequences 
are not included in determining percentage similarity. Percent identity between 
polynucleotide sequences can also be counted or calculated by other methods known 
in the art, e.g., the Jotun Hein method. (See, e.g., Hein, J. (1990) Methods Enzymol. 
183:626-645.) Identity between sequences can also be determined by other methods
known in the art, e.g., by varying hybridization conditions (see US Patent Application
No. 20010010913). 
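
The percentage similarity calculation described above can be written out as a short Python sketch. Two assumptions are made for illustration: "the length of sequence A" is taken to be the aligned length including gap characters, and a "residue match" is taken to be an identical residue at the same aligned position. The function name is hypothetical.

```python
def percent_similarity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """matches / (len(A) - gaps in A - gaps in B) * 100, per the text above."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b)
                  if x == y and x != gap)
    denominator = len(aligned_a) - aligned_a.count(gap) - aligned_b.count(gap)
    if denominator <= 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * matches / denominator


if __name__ == "__main__":
    # Aligned length 7, one gap in A, none in B: denominator is 6.
    print(round(percent_similarity("MKT-LLV", "MKTALIV"), 1))
```

Because gap positions are subtracted from the denominator, this value is never lower than an identity figure that counts each gap as a mismatch.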

Thus, the invention provides methods for identifying a sequence similar or
paralogous or orthologous or homologous to one or more polynucleotides as noted
herein, or one or more target polypeptides encoded by the polynucleotides, or
otherwise noted herein. The methods may include linking or associating a given plant
phenotype or gene function with a sequence. In the methods, a sequence database is
provided (locally or across an internet or intranet) and a query is made against the
sequence database using the relevant sequences herein and associated plant
phenotypes or gene functions.

In addition, one or more polynucleotide sequences or one or more
polypeptides encoded by the polynucleotide sequences may be used to search against
BLOCKS (Bairoch et al. (1997) Nucleic Acids Res. 25:217-221), PFAM, and other
databases which contain previously identified and annotated motifs, sequences and
gene functions. Methods that search for primary sequence patterns with secondary
structure gap penalties (Smith et al. (1992) Protein Engineering 5:35-51) as well as 
algorithms such as Basic Local Alignment Search Tool (BLAST; Altschul, S. F. 
(1993) J. Mol. Evol. 36:290-300; Altschul et al. (1990) supra), BLOCKS (Henikoff, 
S. and Henikoff, G. J. (1991) Nucleic Acids Research 19:6565-6572), Hidden Markov 
Models (HMM; Eddy, S. R. (1996) Cur. Opin. Str. Biol. 6:361-365; Sonnhammer et 
al. (1997) Proteins 28:405-420), and the like, can be used to manipulate and analyze 
polynucleotide and polypeptide sequences encoded by polynucleotides. These 
databases, algorithms and other methods are well known in the art and are described 
in Ausubel et al. (1997; Short Protocols in Molecular Biology, John Wiley & Sons,
New York N.Y., unit 7.7) and in Meyers, R. A. (1995; Molecular Biology and 
Biotechnology, Wiley VCH, New York N.Y., p 856-853). 

Furthermore, methods using manual alignment of sequences similar or 
homologous to one or more polynucleotide sequences or one or more polypeptides 
encoded by the polynucleotide sequences may be used to identify regions of similarity 
and conserved domains. Such manual methods are well known to those of skill in the
art and can include, for example, comparisons of tertiary structure between a
polypeptide sequence encoded by a polynucleotide which comprises a known function
and a polypeptide sequence encoded by a polynucleotide sequence which has a
function not yet determined. Such examples of tertiary structure may comprise 
predicted alpha helices, beta-sheets, amphipathic helices, leucine zipper motifs, zinc 
finger motifs, proline-rich regions, cysteine repeat motifs, and the like. 



VI. Identifying Polynucleotides or Nucleic Acids by Hybridization 

Polynucleotides homologous to the sequences illustrated in the Sequence 
Listing and tables can be identified, e.g., by hybridization to each other under 
stringent or under highly stringent conditions. Single stranded polynucleotides 
hybridize when they associate based on a variety of well characterized physical- 
chemical forces, such as hydrogen bonding, solvent exclusion, base stacking and the 
like. The stringency of a hybridization reflects the degree of sequence identity of the 
nucleic acids involved, such that the higher the stringency, the more similar are the 
two polynucleotide strands. Stringency is influenced by a variety of factors, including 
temperature, salt concentration and composition, organic and non-organic additives, 
solvents, etc. present in both the hybridization and wash solutions and incubations 
(and number thereof), as described in more detail in the references cited above. 
Encompassed by the invention are polynucleotide sequences that are capable of 
hybridizing to the claimed polynucleotide sequences, and, in particular, to those 
shown in SEQ ID NOs: 860; 802; 240; 274; 558; 24; 1120; 44; 460; 286; 120; 130; 
134; 698; 832; 580; 612; 48, and fragments thereof under various conditions of 
stringency. (See, e.g., Wahl, G. M. and S. L. Berger (1987) Methods Enzymol. 
152:399-407; Kimmel, A. R. (1987) Methods Enzymol. 152:507-511.) Estimates of 
homology are provided by either DNA-DNA or DNA-RNA hybridization under 
conditions of stringency as is well understood by those skilled in the art (Haines and 
Higgins, Eds. (1985) Nucleic Acid Hybridisation, IRL Press, Oxford, U.K.). 
Stringency conditions can be adjusted to screen for moderately similar fragments, 
such as homologous sequences from distantly related organisms, to highly similar 
fragments, such as genes that duplicate functional enzymes from closely related 
organisms. Post-hybridization washes determine stringency conditions. 

In addition to the nucleotide sequences listed in Tables 4 and 5, full length 
cDNA, orthologs, paralogs and homologs of the present nucleotide sequences may be
identified and isolated using well known methods. The cDNA libraries, orthologs,
paralogs and homologs of the present nucleotide sequences may be screened using
hybridization methods to determine their utility as hybridization targets or
amplification probes.

An example of stringent hybridization conditions for hybridization of 
complementary nucleic acids which have more than 100 complementary residues on a 
filter in a Southern or northern blot is about 5°C to 20°C lower than the thermal 
melting point (Tm) for the specific sequence at a defined ionic strength and pH. The
Tm is the temperature (under defined ionic strength and pH) at which 50% of the
target sequence hybridizes to a perfectly matched probe. Nucleic acid molecules that
hybridize under stringent conditions will typically hybridize to a probe based on either
the entire cDNA or selected portions, e.g., to a unique subsequence, of the cDNA
under wash conditions of 0.2x SSC to 2.0x SSC, 0.1% SDS at 50-65° C. For
example, high stringency is about 0.2x SSC, 0.1% SDS at 65° C. Ultra-high
stringency will be the same conditions except the wash temperature is raised about 3
to about 5° C, and ultra-ultra-high stringency will be the same conditions except the
wash temperature is raised about 6 to about 9° C. For identification of less closely
related homologs, washes can be performed at a lower temperature, e.g., 50° C. In
general, stringency is increased by raising the wash temperature and/or decreasing the 
concentration of SSC, as known in the art. 
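
The temperature relationships in the preceding paragraph can be tabulated with a small Python sketch. The numbers come from the text (hybridization about 5 to 20° C below Tm; high-stringency wash at 65° C; increments of 3 to 5° C and 6 to 9° C for the higher tiers). The function and dictionary names are hypothetical, and the Tm itself is assumed to be supplied by the user.

```python
HIGH_STRINGENCY_WASH_C = 65.0  # 0.2x SSC, 0.1% SDS wash temperature from the text

# Increments (deg C) over the high-stringency wash temperature, per the text.
WASH_OFFSETS_C = {
    "high": (0.0, 0.0),
    "ultra-high": (3.0, 5.0),
    "ultra-ultra-high": (6.0, 9.0),
}


def hybridization_window(tm_c: float) -> tuple:
    """Stringent hybridization range: about 20 to 5 deg C below the Tm."""
    return (tm_c - 20.0, tm_c - 5.0)


def wash_temperature_range(tier: str) -> tuple:
    """Approximate wash temperature range (deg C) for a named tier."""
    low, high = WASH_OFFSETS_C[tier]
    return (HIGH_STRINGENCY_WASH_C + low, HIGH_STRINGENCY_WASH_C + high)


if __name__ == "__main__":
    print(hybridization_window(80.0))
    for tier in WASH_OFFSETS_C:
        print(tier, wash_temperature_range(tier))
```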

In another example, stringent salt concentration will ordinarily be less than 
about 750 mM NaCl and 75 mM trisodium citrate, preferably less than about 500 mM 
NaCl and 50 mM trisodium citrate, and most preferably less than about 250 mM NaCl 
and 25 mM trisodium citrate. Low stringency hybridization can be obtained in the 
absence of organic solvent, e.g., formamide, while high stringency hybridization can 
be obtained in the presence of at least about 35% formamide, and most preferably at 
least about 50% formamide. Stringent temperature conditions will ordinarily include 
temperatures of at least about 30° C, more preferably of at least about 37° C, and most 
preferably of at least about 42° C. Varying additional parameters, such as 
hybridization time, the concentration of detergent, e.g., sodium dodecyl sulfate (SDS), 
and the inclusion or exclusion of carrier DNA, are well known to those skilled in the
art. Various levels of stringency are accomplished by combining these various
conditions as needed. In a preferred embodiment, hybridization will occur at 30° C in
750 mM NaCl, 75 mM trisodium citrate, and 1% SDS. In a more preferred
embodiment, hybridization will occur at 37° C in 500 mM NaCl, 50 mM trisodium
citrate, 1% SDS, 35% formamide, and 100 µg/ml denatured salmon sperm DNA
(ssDNA). In a most preferred embodiment, hybridization will occur at 42° C in 250
mM NaCl, 25 mM trisodium citrate, 1% SDS, 50% formamide, and 200 µg/ml
ssDNA. Useful variations on these conditions will be readily apparent to those skilled 
in the art. 



The washing steps that follow hybridization can also vary in stringency. Wash 
stringency conditions can be defined by salt concentration and by temperature. As 
above, wash stringency can be increased by decreasing salt concentration or by 
increasing temperature. For example, stringent salt concentration for the wash steps 
will preferably be less than about 30 mM NaCl and 3 mM trisodium citrate, and most 
preferably less than about 15 mM NaCl and 1.5 mM trisodium citrate. Stringent
temperature conditions for the wash steps will ordinarily include a temperature of at
least about 25° C, more preferably of at least about 42° C. Another preferred set of
highly stringent conditions uses two final washes in 0.1X SSC, 0.1% SDS at 65° C.
The most preferred high stringency washes are of at least about 68° C. For example,
in a preferred embodiment, wash steps will occur at 25° C in 30 mM NaCl, 3 mM
trisodium citrate, and 0.1% SDS. In a more preferred embodiment, wash steps will
occur at 42° C in 15 mM NaCl, 1.5 mM trisodium citrate, and 0.1% SDS. In a most 
preferred embodiment, the wash steps will occur at 68° C in 15 mM NaCl, 1.5 mM 
trisodium citrate, and 0.1% SDS. Additional variations on these conditions will be 
readily apparent to those skilled in the art (see U.S. Patent Application No. 
20010010913). 
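
The salt concentrations above are given both in millimolar and as multiples of SSC. The Python sketch below converts between the two, assuming the conventional definition of 1x SSC as 150 mM NaCl and 15 mM trisodium citrate; the function names are hypothetical.

```python
NACL_MM_PER_1X_SSC = 150.0     # conventional 1x SSC: 150 mM NaCl
CITRATE_MM_PER_1X_SSC = 15.0   # and 15 mM trisodium citrate


def ssc_to_millimolar(fold: float) -> tuple:
    """Return (NaCl mM, trisodium citrate mM) for a given SSC multiple."""
    if fold < 0:
        raise ValueError("SSC concentration cannot be negative")
    return (fold * NACL_MM_PER_1X_SSC, fold * CITRATE_MM_PER_1X_SSC)


def nacl_to_ssc_fold(nacl_mm: float) -> float:
    """Return the SSC multiple that corresponds to a NaCl concentration."""
    return nacl_mm / NACL_MM_PER_1X_SSC


if __name__ == "__main__":
    print(ssc_to_millimolar(0.1))    # the 0.1X SSC wash
    print(nacl_to_ssc_fold(30.0))    # the 30 mM NaCl wash
    print(nacl_to_ssc_fold(750.0))   # the 750 mM NaCl hybridization
```

Under this assumption, the 30 mM and 15 mM NaCl washes correspond to about 0.2x and 0.1x SSC, and the 750 mM NaCl hybridization corresponds to about 5x SSC.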



As another example, stringent conditions can be selected such that an 
oligonucleotide that is perfectly complementary to the coding oligonucleotide 
hybridizes to the coding oligonucleotide with at least about a 5-10x higher signal to
noise ratio than the ratio for hybridization of the perfectly complementary
oligonucleotide to a nucleic acid encoding a transcription factor known as of the filing
date of the application. Conditions can be selected such that a higher signal to noise
ratio is observed in the particular assay which is used, e.g., about 15x, 25x, 35x, 50x
or more. Accordingly, the subject nucleic acid hybridizes to the unique coding
oligonucleotide with at least a 2x higher signal to noise ratio as compared to
hybridization of the coding oligonucleotide to a nucleic acid encoding a known
polypeptide. Again, higher signal to noise ratios can be selected, e.g., about 5x, 10x,
25x, 35x, 50x or more. The particular signal will depend on the label used in the 
relevant assay, e.g., a fluorescent label, a colorimetric label, a radioactive label, or the 
like. 

Alternatively, transcription factor homolog polypeptides can be obtained by 
screening an expression library using antibodies specific for one or more transcription 
factors. With the provision herein of the disclosed transcription factor, and 
transcription factor homologue nucleic acid sequences, the encoded polypeptide(s) 
can be expressed and purified in a heterologous expression system (e.g., E. coli) and 
used to raise antibodies (monoclonal or polyclonal) specific for the polypeptide(s) in 
question. Antibodies can also be raised against synthetic peptides derived from 
transcription factor, or transcription factor homologue, amino acid sequences. 
Methods of raising antibodies are well known in the art and are described in Harlow
and Lane (1988) Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory,
New York. Such antibodies can then be used to screen an expression library
produced from the plant from which it is desired to clone additional transcription 
factor homologues, using the methods described above. The selected cDNAs can be 
confirmed by sequencing and enzymatic activity. 

VII. Sequence Variations 

It will readily be appreciated by those of skill in the art, that any of a variety of 
polynucleotide sequences are capable of encoding the transcription factors and 
transcription factor homologue polypeptides of the invention. Due to the degeneracy 
of the genetic code, many different polynucleotides can encode identical and/or 
substantially similar polypeptides in addition to those sequences illustrated in the 
Sequence Listing. Nucleic acids having a sequence that differs from the sequences 
shown in the Sequence Listing, or complementary sequences, that encode functionally 
equivalent peptides (i.e., peptides having some degree of equivalent or similar
biological activity) but differ in sequence from the sequence shown in the sequence
listing due to degeneracy in the genetic code, are also within the scope of the
invention. 



Altered polynucleotide sequences encoding polypeptides include those 
sequences with deletions, insertions, or substitutions of different nucleotides, resulting 
in a polynucleotide encoding a polypeptide with at least one functional characteristic 
of the instant polypeptides. Included within this definition are polymorphisms which 
may or may not be readily detectable using a particular oligonucleotide probe of the 
polynucleotide encoding the instant polypeptides, and improper or unexpected 
hybridization to allelic variants, with a locus other than the normal chromosomal 
locus for the polynucleotide sequence encoding the instant polypeptides. 

Allelic variant refers to any of two or more alternative forms of a gene 
occupying the same chromosomal locus. Allelic variation arises naturally through 
mutation, and may result in phenotypic polymorphism within populations. Gene 
mutations can be silent (i.e., no change in the encoded polypeptide) or may encode 
polypeptides having altered amino acid sequence. The term allelic variant is also used 
herein to denote a protein encoded by an allelic variant of a gene. Splice variant refers 
to alternative forms of RNA transcribed from a gene. Splice variation arises naturally 
through use of alternative splicing sites within a transcribed RNA molecule, or less 
commonly between separately transcribed RNA molecules, and may result in several 
mRNAs transcribed from the same gene. Splice variants may encode polypeptides 
having altered amino acid sequence. The term splice variant is also used herein to 
denote a protein encoded by a splice variant of an mRNA transcribed from a gene. 

Those skilled in the art would recognize that the polypeptide sequence G681, 
SEQ ID NO: 580, represents a single transcription factor; allelic variation and 
alternative splicing may be expected to occur. Allelic variants of the polypeptide 
sequence of SEQ ID NO: 579 can be cloned by probing cDNA or genomic libraries
from different individual organisms according to standard procedures. Allelic
variants of the DNA sequence shown in SEQ ID NO: 579, including those containing
silent mutations and those in which mutations result in amino acid sequence changes,
are within the scope of the present invention, as are proteins which are allelic variants
of SEQ ID NO: 580. cDNAs generated from alternatively spliced mRNAs, which
retain the properties of the transcription factor, are included within the scope of the
present invention, as are polypeptides encoded by such cDNAs and mRNAs. Allelic 
variants and splice variants of these sequences can be cloned by probing cDNA or 
genomic libraries from different individual organisms or tissues according to standard 
procedures known in the art (see USPN 6,388,064). 

For example, Table 1 illustrates, e.g., that the codons AGC, AGT, TCA, TCC, 
TCG, and TCT all encode the same amino acid: serine. Accordingly, at each position 
in the sequence where there is a codon encoding serine, any of the above trinucleotide 
sequences can be used without altering the encoded polypeptide.

Table 1

Amino acid                  Possible Codons
Alanine         Ala   A     GCA  GCC  GCG  GCU
Cysteine        Cys   C     TGC  TGT
Aspartic acid   Asp   D     GAC  GAT
Glutamic acid   Glu   E     GAA  GAG
Phenylalanine   Phe   F     TTC  TTT
Glycine         Gly   G     GGA  GGC  GGG  GGT
Histidine       His   H     CAC  CAT
Isoleucine      Ile   I     ATA  ATC  ATT
Lysine          Lys   K     AAA  AAG
Leucine         Leu   L     TTA  TTG  CTA  CTC  CTG  CTT
Methionine      Met   M     ATG
Asparagine      Asn   N     AAC  AAT
Proline         Pro   P     CCA  CCC  CCG  CCT
Glutamine       Gln   Q     CAA  CAG
Arginine        Arg   R     AGA  AGG  CGA  CGC  CGG  CGT
Serine          Ser   S     AGC  AGT  TCA  TCC  TCG  TCT
Threonine       Thr   T     ACA  ACC  ACG  ACT
Valine          Val   V     GTA  GTC  GTG  GTT
Tryptophan      Trp   W     TGG
Tyrosine        Tyr   Y     TAC  TAT
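
To make the degeneracy shown in Table 1 concrete, the Python sketch below lists the synonymous codons for a given codon and counts how many distinct coding sequences encode the same peptide. The codon table is transcribed from Table 1 using DNA notation throughout (the alanine codon is written GCT here, an assumption made for consistency with the other entries); the function names are hypothetical.

```python
from math import prod

# One-letter amino acid code -> codons, transcribed from Table 1 (DNA notation).
CODONS = {
    "A": ["GCA", "GCC", "GCG", "GCT"], "C": ["TGC", "TGT"],
    "D": ["GAC", "GAT"], "E": ["GAA", "GAG"], "F": ["TTC", "TTT"],
    "G": ["GGA", "GGC", "GGG", "GGT"], "H": ["CAC", "CAT"],
    "I": ["ATA", "ATC", "ATT"], "K": ["AAA", "AAG"],
    "L": ["TTA", "TTG", "CTA", "CTC", "CTG", "CTT"], "M": ["ATG"],
    "N": ["AAC", "AAT"], "P": ["CCA", "CCC", "CCG", "CCT"],
    "Q": ["CAA", "CAG"],
    "R": ["AGA", "AGG", "CGA", "CGC", "CGG", "CGT"],
    "S": ["AGC", "AGT", "TCA", "TCC", "TCG", "TCT"],
    "T": ["ACA", "ACC", "ACG", "ACT"],
    "V": ["GTA", "GTC", "GTG", "GTT"], "W": ["TGG"], "Y": ["TAC", "TAT"],
}


def synonymous_codons(codon: str) -> list:
    """Other codons that encode the same amino acid as `codon`."""
    for codons in CODONS.values():
        if codon in codons:
            return [c for c in codons if c != codon]
    raise ValueError(f"{codon!r} is not a sense codon in the table")


def count_encodings(peptide: str) -> int:
    """Number of distinct coding sequences for a peptide (one-letter codes)."""
    return prod(len(CODONS[residue]) for residue in peptide)


if __name__ == "__main__":
    print(synonymous_codons("AGC"))
    print(count_encodings("MSW"))
```

Methionine and tryptophan each have a single codon, so they contribute a factor of one to the count.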











Sequence alterations that do not change the amino acid sequence encoded by 
the polynucleotide are termed "silent" variations. With the exception of the codons 
ATG and TGG, encoding methionine and tryptophan, respectively, any of the possible 
codons for the same amino acid can be substituted by a variety of techniques, e.g., 
site-directed mutagenesis, available in the art. Accordingly, any and all such 
variations of a sequence selected from the above table are a feature of the invention.

In addition to silent variations, other conservative variations that alter one or a
few amino acids in the encoded polypeptide can be made without altering the
function of the polypeptide; these conservative variants are, likewise, a feature of the
invention. 

For example, substitutions, deletions and insertions introduced into the 
sequences provided in the Sequence Listing are also envisioned by the invention. 
Such sequence modifications can be engineered into a sequence by site-directed 
mutagenesis (Wu (ed.) Meth. Enzymol. (1993) vol. 217, Academic Press) or the other
methods noted below. Amino acid substitutions are typically of single residues;
insertions usually will be on the order of about 1 to 10 amino acid residues; and
deletions will range from about 1 to 30 residues. In preferred embodiments, deletions
or insertions are made in adjacent pairs, e.g., a deletion of two residues or insertion of 
two residues. Substitutions, deletions, insertions or any combination thereof can be 
combined to arrive at a sequence. The mutations that are made in the polynucleotide 
encoding the transcription factor should not place the sequence out of reading frame 
and should not create complementary regions that could produce secondary mRNA 
structure. Preferably, the polypeptide encoded by the DNA performs the desired 
function. 
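
The reading-frame constraint mentioned above can be illustrated with a short Python sketch. It assumes the coding sequence is supplied as a DNA string that starts in frame, and it removes whole codons so that a deletion of amino acid residues cannot shift the frame. The function names are hypothetical.

```python
def delete_residues(cds: str, first_residue: int, count: int) -> str:
    """Delete `count` residues (whole codons) starting at a 1-based residue."""
    start = (first_residue - 1) * 3
    end = start + 3 * count
    if len(cds) % 3 != 0 or start < 0 or count < 0 or end > len(cds):
        raise ValueError("deletion does not fall on whole codons of the CDS")
    return cds[:start] + cds[end:]


def keeps_reading_frame(original: str, mutated: str) -> bool:
    """True if the net length change is a multiple of three nucleotides."""
    return (len(original) - len(mutated)) % 3 == 0


if __name__ == "__main__":
    cds = "ATGGCTAGCTGGTAA"               # Met-Ala-Ser-Trp-stop
    shorter = delete_residues(cds, 2, 2)  # remove Ala and Ser as a pair
    print(shorter, keeps_reading_frame(cds, shorter))
```

A check of this kind addresses only the frame; it does not detect complementary regions that might produce secondary mRNA structure.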

Conservative substitutions are those in which at least one residue in the amino 
acid sequence has been removed and a different residue inserted in its place. Such
substitutions generally are made in accordance with Table 2 when it is desired to
maintain the activity of the protein. Table 2 shows amino acids which can be 
substituted for an amino acid in a protein and which are typically regarded as 
conservative substitutions. 



WO 03/013227 



PCT/US02/25805 



Table 2

Residue     Conservative Substitutions
Ala         Ser
Arg         Lys
Asn         Gln; His
Asp         Glu
Gln         Asn
Cys         Ser
Glu         Asp
Gly         Pro
His         Asn; Gln
Ile         Leu; Val
Leu         Ile; Val
Lys         Arg; Gln
Met         Leu; Ile
Phe         Met; Leu; Tyr
Ser         Thr; Gly
Thr         Ser; Val
Trp         Tyr
Tyr         Trp; Phe
Val         Ile; Leu

Similar substitutions are those in which at least one residue in the amino acid 
sequence has been removed and a different residue inserted in its place. Such 
substitutions generally are made in accordance with Table 3 when it is desired to
maintain the activity of the protein. Table 3 shows amino acids which can be
substituted for an amino acid in a protein and which are typically regarded as
structural and functional substitutions. For example, a residue in column 1 of Table 3
may be substituted with a residue in column 2; in addition, a residue in column 2 of
Table 3 may be substituted with the residue of column 1.



Table 3

Residue     Similar Substitutions
Ala         Ser; Thr; Gly; Val; Leu; Ile
Arg         Lys; His; Gly
Asn         Gln; His; Gly; Ser; Thr
Asp         Glu; Ser; Thr
Gln         Asn; Ala
Cys         Ser; Gly
Glu         Asp
Gly         Pro; Arg
His         Asn; Gln; Tyr; Phe; Lys; Arg
Ile         Ala; Leu; Val; Gly; Met
Leu         Ala; Ile; Val; Gly; Met
Lys         Arg; His; Gln; Gly; Pro
Met         Leu; Ile; Phe
Phe         Met; Leu; Tyr; Trp; His; Val; Ala
Ser         Thr; Gly; Asp; Ala; Val; Ile; His
Thr         Ser; Val; Ala; Gly
Trp         Tyr; Phe; His
Tyr         Trp; Phe; His
Val         Ala; Ile; Leu; Gly; Thr; Ser; Glu
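
The two-way reading of Table 3 (column 1 to column 2, and column 2 back to column 1) can be sketched in Python as follows. The dictionary is a transcription of Table 3 in which OCR-damaged entries are read as Ile, Gln, Tyr and Trp, which is an assumption; the function name is hypothetical.

```python
# Residue -> similar substitutions, transcribed from Table 3.
_TABLE_3 = {
    "Ala": "Ser Thr Gly Val Leu Ile", "Arg": "Lys His Gly",
    "Asn": "Gln His Gly Ser Thr", "Asp": "Glu Ser Thr",
    "Gln": "Asn Ala", "Cys": "Ser Gly", "Glu": "Asp", "Gly": "Pro Arg",
    "His": "Asn Gln Tyr Phe Lys Arg", "Ile": "Ala Leu Val Gly Met",
    "Leu": "Ala Ile Val Gly Met", "Lys": "Arg His Gln Gly Pro",
    "Met": "Leu Ile Phe", "Phe": "Met Leu Tyr Trp His Val Ala",
    "Ser": "Thr Gly Asp Ala Val Ile His", "Thr": "Ser Val Ala Gly",
    "Trp": "Tyr Phe His", "Tyr": "Trp Phe His",
    "Val": "Ala Ile Leu Gly Thr Ser Glu",
}
SIMILAR = {residue: set(subs.split()) for residue, subs in _TABLE_3.items()}


def is_similar_substitution(original: str, replacement: str) -> bool:
    """True if the pair appears in Table 3 in either direction."""
    return (replacement in SIMILAR.get(original, set())
            or original in SIMILAR.get(replacement, set()))


if __name__ == "__main__":
    print(is_similar_substitution("Glu", "Asp"))  # listed in column 2 for Glu
    print(is_similar_substitution("Pro", "Gly"))  # found via the Gly row
    print(is_similar_substitution("Trp", "Glu"))  # not listed either way
```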



Substitutions that are less conservative than those in Table 2 can be selected 
by picking residues that differ more significantly in their effect on maintaining (a) the 
structure of the polypeptide backbone in the area of the substitution, for example, as a 
sheet or helical conformation, (b) the charge or hydrophobicity of the molecule at the 
target site, or (c) the bulk of the side chain. The substitutions which in general are
expected to produce the greatest changes in protein properties will be those in which
(a) a hydrophilic residue, e.g., seryl or threonyl, is substituted for (or by) a 
hydrophobic residue, e.g., leucyl, isoleucyl, phenylalanyl, valyl or alanyl; (b) a 
cysteine or proline is substituted for (or by) any other residue; (c) a residue having an 
electropositive side chain, e.g., lysyl, arginyl, or histidyl, is substituted for (or by) an 
electronegative residue, e.g., glutamyl or aspartyl; or (d) a residue having a bulky side 
chain, e.g., phenylalanine, is substituted for (or by) one not having a side chain, e.g., 
glycine. 
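
The four categories (a) through (d) above can be expressed as a simple rule check. The Python sketch below is illustrative only: the residue groups contain just the examples named in the text, so they are deliberately incomplete, and the function name and reason labels are hypothetical.

```python
# Residue groups limited to the examples given in the text (an assumption).
HYDROPHILIC = {"Ser", "Thr"}
HYDROPHOBIC = {"Leu", "Ile", "Phe", "Val", "Ala"}
ELECTROPOSITIVE = {"Lys", "Arg", "His"}
ELECTRONEGATIVE = {"Glu", "Asp"}
BULKY = {"Phe"}
NO_SIDE_CHAIN = {"Gly"}
CYS_OR_PRO = {"Cys", "Pro"}


def _crosses(a: str, b: str, group_x: set, group_y: set) -> bool:
    """True if one residue is in group_x and the other is in group_y."""
    return (a in group_x and b in group_y) or (a in group_y and b in group_x)


def disruptive_reasons(original: str, replacement: str) -> list:
    """List which of categories (a)-(d) a substitution falls into."""
    if original == replacement:
        return []
    reasons = []
    if _crosses(original, replacement, HYDROPHILIC, HYDROPHOBIC):
        reasons.append("(a) hydrophilic/hydrophobic exchange")
    if original in CYS_OR_PRO or replacement in CYS_OR_PRO:
        reasons.append("(b) cysteine or proline involved")
    if _crosses(original, replacement, ELECTROPOSITIVE, ELECTRONEGATIVE):
        reasons.append("(c) charge reversal")
    if _crosses(original, replacement, BULKY, NO_SIDE_CHAIN):
        reasons.append("(d) bulky side chain exchanged with none")
    return reasons


if __name__ == "__main__":
    for pair in [("Ser", "Leu"), ("Lys", "Glu"), ("Phe", "Gly"), ("Asn", "Gln")]:
        print(pair, disruptive_reasons(*pair))
```

An empty list means only that none of the named examples applies, not that the substitution is known to be conservative.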



VIII. Further Modifying Sequences of the Invention - Mutation/Forced
Evolution

In addition to generating silent or conservative substitutions as noted above,
the present invention optionally includes methods of modifying the sequences of the 
Sequence Listing. In the methods, nucleic acid or protein modification methods are 
used to alter the given sequences to produce new sequences and/or to chemically or 
enzymatically modify given sequences to change the properties of the nucleic acids or 
proteins. 



Thus, in one embodiment, given nucleic acid sequences are modified, e.g., 
according to standard mutagenesis or artificial evolution methods to produce modified 
sequences. The modified sequences may be created using purified natural 
polynucleotides isolated from any organism or may be synthesized from purified 
compositions and chemicals using chemical means well known to those of skill in the
art. For example, Ausubel, supra, provides additional details on mutagenesis 
methods. Artificial forced evolution methods are described, for example, by Stemmer 
(1994) Nature 370:389-391, Stemmer (1994) Proc. Natl. Acad. Sci. USA 91:10747- 
10751, and U.S. Patents 5,811,238, 5,837,500, and 6,242,568. Methods for 
engineering synthetic transcription factors and other polypeptides are described, for 
example, by Zhang et al. (2000) J. Biol. Chem. 275:33850-33860, Liu et al. (2001) J. 
Biol. Chem. 276:11323-11334, and Isalan et al. (2001) Nature Biotechnol 19:656- 
660. Many other mutation and evolution methods are also available and expected to 
be within the skill of the practitioner.

Similarly, chemical or enzymatic alteration of expressed nucleic acids and
polypeptides can be performed by standard methods. For example, a sequence can be
modified by addition of lipids, sugars, peptides, organic or inorganic compounds, by 
the inclusion of modified nucleotides or amino acids, or the like. For example, 
protein modification techniques are illustrated in Ausubel, supra. Further details on 
chemical and enzymatic modifications can be found herein. These modification 
methods can be used to modify any given sequence, or to modify any sequence 
produced by the various mutation and artificial evolution modification methods noted 
herein. 

Accordingly, the invention provides for modification of any given nucleic acid 
by mutation, evolution, chemical or enzymatic modification, or other available 
methods, as well as for the products produced by practicing such methods, e.g., using 
the sequences herein as a starting substrate for the various modification approaches. 

For example, an optimized coding sequence containing codons preferred by a
particular prokaryotic or eukaryotic host can be used, e.g., to increase the rate of
translation or to produce recombinant RNA transcripts having desirable properties, 
such as a longer half-life, as compared with transcripts produced using a non- 
optimized sequence. Translation stop codons can also be modified to reflect host 
preference. For example, preferred stop codons for Saccharomyces cerevisiae and 
mammals are TAA and TGA, respectively. The preferred stop codon for 
monocotyledonous plants is TGA, whereas insects and E. coli prefer to use TAA as 
the stop codon. 
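
The stop codon preferences listed above lend themselves to a simple lookup. The Python sketch below swaps the terminal stop codon of a coding sequence for the one preferred by a named host; the host labels and the function name are hypothetical, and the preferences are taken only from the preceding paragraph.

```python
# Host -> preferred stop codon, as stated in the preceding paragraph.
PREFERRED_STOP = {
    "Saccharomyces cerevisiae": "TAA",
    "mammal": "TGA",
    "monocot": "TGA",
    "insect": "TAA",
    "Escherichia coli": "TAA",
}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def set_preferred_stop(cds: str, host: str) -> str:
    """Replace the terminal stop codon with the host-preferred one."""
    cds = cds.upper()
    if len(cds) < 3 or len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    if cds[-3:] not in STOP_CODONS:
        raise ValueError("coding sequence does not end in a stop codon")
    return cds[:-3] + PREFERRED_STOP[host]


if __name__ == "__main__":
    print(set_preferred_stop("ATGGCTTAG", "monocot"))
```

Sense codons are left untouched here; full codon optimization would also need a host-specific codon usage table, which the text does not provide.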

The polynucleotide sequences of the present invention can also be engineered 
in order to alter a coding sequence for a variety of reasons, including but not limited 
to, alterations which modify the sequence to facilitate cloning, processing and/or 
expression of the gene product. For example, alterations are optionally introduced 
using techniques which are well known in the art, e.g., site-directed mutagenesis, to 
insert new restriction sites, to alter glycosylation patterns, to change codon preference, 
to introduce splice sites, etc.

Furthermore, a fragment or domain derived from any of the polypeptides of
the invention can be combined with domains derived from other transcription factors 
or synthetic domains to modify the biological activity of a transcription factor. For 
instance, a DNA-binding domain derived from a transcription factor of the invention 
can be combined with the activation domain of another transcription factor or with a 
synthetic activation domain. A transcription activation domain assists in initiating 
transcription from a DNA-binding site. Examples include the transcription activation 
region of VP16 or GAL4 (Moore et al. (1998) Proc. Natl. Acad. Sci. USA 95: 376- 
381; and Aoyama et al. (1995) Plant Cell 7:1773-1785), peptides derived from 
bacterial sequences (Ma and Ptashne (1987) Cell 51:113-119) and synthetic peptides
(Giniger and Ptashne, (1987) Nature 330:670-672). 



IX. Expression and Modification of Polypeptides 

Typically, polynucleotide sequences of the invention are incorporated into 
recombinant DNA (or RNA) molecules that direct expression of polypeptides of the 
invention in appropriate host cells, transgenic plants, in vitro translation systems, or 
the like. Due to the inherent degeneracy of the genetic code, nucleic acid sequences 
which encode substantially the same or a functionally equivalent amino acid sequence 
can be substituted for any listed sequence to provide for cloning and expressing the 
relevant homologue. 



X. Vectors, Promoters, and Expression Systems

The present invention includes recombinant constructs comprising one or 
more of the nucleic acid sequences herein. The constructs typically comprise a 
vector, such as a plasmid, a cosmid, a phage, a virus (e.g., a plant virus), a bacterial 
artificial chromosome (BAC), a yeast artificial chromosome (YAC), or the like, into 
which a nucleic acid sequence of the invention has been inserted, in a forward or 
reverse orientation. In a preferred aspect of this embodiment, the construct further 
comprises regulatory sequences, including, for example, a promoter, operably linked 
to the sequence. Large numbers of suitable vectors and promoters are known to those 
of skill in the art, and are commercially available. 

General texts that describe molecular biological techniques useful herein, 
including the use and production of vectors, promoters and many other relevant 
topics, include Berger, Sambrook and Ausubel, supra. Any of the identified sequences 
can be incorporated into a cassette or vector, e.g., for expression in plants. A number of 
expression vectors suitable for stable transformation of plant cells or for the 
establishment of transgenic plants have been described, including those described in 
Weissbach and Weissbach (1989) Methods for Plant Molecular Biology, Academic 
Press, and Gelvin et al. (1990) Plant Molecular Biology Manual, Kluwer Academic 
Publishers. Specific examples include those derived from a Ti plasmid of 
Agrobacterium tumefaciens, as well as those disclosed by Herrera-Estrella et al. 
(1983) Nature 303: 209, Bevan (1984) Nucl Acid Res. 12: 8711-8721, and Klee (1985) 
Bio/Technology 3: 637-642, for dicotyledonous plants. 

Alternatively, non-Ti vectors can be used to transfer the DNA into 
monocotyledonous plants and cells by using free DNA delivery techniques. Such 
methods can involve, for example, the use of liposomes, electroporation, 
microprojectile bombardment, silicon carbide whiskers, and viruses. By using these 
methods transgenic plants such as wheat, rice (Christou (1991) Bio/Technology 9: 
957-962) and corn (Gordon-Kamm (1990) Plant Cell 2: 603-618) can be produced. 
An immature embryo can also be a good target tissue for monocots for direct DNA 
delivery techniques by using the particle gun (Weeks et al. (1993) Plant Physiol 102: 
1077-1084; Vasil (1993) Bio/Technology 10: 667-674; Wan and Lemeaux (1994) 
Plant Physiol 104: 37-48), and for Agrobacterium-mediated DNA transfer (Ishida et al. 
(1996) Nature Biotech 14: 745-750). 

Typically, plant transformation vectors include one or more cloned plant 
coding sequences (genomic or cDNA) under the transcriptional control of 5' and 3' 
regulatory sequences and a dominant selectable marker. Such plant transformation 
vectors typically also contain a promoter (e.g., a regulatory region controlling 
inducible or constitutive, environmentally- or developmentally-regulated, or cell- or 
tissue-specific expression), a transcription initiation start site, an RNA processing 
signal (such as intron splice sites), a transcription termination site, and/or a 
polyadenylation signal. 

Examples of constitutive plant promoters which can be useful for expressing 
the TF sequence include: the cauliflower mosaic virus (CaMV) 35S promoter, which 
confers constitutive, high-level expression in most plant tissues (see, e.g., Odell et al. 
(1985) Nature 313:810-812); the nopaline synthase promoter (An et al. (1988) Plant 
Physiol 88:547-552); and the octopine synthase promoter (Fromm et al. (1989) Plant 
Cell 1:977-984). 



A variety of plant gene promoters that regulate gene expression in response to 
environmental, hormonal, chemical, or developmental signals, or in a tissue-active 
manner, can be used for expression of a TF sequence in plants. Choice of a promoter 
is based largely on the phenotype of interest and is determined by such factors as 
tissue (e.g., seed, fruit, root, pollen, vascular tissue, flower, carpel, etc.), inducibility 
(e.g., in response to wounding, heat, cold, drought, light, pathogens, etc.), timing, 
developmental stage, and the like. Numerous known promoters have been 
characterized and can favorably be employed to promote expression of a 
polynucleotide of the invention in a transgenic plant or cell of interest. For example, 
tissue-specific promoters include: seed-specific promoters (such as the napin, 
phaseolin or DC3 promoter described in US Pat. No. 5,773,697), fruit-specific 
promoters that are active during fruit ripening (such as the dru 1 promoter (US Pat. 
No. 5,783,393), the 2A11 promoter (US Pat. No. 4,943,674), and the tomato 
polygalacturonase promoter (Bird et al. (1988) Plant Mol Biol 11:651)), root-specific 
promoters, such as those disclosed in US Patent Nos. 5,618,988, 5,837,848 and 
5,905,186, pollen-active promoters such as PTA29, PTA26 and PTA13 (US Pat. No. 
5,792,929), promoters active in vascular tissue (Ringli and Keller (1998) Plant Mol 
Biol 37:977-988), flower-specific promoters (Kaiser et al. (1995) Plant Mol Biol 
28:231-243), promoters active in pollen (Baerson et al. (1994) Plant Mol Biol 
26:1947-1959), carpels (Ohl et al. (1990) Plant Cell 2:837-848), or pollen and ovules 
(Baerson et al. (1993) Plant Mol Biol 22:255-267), auxin-inducible promoters (such 
as that described in van der Kop et al. (1999) Plant Mol Biol 39:979-990 or 
Baumann et al. (1999) Plant Cell 11:323-334), cytokinin-inducible promoters 
(Guevara-Garcia (1998) Plant Mol Biol 38:743-753), promoters responsive to 
gibberellin (Shi et al. (1998) Plant Mol Biol 38:1053-1060; Willmott et al. (1998) 
38:817-825) and the like. Additional promoters are those that elicit expression in 
response to heat (Ainley et al. (1993) Plant Mol Biol 22:13-23); light (e.g., the pea 
rbcS-3A promoter, Kuhlemeier et al. (1989) Plant Cell 1:471, and the maize rbcS 
promoter, Schaffner and Sheen (1991) Plant Cell 3:997); wounding (e.g., wun1, 
Siebertz et al. (1989) Plant Cell 1:961); pathogens (such as the PR-1 promoter 
described in Buchel et al. (1999) Plant Mol. Biol. 40:387-396, and the PDF1.2 
promoter described in Manners et al. (1998) Plant Mol. Biol. 38:1071-80); and 
chemicals such as methyl jasmonate or salicylic acid (Gatz et al. (1997) Plant Mol 
Biol 48:89-108). In addition, the timing of the expression can be controlled by using 
promoters such as those acting at senescence (Gan and Amasino (1995) Science 
270:1986-1988) or late seed development (Odell et al. (1994) Plant Physiol 106:447-458). 

Plant expression vectors can also include RNA processing signals that can be 
positioned within, upstream or downstream of the coding sequence. In addition, the 
expression vectors can include additional regulatory sequences from the 
3'-untranslated region of plant genes, e.g., a 3' terminator region to increase mRNA 
stability, such as the PI-II terminator region of potato or the octopine or 
nopaline synthase 3' terminator regions. 

Additional Expression Elements 

Specific initiation signals can aid in efficient translation of coding sequences. 
These signals can include, e.g., the ATG initiation codon and adjacent sequences. In 
cases where a coding sequence, its initiation codon and upstream sequences are 
inserted into the appropriate expression vector, no additional translational control 
signals may be needed. However, in cases where only coding sequence (e.g., a 
mature protein coding sequence), or a portion thereof, is inserted, exogenous 
translational control signals including the ATG initiation codon can be separately 
provided. The initiation codon is provided in the correct reading frame to facilitate 
translation. Exogenous transcriptional elements and initiation codons can be of 
various origins, both natural and synthetic. The efficiency of expression can be 
enhanced by the inclusion of enhancers appropriate to the cell system in use. 
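
The reading-frame requirement for a separately provided initiation codon can be illustrated with a short Python sketch. It assumes a hypothetical construct in which an exogenous ATG and a short linker sit upstream of an inserted coding fragment; the function name, the sequences, and the linker are illustrative assumptions only. The check passes when the distance from the ATG to the first codon of the fragment is a multiple of three and no stop codon intervenes.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def start_codon_in_frame(construct: str, cds_start: int) -> bool:
    """Return True if the first ATG in `construct` is in frame with the
    coding fragment that begins at 0-based index `cds_start`, with no
    stop codon between the ATG and the fragment."""
    atg = construct.find("ATG")
    if atg == -1 or atg > cds_start:
        return False
    if (cds_start - atg) % 3 != 0:
        return False
    return all(
        construct[i:i + 3] not in STOP_CODONS
        for i in range(atg, cds_start, 3)
    )


# Hypothetical coding fragment lacking its own initiation codon.
fragment = "GCTCTTAAA"

# Exogenous ATG plus a 3-base linker: fragment stays in frame.
in_frame = "CC" + "ATG" + "GGA" + fragment
# Exogenous ATG plus a 4-base linker: fragment is shifted out of frame.
shifted = "CC" + "ATG" + "GGAC" + fragment

in_frame_ok = start_codon_in_frame(in_frame, len("CC" + "ATG" + "GGA"))
shifted_ok = start_codon_in_frame(shifted, len("CC" + "ATG" + "GGAC"))
print(in_frame_ok, shifted_ok)
```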

Expression Hosts 

The present invention also relates to host cells which are transduced with 
vectors of the invention, and the production of polypeptides of the invention 
(including fragments thereof) by recombinant techniques. Host cells are genetically 
engineered (i.e., nucleic acids are introduced, e.g., transduced, transformed or 
transfected) with the vectors of this invention, which may be, for example, a cloning 
vector or an expression vector comprising the relevant nucleic acids herein. The 
vector is optionally a plasmid, a viral particle, a phage, a naked nucleic acid, etc. The 
engineered host cells can be cultured in conventional nutrient media modified as 
appropriate for activating promoters, selecting transformants, or amplifying the 
relevant gene. The culture conditions, such as temperature, pH and the like, are those 
previously used with the host cell selected for expression, and will be apparent to 
those skilled in the art and in the references cited herein, including Sambrook and 
Ausubel. 



The host cell can be a eukaryotic cell, such as a yeast cell or a plant cell, or 
the host cell can be a prokaryotic cell, such as a bacterial cell. Plant protoplasts are 
also suitable for some applications. For example, the DNA fragments are introduced 
into plant tissues, cultured plant cells or plant protoplasts by standard methods 
including electroporation (Fromm et al. (1985) Proc. Natl. Acad. Sci. USA 82:5824), 
infection by viral vectors such as cauliflower mosaic virus (CaMV) (Hohn et al. 
(1982) Molecular Biology of Plant Tumors (Academic Press, New York) pp. 549-560; 
US 4,407,956), high velocity ballistic penetration by small particles with the 
nucleic acid either within the matrix of small beads or particles, or on the surface 
(Klein et al. (1987) Nature 327:70-73), use of pollen as vector (WO 85/01856), or 
use of Agrobacterium tumefaciens or A. rhizogenes carrying a T-DNA plasmid in 
which DNA fragments are cloned. The T-DNA plasmid is transmitted to plant cells 
upon infection by Agrobacterium tumefaciens, and a portion is stably integrated into 
the plant genome (Horsch et al. (1984) Science 233:496-498; Fraley et al. (1983) 
Proc. Natl. Acad. Sci. USA 80:4803). 

The cell can include a nucleic acid of the invention which encodes a 
polypeptide, wherein the cell expresses a polypeptide of the invention. The cell can 
also include vector sequences, or the like. Furthermore, cells and transgenic plants 
that include any polypeptide or nucleic acid above or throughout this specification, 
e.g., produced by transduction of a vector of the invention, are an additional feature of 
the invention. 



For long-term, high-yield production of recombinant proteins, stable 
expression can be used. Host cells transformed with a nucleotide sequence encoding 
a polypeptide of the invention are optionally cultured under conditions suitable for the 
expression and recovery of the encoded protein from cell culture. The protein or 
fragment thereof produced by a recombinant cell may be secreted, membrane-bound, 
or contained intracellularly, depending on the sequence and/or the vector used. As 
will be understood by those of skill in the art, expression vectors containing 
polynucleotides encoding mature proteins of the invention can be designed with 
signal sequences which direct secretion of the mature polypeptides through a 
prokaryotic or eukaryotic cell membrane. 

XI. Modified Amino Acid Residues 

Polypeptides of the invention may contain one or more modified amino acid 
residues. The presence of modified amino acids may be advantageous in, for 
example, increasing polypeptide half-life, reducing polypeptide antigenicity or 
toxicity, increasing polypeptide storage stability, or the like. Amino acid residue(s) 
are modified, for example, co-translationally or post-translationally during 
recombinant production or modified by synthetic or chemical means. 

Non-limiting examples of a modified amino acid residue include incorporation 
or other use of acetylated amino acids, glycosylated amino acids, sulfated amino 
acids, prenylated (e.g., farnesylated, geranylgeranylated) amino acids, PEG modified 
(e.g., "PEGylated") amino acids, biotinylated amino acids, carboxylated amino acids, 
phosphorylated amino acids, etc. References adequate to guide one of skill in the 
modification of amino acid residues are replete throughout the literature. 

The modified amino acid residues may prevent or increase affinity of the 
polypeptide for another molecule, including, but not limited to, polynucleotides, 
proteins, carbohydrates, lipids and lipid derivatives, and other organic or synthetic 
compounds. 

XII. Identification of Additional Factors 

A transcription factor provided by the present invention can also be used to 
identify additional endogenous or exogenous molecules that can affect a phenotype or 
trait of interest. On the one hand, such molecules include organic (small or large 
molecules) and/or inorganic compounds that affect expression of (i.e., regulate) a 
particular transcription factor. Alternatively, such molecules include endogenous 
molecules that are acted upon at a transcriptional level by a transcription factor 
of the invention to modify a phenotype as desired. For example, the transcription 
factors can be employed to identify one or more downstream genes that are 
subject to a regulatory effect of the transcription factor. In one approach, a 
transcription factor or transcription factor homologue of the invention is expressed in 
a host cell, e.g., a transgenic plant cell, tissue or explant, and expression products, 
either RNA or protein, of likely or random targets are monitored, e.g., by 
hybridization to a microarray of nucleic acid probes corresponding to genes expressed 
in a tissue or cell type of interest, by two-dimensional gel electrophoresis of protein 
products, or by any other method known in the art for assessing expression of gene 
products at the level of RNA or protein. Alternatively, a transcription factor of the 
invention can be used to identify promoter sequences (i.e., binding sites) involved in 
the regulation of a downstream target. After identifying a promoter sequence, 
interactions between the transcription factor and the promoter sequence can be 
modified by changing specific nucleotides in the promoter sequence or specific amino 
acids in the transcription factor that interact with the promoter sequence to alter a 
plant trait. Typically, transcription factor DNA-binding sites are identified by gel 
shift assays. After identifying the promoter regions, the promoter region sequences 
can be employed in double-stranded DNA arrays to identify molecules that affect the 
interactions of the transcription factors with their promoters (Bulyk et al. (1999) 
Nature Biotechnology 17:573-577). 



The identified transcription factors are also useful to identify proteins that 
modify the activity of the transcription factor. Such modification can occur by 
covalent modification, such as by phosphorylation, or by protein-protein (homo- or 
heteropolymer) interactions. Any method suitable for detecting protein-protein 
interactions can be employed. Among the methods that can be employed are 
co-immunoprecipitation, cross-linking and co-purification through gradients or 
chromatographic columns, and the two-hybrid yeast system. 

The two-hybrid system detects protein interactions in vivo and is described in 
Chien et al. ((1991), Proc. Natl. Acad. Sci. USA 88:9578-9582) and is commercially 
available from Clontech (Palo Alto, Calif.). In such a system, plasmids are 
constructed that encode two hybrid proteins: one consists of the DNA-binding domain 
of a transcription activator protein fused to the TF polypeptide and the other consists 
of the transcription activator protein's activation domain fused to an unknown protein 
that is encoded by a cDNA that has been recombined into the plasmid as part of a 
cDNA library. The DNA-binding domain fusion plasmid and the cDNA library are 
transformed into a strain of the yeast Saccharomyces cerevisiae that contains a 
reporter gene (e.g., lacZ) whose regulatory region contains the transcription activator's 
binding site. Either hybrid protein alone cannot activate transcription of the reporter 
gene. Interaction of the two hybrid proteins reconstitutes the functional activator 
protein and results in expression of the reporter gene, which is detected by an assay 
for the reporter gene product. Then, the library plasmids responsible for reporter gene 
expression are isolated and sequenced to identify the proteins encoded by the library 
plasmids. After identifying proteins that interact with the transcription factors, assays 
for compounds that interfere with the TF protein-protein interactions can be 
performed. 
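
The logic of the two-hybrid screen described above can be sketched as a toy model in Python. It is a simplification under stated assumptions: the protein names ("TF", "ProteinA", and so on) and the single interaction are hypothetical, and reporter expression is reduced to a yes/no outcome. The bait is the DNA-binding domain fusion and each prey is an activation domain fusion from the cDNA library; the reporter is on only when bait and prey interact.

```python
from typing import FrozenSet, List, Optional, Set


def reporter_on(bait: str, prey: Optional[str],
                interactions: Set[FrozenSet[str]]) -> bool:
    """Reporter is expressed only when the DNA-binding-domain fusion (bait)
    and the activation-domain fusion (prey) physically interact.
    A prey of None models the bait hybrid being present alone."""
    if prey is None:
        return False
    return frozenset((bait, prey)) in interactions


def screen_library(bait: str, library: List[str],
                   interactions: Set[FrozenSet[str]]) -> List[str]:
    """Return the library members whose fusion switches the reporter on."""
    return [prey for prey in library if reporter_on(bait, prey, interactions)]


# Hypothetical data: one known interaction and a three-member cDNA library.
INTERACTIONS = {frozenset(("TF", "ProteinB"))}
LIBRARY = ["ProteinA", "ProteinB", "ProteinC"]

hits = screen_library("TF", LIBRARY, INTERACTIONS)
print(hits)
```

In an actual screen the hits would correspond to the library plasmids that are isolated and sequenced.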

XIII. Identification of Modulators 

In addition to the intracellular molecules described above, extracellular 
molecules that alter activity or expression of a transcription factor, either directly or 
indirectly, can be identified. For example, the methods can entail first placing a 
candidate molecule in contact with a plant or plant cell. The molecule can be 
introduced by topical administration, such as spraying or soaking of a plant, and 
the molecule's effect on the expression or activity of the TF polypeptide or the 
expression of the polynucleotide is then monitored. Changes in the expression of the TF 
polypeptide can be monitored by use of polyclonal or monoclonal antibodies, gel 
electrophoresis or the like. Changes in the expression of the corresponding 
polynucleotide sequence can be detected by use of microarrays, Northerns, 
quantitative PCR, or any other technique for monitoring changes in mRNA 
expression. These techniques are exemplified in Ausubel et al. (eds) Current 
Protocols in Molecular Biology, John Wiley & Sons (1998, and supplements through 
2001). Such changes in the expression levels can be correlated with modified plant 
traits and thus identified molecules can be useful for soaking or spraying on fruit, 
vegetable and grain crops to modify traits in plants. 

Essentially any available composition can be tested for modulatory activity of 
expression or activity of any nucleic acid or polypeptide herein. Thus, available 
libraries of compounds such as chemicals, polypeptides, nucleic acids and the like can 
be tested for modulatory activity. Often, potential modulator compounds can be 
dissolved in aqueous or organic (e.g., DMSO-based) solutions for easy delivery to the 
cell or plant of interest in which the activity of the modulator is to be tested. 
Optionally, the assays are designed to screen large modulator composition libraries by 
automating the assay steps and providing compounds from any convenient source to 
assays, which are typically run in parallel (e.g., in microtiter formats on microtiter 
plates in robotic assays). 

In one embodiment, high throughput screening methods involve providing a 
combinatorial library containing a large number of potential compounds (potential 
modulator compounds). Such "combinatorial chemical libraries" are then screened in 
one or more assays, as described herein, to identify those library members (particular 
chemical species or subclasses) that display a desired characteristic activity. The 
compounds thus identified can serve as target compounds. 

A combinatorial chemical library can be, e.g., a collection of diverse chemical 
compounds generated by chemical synthesis or biological synthesis. For example, a 
combinatorial chemical library such as a polypeptide library is formed by combining a 
set of chemical building blocks (e.g., in one example, amino acids) in every possible 
way for a given compound length (i.e., the number of amino acids in a polypeptide 
compound of a set length). Exemplary libraries include peptide libraries, nucleic acid 
libraries, antibody libraries (see, e.g., Vaughn et al. (1996) Nature Biotechnology 
14(3):309-314 and PCT/US96/10287), carbohydrate libraries (see, e.g., Liang et al. 
(1996) Science 274:1520-1522 and U.S. Patent 5,593,853), peptide nucleic acid 
libraries (see, e.g., U.S. Patent 5,539,083), and small organic molecule libraries (see, 
e.g., benzodiazepines, Baum C&EN Jan 18, page 33 (1993); isoprenoids, U.S. Patent 
5,569,588; thiazolidinones and metathiazanones, U.S. Patent 5,549,974; pyrrolidines, 
U.S. Patents 5,525,735 and 5,519,134; morpholino compounds, U.S. Patent 
5,506,337) and the like. 

Preparation and screening of combinatorial or other libraries is well known to 
those of skill in the art. Such combinatorial chemical libraries include, but are not 
limited to, peptide libraries (see, e.g., U.S. Patent 5,010,175; Furka (1991) Int. J. 
Pept. Prot. Res. 37:487-493; and Houghton et al. (1991) Nature 354:84-88). Other 
chemistries for generating chemical diversity libraries can also be used. 
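
To give a sense of scale for a library formed by combining building blocks in every possible way for a given length, the following Python sketch enumerates a peptide library from the 20 standard amino acids. The function names are hypothetical, and restricting the alphabet to the standard residues is an assumption made for illustration; it is not a description of any particular commercial library.

```python
from itertools import product

# One-letter codes of the 20 standard amino acids (the building blocks).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def library_size(n_building_blocks: int, length: int) -> int:
    """Number of distinct linear compounds of the given length."""
    return n_building_blocks ** length


def enumerate_library(building_blocks: str, length: int):
    """Yield every sequence of `length` drawn from `building_blocks`."""
    for combo in product(building_blocks, repeat=length):
        yield "".join(combo)


dipeptides = list(enumerate_library(AMINO_ACIDS, 2))
print(len(dipeptides))                       # all dipeptides
print(library_size(len(AMINO_ACIDS), 5))     # pentapeptides, counted only
```

Because the size grows as a power of the compound length, longer libraries are counted here rather than enumerated, which is one reason automated, parallel screening formats are used.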

In addition, as noted, compound screening equipment for high-throughput 
screening is generally available, e.g., using any of a number of well known robotic 
systems that have also been developed for solution phase chemistries useful in assay 
systems. These systems include automated workstations including an automated 
synthesis apparatus and robotic systems utilizing robotic arms. Any of the above 
devices are suitable for use with the present invention, e.g., for high-throughput 
screening of potential modulators. The nature and implementation of modifications to 
these devices (if any) so that they can operate as discussed herein will be apparent to 
persons skilled in the relevant art. 

Indeed, entire high throughput screening systems are commercially available. 
These systems typically automate entire procedures including all sample and reagent 
pipetting, liquid dispensing, timed incubations, and final readings of the microplate in 
detector(s) appropriate for the assay. These configurable systems provide high 
throughput and rapid start up as well as a high degree of flexibility and customization. 
Similarly, microfluidic implementations of screening are also commercially available. 

The manufacturers of such systems provide detailed protocols for the various high 
throughput systems. Thus, for example, Zymark Corp. provides technical bulletins describing 
screening systems for detecting the modulation of gene transcription, ligand binding, 
and the like. The integrated systems herein, in addition to providing for sequence 
alignment and, optionally, synthesis of relevant nucleic acids, can include such 
screening apparatus to identify modulators that have an effect on one or more 
polynucleotides or polypeptides according to the present invention. 

In some assays it is desirable to have positive controls to ensure that the 
components of the assays are working properly. At least two types of positive 
controls are appropriate. That is, known transcriptional activators or inhibitors can be 
incubated with cells, plants, etc. in one sample of the assay, and the resulting 
increase or decrease in transcription can be detected by measuring the resulting change 
in RNA or protein expression, etc., according to the methods herein. It will be 
appreciated that modulators can also be combined with transcriptional activators or 
inhibitors to find modulators that inhibit transcriptional activation or transcriptional 
repression. Either expression of the nucleic acids and proteins herein or any 
additional nucleic acids or proteins activated by the nucleic acids or proteins herein, 
or both, can be monitored. 
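
One way to use such positive controls when scoring an assay is sketched below in Python. Everything numeric here is an assumption made for illustration: the expression readings are invented, the untreated baseline sample is an added assumption, and the two-fold threshold is an arbitrary cutoff rather than a value taken from this disclosure. The sketch computes fold change relative to the baseline and confirms that the known activator and known inhibitor behave as expected before a test compound is called.

```python
def fold_change(treated: float, untreated: float) -> float:
    """Expression in the treated sample relative to the untreated sample."""
    if untreated <= 0:
        raise ValueError("untreated signal must be positive")
    return treated / untreated


def classify(fc: float, threshold: float = 2.0) -> str:
    """Call activation, repression, or no change at a hypothetical cutoff."""
    if fc >= threshold:
        return "activation"
    if fc <= 1.0 / threshold:
        return "repression"
    return "no change"


# Hypothetical expression readings (arbitrary units).
BASELINE = 100.0
readings = {
    "known_activator": 400.0,   # positive control for activation
    "known_inhibitor": 40.0,    # positive control for repression
    "test_compound": 110.0,
}

calls = {name: classify(fold_change(value, BASELINE))
         for name, value in readings.items()}

# The assay is treated as valid only if both positive controls respond.
assay_valid = (calls["known_activator"] == "activation"
               and calls["known_inhibitor"] == "repression")
print(calls, assay_valid)
```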



In an embodiment, the invention provides a method for identifying 
compositions that modulate the activity or expression of a polynucleotide or 
polypeptide of the invention. For example, a test compound, whether a small or large 
molecule, is placed in contact with a cell, plant (or plant tissue or explant), or 
composition comprising the polynucleotide or polypeptide of interest and a resulting 
effect on the cell, plant (or tissue or explant), or composition is evaluated by 
monitoring, either directly or indirectly, one or more of: expression level of the 
polynucleotide or polypeptide, activity (or modulation of the activity) of the 
polynucleotide or polypeptide. In some cases, an alteration in a plant phenotype can 
be detected following contact of a plant (or plant cell, or tissue or explant) with the 
putative modulator, e.g., by modulation of expression or activity of a polynucleotide 
or polypeptide of the invention. Modulation of expression or activity of a 
polynucleotide or polypeptide of the invention may also be caused by molecular 
elements in a signal transduction second messenger pathway and such modulation can 
affect similar elements in the same or another signal transduction second messenger 
pathway. 

XIV. Subsequences 

Also contemplated are uses of polynucleotides, also referred to herein as 
oligonucleotides, typically having at least 12 bases, preferably at least 15, more 
preferably at least 20, 30, or 50 bases, which hybridize under at least highly stringent 
(or ultra-high stringent or ultra-ultra-high stringent) conditions to a 
polynucleotide sequence described above. The polynucleotides may be used as 
probes, primers, sense and antisense agents, and the like, according to methods as 
noted supra. 

Subsequences of the polynucleotides of the invention, including 
polynucleotide fragments and oligonucleotides are useful as nucleic acid probes and 
primers. An oligonucleotide suitable for use as a probe or primer is at least about 15 
nucleotides in length, more often at least about 18 nucleotides, often at least about 21 
nucleotides, frequently at least about 30 nucleotides, or about 40 nucleotides, or more 
in length. A nucleic acid probe is useful in hybridization protocols, e.g., to identify 
additional polypeptide homologues of the invention, including protocols for 
microarray experiments. Primers can be annealed to a complementary target DNA 
strand by nucleic acid hybridization to form a hybrid between the primer and the 
target DNA strand, and then extended along the target DNA strand by a DNA 
polymerase enzyme. Primer pairs can be used for amplification of a nucleic acid 
sequence, e.g., by the polymerase chain reaction (PCR) or other nucleic-acid 
amplification methods. See Sambrook and Ausubel, supra. 
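
The relationship between a primer and its complementary target strand can be shown with a brief Python sketch. The target sequence is hypothetical, the helper names are illustrative, and annealing is modeled as perfect complementarity only, which ignores the mismatch tolerance and stringency conditions that govern real hybridization.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def anneals(primer: str, target_strand: str) -> bool:
    """True if the primer is perfectly complementary to a site on the target."""
    return reverse_complement(primer) in target_strand


def meets_min_length(oligo: str, min_len: int = 15) -> bool:
    """Check the lower length bound stated for a probe or primer."""
    return len(oligo) >= min_len


# Hypothetical 30-nucleotide target strand (5' to 3').
target = "GATTACAGGCTTAACCGGTTACGATCCGTA"

# An 18-nucleotide primer designed against positions 5..22 of the target.
primer = reverse_complement(target[5:23])

print(len(primer), anneals(primer, target), meets_min_length(primer))
```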

In addition, the invention includes an isolated or recombinant polypeptide 
including a subsequence of at least about 15 contiguous amino acids encoded by the 
recombinant or isolated polynucleotides of the invention. For example, such 
polypeptides, or domains or fragments thereof, can be used as immunogens, e.g., to 
produce antibodies specific for the polypeptide sequence, or as probes for detecting a 
sequence of interest. A subsequence can range in size from about 15 amino acids in 
length up to and including the full length of the polypeptide. 
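
The size range just described, from about 15 contiguous amino acids up to the full length, can be made concrete with a short Python sketch. The 20-residue sequence is hypothetical and the function name is illustrative; the sketch simply enumerates every contiguous window of at least the minimum length.

```python
def contiguous_subsequences(protein: str, min_len: int = 15):
    """Yield every contiguous subsequence of at least `min_len` residues,
    shortest first, ending with the full-length sequence."""
    n = len(protein)
    for length in range(min_len, n + 1):
        for start in range(n - length + 1):
            yield protein[start:start + length]


# Hypothetical polypeptide (one-letter amino acid codes).
polypeptide = "MKTAYIAKQRQISFVKSHFS"

subsequences = list(contiguous_subsequences(polypeptide))
print(len(subsequences), subsequences[-1] == polypeptide)
```

Whether any given subsequence falls within the invention still depends on the functional criterion, which sequence enumeration alone cannot assess.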

To be encompassed by the present invention, an expressed polypeptide which 
comprises such a polypeptide subsequence performs at least one biological function 
of the intact polypeptide in substantially the same manner, or to a similar extent, as 
does the intact polypeptide. For example, a polypeptide fragment can comprise a 
recognizable structural motif or functional domain such as a DNA binding domain 
that binds to a specific DNA promoter region, an activation domain or a domain for 
protein-protein interactions. 

XV. Production of Transgenic Plants 

Modification of Traits 

The polynucleotides of the invention are favorably employed to produce 
transgenic plants with various traits, or characteristics, that have been modified in a 
desirable manner, e.g., to improve the seed characteristics of a plant. For example, 
alteration of expression levels or patterns (e.g., spatial or temporal expression 
patterns) of one or more of the transcription factors (or transcription factor 
homologues) of the invention, as compared with the levels of the same protein found 
in a wild type plant, can be used to modify a plant's traits. An illustrative example of 
trait modification (improved characteristics) by altering expression levels of a 
particular transcription factor is described further in the Examples and the Sequence 
Listing. 

Arabidopsis as a model system 

Arabidopsis thaliana is the object of rapidly growing attention as a model for 
genetics and metabolism in plants. Arabidopsis has a small genome, and well 
documented studies are available. It is easy to grow in large numbers and mutants 
defining important genetically controlled mechanisms are either available, or can 
readily be obtained. Various methods to introduce and express isolated homologous 
genes are available (see Koncz et al., eds. (1992) Methods in Arabidopsis Research, 
World Scientific, New Jersey, in "Preface"). Because of its small 
size, short life cycle, obligate autogamy and high fertility, Arabidopsis is also a 
choice organism for the isolation of mutants and studies in morphogenetic and 
developmental pathways, and control of these pathways by transcription factors (Koncz, 
supra, p. 72). A number of studies introducing transcription factors into A. thaliana 
have demonstrated the utility of this plant for understanding the mechanisms of gene 
regulation and trait alteration in plants (see, for example, Koncz, supra, and U.S. 
Patent Number 6,417,428). 

Arabidopsis genes in transgenic plants. 

Expression of genes that encode transcription factors that modify expression of 
endogenous genes, polynucleotides, and proteins is well known in the art. In 
addition, transgenic plants comprising isolated polynucleotides encoding transcription 
factors may also modify expression of endogenous genes, polynucleotides, and 
proteins. Examples include Peng et al. (1997, Genes and Development 11:3194-3205) 
and Peng et al. (1999, Nature 400:256-261). In addition, many others have 
demonstrated that an Arabidopsis transcription factor expressed in an exogenous plant 
species elicits the same or very similar phenotypic response. See, for example, Fu et 
al. (2001, Plant Cell 13:1791-1802); Nandi et al. (2000, Curr. Biol. 10:215-218); 
Coupland (1995, Nature 377:482-483); and Weigel and Nilsson (1995, Nature 
377:482-500). 

Homologous genes introduced into transgenic plants. 

Homologous genes that may be derived from any plant, or from any source 
whether natural, synthetic, semi-synthetic or recombinant, and that share significant 
sequence identity or similarity to those provided by the present invention, may be 
introduced into plants, for example, crop plants, to confer desirable or improved traits. 
Consequently, transgenic plants may be produced that comprise a recombinant 
expression vector or cassette with a promoter operably linked to one or more 
sequences homologous to presently disclosed sequences. The promoter may be, for 
example, a plant or viral promoter. 

The invention thus provides for methods for preparing transgenic plants, and 
for modifying plant traits. These methods include introducing into a plant a 
recombinant expression vector or cassette comprising a functional promoter operably 
linked to one or more sequences homologous to presently disclosed sequences. Plants 
and kits for producing these plants that result from the application of these methods 
are also encompassed by the present invention. 

The complete descriptions of the traits associated with each polynucleotide of 
the invention are fully disclosed in Table 4, Table 5, and Table 6. 

Traits of interest 

Examples of some of the traits that may be desirable in plants, and that may be 
provided by transforming the plants with the presently disclosed sequences, are listed 
in Table 6. 



Table 6. Genes, traits and utilities that affect plant characteristics 



Columns: Trait category; Traits; Transcription factor genes that impact traits; 
Utility (gene effect on) 




Trait category: Resistance and tolerance 
Trait: Salt stress resistance 
Transcription factor genes: G22; G196; G226; G303; G312; G325; G353; G482; 
G545; G801; G867; G884; G922; G926; G1452; G1794; G1820; G1836; G1843; G1863; 
G2053; G2110; G2140; G2153; G2379; G2701; G2713; G2719; G2789 
Utility: Germination rate, survivability, yield; extended growth range 




Trait: Osmotic stress resistance 
Transcription factor genes: [entries illegible in source]; G325; G353; G489; G502; 
G526; G921; G922; G926; G1069; G1089; G1452; G1794; G1930; G2140; G2153; G2379; 
G2701; G2719; G2789 
Utility: Germination rate, survivability, yield 




Trait: Cold stress resistance; cold germination 
Transcription factor genes: G256; G394; G664; G864; G1322; G2130 
Utility: Germination, growth, earlier planting 




Trait: Tolerance to freezing 
Transcription factor genes: G303; G325; G353; G720; G912; G913; G1794; G2053; 
G2140; G2153; G2379; G2701; G2719; G2789 
Utility: Survivability, yield, appearance, extended range 




Trait: Heat stress resistance 
Transcription factor genes: G3; G464; G682; G864; G964; G1305; G1645; G2130; G2430 
Utility: Germination, growth, later planting 




Trait: Drought, low humidity resistance 
Transcription factor genes: G303; G325; G353; G720; G912; G926; G1452; G1794; 
G1820; G1843; G2053; G2140; G2153; G2379; G2583; G2701; G2719; G2789 
Utility: Survivability, yield, extended range 




Trait: Radiation resistance 
Transcription factor genes: G1052 
Utility: Survivability, vigor, appearance 




Trait: Decreased herbicide sensitivity 
Transcription factor genes: G343; G2133; G2517 
Utility: Resistance to increased herbicide use 




Trait: Increased herbicide sensitivity 
Transcription factor genes: G374; G877; G1519 
Utility: Use as a herbicide target 




Trait: Oxidative stress 
Transcription factor genes: G477; G789; G1807; G2133; G2517 
Utility: Improved yield, appearance, reduced senescence 




Trait: Light response 
Transcription factor genes: G183; G354; G375; G1062; G1322; G1331; G1488; G1494; 
G1521; G1786; G1794; G2144; G2555 
Utility: Germination, growth, development, flowering time 




Trait category: Development, morphology 
Trait: Overall plant architecture 
Transcription factor genes: G24; G27; G31; G33; G47; G147; G156; G160; G182; 
G187; G195; G196; G211; G221; G237; G280; G342; G352; G357; G358; G360; G362; 
G364; G365; G367; G373; G377; G396; G431; G447; G479; G546; G546; G551; G578; 
G580; G596; G615; G617; G620; G625; G638; G658; G716; G725; G727; G730; G740; 
G770; G858; G865; G869; G872; G904; G910; G912; G920; G939; G963; G977; G979; 
G987; G988; G993; G1007; G1010; G1014; G1035; G1046; G1049; G1062; G1069; 
G1070; G1076; G1089; G1093; G1127; G1131; G1145; G1229; G1246; G1304; G1318; 
G1320; G1330; G1331; G1352; G1354; G1360; G1364; G1379; G1384; G1399; G1415; 
G1417; G1442; G1453; G1454; G1459; G1460; G1471; G1475; G1477; G1487; G1487; 
G1492; G1499; G1499; G1531; G1540; G1543; G1543; G1544; G1548; G1584; G1587; 
G1588; G1589; G1636; G1642; G1747; G1749; G1749; G1751; G1752; G1763; G1766; 
G1767; G1778; G1789; G1790; G1791; G1793; G1794; G1795; G1800; G1806; G1811; 
G1835; G1836; G1838; G1839; G1843; G1853; G1855; G1865; G1881; G1882; G1883; 
G1884; G1891; G1896; G1898; G1902; G1904; G1906; G1913; G1914; G1925; G1929; 
G1930; G1954; G1958; G1965; G1976; G2057; G2107; G2133; G2134; G2151; G2154; 
G2157; G2181; G2290; G2299; G2340; G2340; G2346; G2373; G2376; G2424; G2465; 
G2505; G2509; G2512; G2513; G2519; G2520; G2533; G2534; G2573; G2589; G2687; 
G2720; G2787; G2789; G2893 
Utility: Vascular tissues, lignin content; cell wall content; appearance 






Trait: Size: increased stature 
Transcription factor genes: G189; G1073; G1435; G2430 






Trait: Size: reduced stature or dwarfism 
Transcription factor genes: G3; G5; G21; G23; G39; G165; G184; G194; G258; G280; 
G340; G343; G353; G354; G362; G363; G370; G385; G396; G439; G440; G447; G450; 
G550; G557; G599; G636; G652; G670; G671; G674; G729; G760; G804; G831; G864; 
G884; G898; G900; G912; G913; G922; G932; G937; G939; G960; G962; G977; G991; 
G1000; G1008; G1020; G1023; G1053; G1067; G1075; G1137; G1181; G1198; G1228; 
G1266; G1267; G1275; G1277; G1309; G1311; G1314; G1317; G1322; G1323; G1326; 
G1332; G1334; G1367; G1381; G1382; G1386; G1421; G1488; G1494; G1537; G1545; 
G1560; G1586; G1641; G1652; G1655; G1671; G1750; G1756; G1757; G1782; G1786; 
G1794; G1839; G1845; G1879; G1886; G1888; G1933; G1939; G1943; G1944; G2011; 
G2094; G2115; G2130; G2132; G2144; G2145; G2147; G2156; G2294; G2313; G2344; 
G2431; G2510; G2517; G2521; G2893; G2893 
Utility: Ornamental; small stature provides wind resistance; creation of dwarf 
varieties 


Trait: Fruit size and number 
Transcription factor genes: G362 
Utility: Biomass, yield, cotton boll fiber density 




Trait: Flower structure, inflorescence 
Transcription factor genes: G47; G259; G353; G354; G671; G732; G988; G1000; 
G1063; G1140; G1326; G1449; G1543; G1560; G1587; G1645; G1947; G2108; G2143; 
G2893 
Utility: Ornamental horticulture; production of saffron or other edible flowers 




Trait: Number and development of trichomes 
Transcription factor genes: G225; G226; G247; G362; G585; G634; G676; G682; 
G1014; G1332; G1452; G1795; G2105 
Utility: Resistance to pests and desiccation; essential oil production 




Trait: Seed size, color, and number 
Transcription factor genes: G156; G450; G584; G652; G668; G858; G979; G1040; 
G1062; G1145; G1255; G1494; G1531; G1534; G1594; G2105; G2114 
Utility: Yield 




Trait: Root development, modifications 
Transcription factor genes: G9; G1482; G1534; G1794; G1852; G2053; G2136; G2140 






Trait: Modifications to root hairs 
Transcription factor genes: G225; G226 
Utility: Nutrient, water uptake, pathogen resistance 




Trait: Apical dominance 
Transcription factor genes: G559; G732; G1255; G1275; G1411; G1488; G1635; 
G2452; G2509 
Utility: Ornamental horticulture 




Trait: Branching patterns 
Transcription factor genes: G568; G988; G1548 
Utility: Ornamental horticulture, knot reduction, improved windscreen 




Trait: Leaf shape, color, modifications 
Transcription factor genes: G375; G377; G428; G433; G447; G464; G557; G577; 
G599; G635; G671; G67[illegible in source]; G736; G804; G903; G977; G921; G922; 
G1038; G1063; G1067; G1073; G1075; G1146; G1152; G1198; G1267; G1269; G1452; 
G1484; G1586; G1594; G1767; G1786; G1792; G1886; G2059; G2094; 
[two entries illegible in source]; G2117; G2143; G2144; G2431; G2452; G2465; 
G2587; G2583; G2724 
Utility: Appealing shape or shiny leaves for ornamental agriculture, increased 
biomass or photosynthesis 




Trait: Silique 
Transcription factor genes: G1134 
Utility: Ornamental 




Trait: Stem morphology 
Transcription factor genes: G47; G438; G671; G748; G988; G1000 
Utility: Ornamental; digestibility 




Trait: Shoot modifications 
Transcription factor genes: G390; G391 
Utility: Ornamental stem bifurcations 




Trait category: Disease, pathogen resistance 
Trait: Bacterial 
Transcription factor genes: G211; G347; G367; G418; G525; G545; G578; G1049 
Utility: Yield, appearance, survivability, extended range 




Trait: Fungal 
Transcription factor genes: G19; G28; G28; G28; G147; G188; G207; G211; G237; 
G248; G278; G347; G367; G371; G378; G409; G477; G545; G545; G558; G569; G578; 
G591; G594; G616; G789; G805; G812; G865; G869; G872; G881; G896; G940; G1047; 
G1049; G1064; G1084; G1196; G1255; G1266; G1363; G1514; G1756; G1792; 
[four entries illegible in source]; G1919; G1919; G1927; G1927; G1936; G1936; 
G1950; G2069; G2130; G2380; G2380; G2555 
Utility: Yield, appearance, survivability, extended range 






Trait category: Nutrients 
Trait: Increased tolerance to nitrogen-limited soils 
Transcription factor genes: G225; G226; G1792 






Trait: Increased tolerance to phosphate-limited soils 
Transcription factor genes: G419; G545; G561; G1946 






Trait: Increased tolerance to potassium-limited soils 
Transcription factor genes: G561; G911 






Trait category: Hormonal 
Trait: Hormone sensitivity 
Transcription factor genes: G12; G546; G926; G760; G913; G926; G1062; G1069; 
G1095; G1134; G1330; G1452; G1666; G1820; G2140; G2789 
Utility: Seed dormancy, drought tolerance; plant form, fruit ripening 




Trait category: Seed biochemistry 
Trait: Production of seed prenyl lipids, including tocopherol 
Transcription factor genes: G214; G259; G490; G652; G748; G883; G1052; G1328; 
G1930; G2509; G2520 
Utility: Antioxidant activity, vitamin E 




Trait: Production of seed sterols 
Transcription factor genes: G20 
Utility: Precursors for human steroid hormones; cholesterol modulators 




Trait: Production of seed glucosinolates 
Transcription factor genes: G353; G484; G674; G1272; G1506; G1897; G1946; G2113; 
G2117; G2155; G2290; G2340 
Utility: Defense against insects; putative anticancer activity; undesirable in 
animal feeds 




Trait: Modified seed oil content 
Transcription factor genes: G162; G162; G180; G192; G241; G265; G286; G291; 
G427; G509; G519; G561; G567; G590; G818; G849; G892; G961; G974; G1063; G1143; 
G1190; G1198; G1226; G1229; G1323; G1451; G1471; G1478; G1496; G1526; G1543; 
G1640; G1644; G1646; G1672; G1677; G1750; G1765; G1777; G1793; G1838; G1902; 
G1946; G1948; G2059; G2123; G2138; G2139; G2343; G2792; G2830 
Utility: Vegetable oil production; increased caloric value for animal feeds; 
lutein content 




Trait: Modified seed oil composition 
Transcription factor genes: G217; G504; G622; G778; G791; G861; G869; G938; 
G965; G1417; G2192 
Utility: Heat stability, digestibility of seed oils 




Trait: Modified seed protein content 
Transcription factor genes: G162; G226; G241; G371; G427; G509; G567; G597; 
G732; G849; G865; G892; G963; G988; G1323; G1323; G1419; G1478; G1488; G1634; 
G1637; G1641; G1644; G1652; G1677; G1777; G1777; G1818; G1820; G1903; G1909; 
G1946; G1946; G1958; G2059; G2117; G2417; G2509 
Utility: Reduced caloric value for humans 



Trait category: Leaf biochemistry 
Trait: Production of flavonoids 
Transcription factor genes: G1666* 
Utility: Ornamental pigment production; pathogen resistance; health benefits 




Trait: Production of leaf glucosinolates 
Transcription factor genes: G264; G353; G484; G652; G674; G681; G1069; G1198; 
[entries illegible in source]; G1897; G1946; G2115; G2117; G2144; G2155; G2155; 
G2340; G2512; G2520; G2552 
Utility: Defense against insects; putative anticancer activity; undesirable in 
animal feeds 




Trait: Production of diterpenes 
Transcription factor genes: [illegible in source] 
Utility: Production of enzymes involved in alkaloid biosynthesis 




Trait: Production of anthocyanin 
Transcription factor genes: G546 
Utility: Ornamental pigment 




Trait: Production of leaf phytosterols, including stigmastanol, campesterol 
Transcription factor genes: G561; G2131; G2424 
Utility: Precursors for human steroid hormones; cholesterol modulators 




Trait: Leaf fatty acid composition 
Transcription factor genes: [entries illegible in source]; G975; G987; G1266; 
G1337; G1399; G1465; G1512; G2136; [entries illegible in source] 
Utility: Nutritional value, increase in waxes for disease resistance 




Trait: Production of leaf prenyl lipids, including tocopherol 
Transcription factor genes: G214; G259; G280; G652; G987; G1543; G2509; G2520 
Utility: Antioxidant activity, vitamin E 




Trait category: Biochemistry, general 
Trait: Production of miscellaneous secondary metabolites 
Transcription factor genes: G229; G663 






Trait: Sugar, starch, hemicellulose composition 
Transcription factor genes: G158; G211; G211; G237; G242; G274; G598; G1012; 
G1266; G1309; G1309; G1641; G1765; G1865; G2094; G2094; G2589; G2589 
Utility: Food digestibility, hemicellulose & pectin content; fiber content; plant 
tensile strength, wood quality, pathogen resistance, pulp production; tuber 
starch content 




Trait category: Sugar sensing 
Trait: Plant response to sugars 
Transcription factor genes: G26; G38; G43; G207; G218; G241; G254; G263; G308; 
G536; G567; G567; G680; G867; G912; G956; G996; G1068; G1225; G1314; G1314; 
G1337; G1759; G1804; G2153; G2379 
Utility: Photosynthetic rate, carbohydrate accumulation, biomass production, 
source-sink relationships, senescence 




Trait category: Growth, reproduction 
Trait: Plant growth rate and development 
Transcription factor genes: G447; G617; G674; G730; G917; G937; G1035; G1046; 
G1131; G1425; G1452; G1459; G1492; G1589; G1652; G1879; G1943; G2430; G2431; 
G2465; G2521 
Utility: Faster growth, increased biomass or yield, improved appearance; delay 
in bolting 




Trait: Embryo development 
Transcription factor genes: G167 






Trait: Seed germination rate 
Transcription factor genes: G979; G1792; G2130 
Utility: Yield 




Trait: Plant, seedling vigor 
Transcription factor genes: G561; G2346 
Utility: Survivability, yield 




Trait: Senescence; cell death 
Transcription factor genes: G571; G636; G878; G1050; G1463; G1749; G1944; G2130; 
G2155; G2340; G2383 
Utility: Yield, appearance; response to pathogens 




Trait: Modified fertility 
Transcription factor genes: G39; G340; G439; G470; G559; G615; G652; G671; G779; 
G962; G977; G988; G1000; G1063; G1067; G1075; G1266; G1311; G1321; G1326; G1367; 
G1386; G1421; G1453; G1471; G1453; G1560; G1594; G1635; G1750; G1947; G2011; 
G2094; G2113; G2115; G2130; G2143; G2147; G2294; G2510; G2893 
Utility: Prevents or minimizes escape of the pollen of GMOs 






Trait: Early flowering 
Transcription factor genes: G147; G157; G180; G183; G183; G184; G185; G208; 
G227; G294; G390; G390; G390; G391; G391; G427; G427; G490; G565; G590; G592; 
G720; G789; G865; G898; G898; G989; G989; G1037; G1037; G1142; G1225; G1225; 
G1226; G1242; G1305; G1305; G1380; G1380; G1480; G1480; G1488; G1494; G1545; 
G1545; G1649; G1706; G1760; G1767; G1767; G1820; G1841; G1841; G1842; G1843; 
G1843; G1946; G1946; G2010; G2030; [entries illegible in source]; G2295; G2347; 
G2348; G2348; G2373; G2373; G2509; G2509; G2555; G2555 
Utility: Faster generation time; synchrony of flowering; potential for 
introducing new traits to single variety 




Trait: Delayed flowering 
Transcription factor genes: G8; G47; G192; G214; G234; G361; G362; G562; G568; 
G571; G591; G680; G736; G748; G859; G878; G910; G912; G913; G971; G994; G1051; 
G1052; G1073; G1079; G1335; G1435; G1452; G1478; G1789; G1804; G1865; G1865; 
G1895; G1900; G2007; G2133; G2155; G2291; G2465 
Utility: Delayed time to pollen production of GMO plants; synchrony of 
flowering; increased yield 






Trait: Extended flowering phase 
Transcription factor genes: G1947 






Trait: Flower and leaf development 
Transcription factor genes: G259; G353; G377; G580; G638; G652; G858; G869; 
G917; G922; G932; G1063; G1075; G1140; G1425; G1452; G1499; G1548; G1645; G1865; 
G1897; G1933; G2094; G2124; G2140; G2143; G2535; G2557 
Utility: Ornamental applications; decreased fertility 


Trait: Flower abscission 
Transcription factor genes: G1897 
Utility: Ornamental: longer retention of flowers 



* When co-expressed with G669 and G663 



Significance of modified plant traits 

Currently, the existence of a series of maturity groups for different latitudes 
represents a major barrier to the introduction of new valuable traits. Any trait (e.g. 
disease resistance) has to be bred into each of the different maturity groups separately, 
a laborious and costly exercise. The availability of a single strain, which could be 
grown at any latitude, would therefore greatly increase the potential for introducing 
new traits to crop species such as soybean and cotton. 

For many of the traits, listed in Table 6 and below, that may be conferred to 
plants, a single transcription factor gene may be used to increase or decrease, advance 
or delay, or improve or prove deleterious to a given trait. For example, 
overexpression of a transcription factor gene that naturally occurs in a plant may 
cause early flowering relative to non-transformed or wild-type plants. By knocking 
out the gene, or suppressing the gene (with, for example, antisense suppression) the 
plant may experience delayed flowering. Similarly, overexpressing or suppressing 
one or more genes can impart significant differences in production of plant products, 
such as different fatty acid ratios. Thus, suppressing a gene that causes a plant to be 
more sensitive to cold may improve a plant's tolerance of cold. 

Salt stress resistance. Soil salinity is one of the more important variables that 
determine where a plant may thrive. Salinity is especially important for the 
successful cultivation of crop plants, particularly in many parts of the world that have 
naturally high soil salt concentrations, or where the soil has been over-utilized. Thus, 
presently disclosed transcription factor genes that provide increased salt tolerance 
during germination, the seedling stage, and throughout a plant's life cycle would find 
particular value for imparting survivability and yield in areas where a particular crop 
would not normally prosper. 

Osmotic stress resistance. Presently disclosed transcription factor genes that 
confer resistance to osmotic stress may increase germination rate under adverse 
conditions, which could impact survivability and yield of seeds and plants. 

Cold stress resistance. The potential utility of presently disclosed transcription 
factor genes that increase tolerance to cold is to confer better germination and growth 
in cold conditions. The germination of many crops is very sensitive to cold 
temperatures. Genes that would allow germination and seedling vigor in the cold 
would have highly significant utility in allowing seeds to be planted earlier in the 
season with a high rate of survivability. Transcription factor genes that confer better 
survivability in cooler climates allow a grower to move up planting time in the spring 
and extend the growing season further into autumn for higher crop yields. 

Tolerance to freezing. The presently disclosed transcription factor genes that 
impart tolerance to freezing conditions are useful for enhancing the survivability and 
appearance of plants under conditions that would otherwise cause extensive 
cellular damage. Thus, germination of seeds and survival may take place at 
temperatures significantly below the mean temperature required for 
germination of seeds and survival of non-transformed plants. As with salt tolerance, 
this has the added benefit of increasing the potential range of a crop plant into regions 
in which it would otherwise succumb. Cold-tolerant transformed plants may also be 
planted earlier in the spring or later in autumn, with greater success than 
non-transformed plants. 



Heat stress tolerance. The germination of many crops is also sensitive to high 
temperatures. Presently disclosed transcription factor genes that provide increased 
heat tolerance are generally useful in producing plants that germinate and grow in hot 
conditions; they may find particular use for crops that are planted late in the season, or 
may extend the range of a plant by allowing growth in relatively hot climates. 

Drought, low humidity tolerance. Strategies that allow plants to survive in 
low water conditions may include, for example, reduced surface area or surface oil or 
wax production. A number of presently disclosed transcription factor genes increase 
a plant's tolerance to low water conditions and provide the benefits of improved 
survivability, increased yield and an extended geographic and temporal planting 
range. 



Radiation resistance. Presently disclosed transcription factor genes have been 
shown to increase lutein production. Lutein, like other xanthophylls such as 
zeaxanthin and violaxanthin, is important in the protection of plants against the 
damaging effects of excessive light. Lutein contributes, directly or indirectly, to the 
rapid rise of non-photochemical quenching in plants exposed to high light. Increased 
tolerance of field plants to visible and ultraviolet light impacts survivability and vigor, 
particularly for recent transplants. Also affected are the yield and appearance of 
harvested plants or plant parts. Crop plants engineered with presently disclosed 
transcription factor genes that cause the plant to produce higher levels of lutein 
therefore would have improved photoprotection, leading to less oxidative damage and 
increased vigor, survivability and yield under high light and ultraviolet light 
conditions. 



Decreased herbicide sensitivity. Presently disclosed transcription factor genes 
that confer resistance or tolerance to herbicides (e.g., glyphosate) may find use in 
providing means to increase herbicide applications without detriment to desirable 
plants. This would allow for the increased use of a particular herbicide in a local 
environment, with the effect of increased detriment to undesirable species and less 
harm to transgenic, desirable cultivars. 

WO 03/013227 
PCT/US02/25805 

Increased herbicide sensitivity. Knockouts of a number of the presently 
disclosed transcription factor genes have been shown to be lethal to developing 
embryos. Thus, these genes are potentially useful as herbicide targets. 

Oxidative stress. In plants, as in all living things, abiotic and biotic stresses 
induce the formation of oxygen radicals, including superoxide and peroxide radicals. 
This has the effect of accelerating senescence, particularly in leaves, with the resulting 
loss of yield and adverse effect on appearance. Generally, plants that have the highest 
level of defense mechanisms, such as, for example, polyunsaturated moieties of 
membrane lipids, are most likely to thrive under conditions that introduce oxidative 
stress (e.g., high light, ozone, water deficit, particularly in combination). Introduction 
of the presently disclosed transcription factor genes that increase the level of oxidative 
stress defense mechanisms would provide beneficial effects on the yield and 
appearance of plants. One specific oxidizing agent, ozone, has been shown to cause 
significant foliar injury, which impacts yield and appearance of crop and ornamental 
plants. In addition to the reduced foliar injury that would be found in ozone-resistant 
plants created by transforming plants with some of the presently disclosed transcription 
factor genes, such plants have also been shown to have increased chlorophyll 
fluorescence (Yu-Sen Chang et al. Bot. Bull. Acad. Sin. (2001) 42: 265-272). 

Heavy metal tolerance. Heavy metals such as lead, mercury, arsenic, 
chromium and others may have a significant adverse impact on plant respiration. 
Plants that have been transformed with presently disclosed transcription factor genes 
that confer improved resistance to heavy metals, through, for example, sequestering or 
reduced uptake of the metals, will show improved vigor and yield in soils with 
relatively high concentrations of these elements. Conversely, transgenic transcription 
factors may also be introduced into plants to confer an increase in heavy metal uptake, 
which may benefit efforts to clean up contaminated soils. 

Light response. Presently disclosed transcription factor genes that modify a 
plant's response to light may be useful for modifying a plant's growth or 
development, for example, photomorphogenesis in poor light, or accelerating 
flowering time in response to various light intensities, quality or duration to which a 
non-transformed plant would not similarly respond. Examples of such responses that 
have been demonstrated include leaf number and arrangement, and early flower bud 
appearance. 

Overall plant architecture. Several presently disclosed transcription factor 
genes have been introduced into plants to alter numerous aspects of the plant's 
morphology. For example, it has been demonstrated that a number of transcription 
factors may be used to manipulate branching, such as the means to modify lateral 
branching, a possible application in the forestry industry. Transgenic plants have also 
been produced that have altered cell wall content, lignin production, flower organ 
number, or overall shape of the plants. Presently disclosed transcription factor genes 
transformed into plants may be used to affect plant morphology by increasing or 
decreasing internode distance, both of which may be advantageous under different 
circumstances. For example, for fast growth of woody plants to provide more 
biomass, or fewer knots, increased internode distances are generally desirable. For 
improved wind screening of shrubs or trees, or harvesting characteristics of, for 
example, members of the Gramineae family, decreased internode distance may be 
advantageous. These modifications would also prove useful in the ornamental 
horticulture industry for the creation of unique phenotypic characteristics of 
ornamental plants. 

Increased stature. For some ornamental plants, the ability to provide larger 
varieties may be highly desirable. For many plants, including fruit-bearing trees or 
trees and shrubs that serve as view or wind screens, increased stature provides 
obvious benefits. Crop species may also produce higher yields on larger cultivars. 

Reduced stature or dwarfism. Presently disclosed transcription factor genes 
that decrease plant stature can be used to produce plants that are more resistant to 
damage by wind and rain, or more resistant to heat or low humidity or water deficit. 
Dwarf plants are also of significant interest to the ornamental horticulture industry, 
and particularly for home garden applications for which space availability may be 
limited. 

Fruit size and number. Introduction of presently disclosed transcription factor 
genes that affect fruit size will have desirable impacts on fruit size and number, which 
may comprise increases in yield for fruit crops, or reduced fruit yield, such as when 
vegetative growth is preferred (e.g., with bushy ornamentals, or where fruit is 
undesirable, as with ornamental olive trees). 

Flower structure, inflorescence, and development. Presently disclosed 
transgenic transcription factors have been used to create plants with larger flowers or 
arrangements of flowers that are distinct from wild-type or non-transformed cultivars. 
This would likely have the most value for the ornamental horticulture industry, where 
larger flowers or interesting presentations generally are preferred and command the 
highest prices. Flower structure may have advantageous effects on fertility, and could 
be used, for example, to decrease fertility by the absence, reduction or screening of 
reproductive components. One interesting application for manipulation of flower 
structure, for example, by introduced transcription factors could be in the increased 
production of edible flowers or flower parts, including saffron, which is derived from 
the stigmas of Crocus sativus. 

Number and development of trichomes. Several presently disclosed 
transcription factor genes have been used to modify trichome number and amount of 
trichome products in plants. Trichome glands on the surface of many higher plants 
produce and secrete exudates that give protection from the elements and pests such as 
insects, microbes and herbivores. These exudates may physically immobilize insects 
and spores, may be insecticidal or antimicrobial, or they may act as allergens or 
irritants to protect against herbivores. Trichomes have also been suggested to decrease 
transpiration by decreasing leaf surface air flow, and by exuding chemicals that 
protect the leaf from the sun. 

Seed size, color and number. The introduction of presently disclosed 
transcription factor genes into plants that alter the size or number of seeds may have a 
significant impact on yield, both when the product is the seed itself and when biomass 
of the vegetative portion of the plant is increased by reducing seed production. In the 
case of fruit products, it is often advantageous to modify a plant to have reduced size 
or number of seeds relative to non-transformed plants to provide seedless varieties or 
varieties with fewer or smaller seeds. Presently disclosed transcription factor genes 
have also been shown to affect seed size, including the development of larger seeds. 
Seed size, in addition to seed coat integrity, thickness and permeability, seed water 
content and a number of other components including antioxidants and 
oligosaccharides, may affect seed longevity in storage. This would be an important 
utility when the seed of a plant is the harvested crop, as with, for example, peas, 
beans, nuts, etc. Presently disclosed transcription factor genes have also been used to 
modify seed color, which could provide added appeal to a seed product. 

Root development, modifications. By modifying the structure or development 
of roots by transforming into a plant one or more of the presently disclosed 
transcription factor genes, plants may be produced that have the capacity to thrive in 
otherwise unproductive soils. For example, grape roots that extend further into rocky 
soils, or that remain viable in waterlogged soils, would increase the effective planting 
range of the crop. It may be advantageous to manipulate a plant to produce short 
roots, as when a soil in which the plant will be growing is occasionally flooded, or 
when pathogenic fungi or disease-causing nematodes are prevalent. 

Modifications to root hairs. Presently disclosed transcription factor genes that 
increase root hair length or number potentially could be used to increase root growth 
or vigor, which might in turn allow better plant growth under adverse conditions such 
as limited nutrient or water availability. 

Apical dominance. The modified expression of presently disclosed 
transcription factors that control apical dominance could be used in ornamental 
horticulture, for example, to modify plant architecture. 

Branching patterns. Several presently disclosed transcription factor genes have 
been used to manipulate branching, which could provide benefits in the forestry 
industry. For example, reduction in the formation of lateral branches could reduce 
knot formation. Conversely, increasing the number of lateral branches could provide 
utility when a plant is used as a windscreen, or may also provide ornamental 

Leaf shape, color and modifications. It has been demonstrated in laboratory 
experiments that overexpression of some of the presently disclosed transcription 
factors produced marked effects on leaf development. At early stages of growth, these 
transgenic seedlings developed narrow, upward pointing leaves with long petioles, 
possibly indicating a disruption in circadian-clock controlled processes or nyctinastic 
movements. Other transcription factor genes can be used to increase plant biomass; 
large size would be useful in crops where the vegetative portion of the plant is the 
marketable portion. 

Siliques. Genes that alter silique conformation in brassicates may be used to 
modify fruit ripening processes in brassicates and other plants, which may positively 
affect seed or fruit quality. 

Stem morphology and shoot modifications. Laboratory studies have 
demonstrated that introducing several of the presently disclosed transcription factor 
genes into plants can cause stem bifurcations in shoots, in which the shoot meristems 
split to form two or three separate shoots. This unique appearance would be desirable 
in ornamental applications. 

Diseases, pathogens and pests. A number of the presently disclosed 
transcription factor genes have been shown to or are likely to confer resistance to 
various plant diseases, pathogens and pests. The offending organisms include fungal 
pathogens Fusarium oxysporum, Botrytis cinerea, Sclerotinia sclerotiorum, and 
Erysiphe orontii. Bacterial pathogens to which resistance may be conferred include 
Pseudomonas syringae. Other problem organisms may potentially include 
nematodes, mollicutes, parasites, or herbivorous arthropods. In each case, one or 
more transformed transcription factor genes may provide some benefit to the plant to 
help prevent or overcome infestation. The mechanisms by which the transcription 
factors work could include increasing surface waxes or oils, surface thickness, local 
senescence, or the activation of signal transduction pathways that regulate plant 
defense in response to attacks by herbivorous pests (including, for example, protease 
inhibitors). 

Increased tolerance of plants to nutrient-limited soils. Presently disclosed
transcription factor genes introduced into plants may provide the means to improve 
uptake of essential nutrients, including nitrogenous compounds, phosphates, 
potassium, and trace minerals. The effect of these modifications is to increase the 
seedling germination and range of ornamental and crop plants. The utilities of 
presently disclosed transcription factor genes conferring tolerance to conditions of 
low nutrients also include cost savings to the grower by reducing the amounts of
fertilizer needed, environmental benefits of reduced fertilizer runoff, and improved
yield and stress tolerance. In addition, these genes could be used to alter seed protein
amounts and/or composition that could impact yield as well as the nutritional value 
and production of various food products. 

Hormone sensitivity . One or more of the presently disclosed transcription 
factor genes have been shown to affect plant abscisic acid (ABA) sensitivity. This 
plant hormone is likely the most important hormone in mediating the adaptation of a 
plant to stress. For example, ABA mediates conversion of apical meristems into 
dormant buds. In response to increasingly cold conditions, the newly developing 
leaves growing above the meristem become converted into stiff bud scales that closely 
wrap the meristem and protect it from mechanical damage during winter. ABA in the 
bud also enforces dormancy; during premature warm spells, the buds are inhibited 
from sprouting. Bud dormancy is eliminated after either a prolonged period of
cold or a significant number of lengthening days. Thus, by affecting ABA sensitivity,
introduced transcription factor genes may affect cold sensitivity and survivability.
ABA is also important in protecting plants from drought.

Several other of the present transcription factor genes have been used to 
manipulate ethylene signal transduction and response pathways. These genes can thus 
be used to manipulate the processes influenced by ethylene, such as seed germination 
or fruit ripening, and to improve seed or fruit quality. 

Production of seed and leaf prenyl lipids, including tocopherol. Prenyl lipids
play a role in anchoring proteins in membranes or membranous organelles. Thus 
modifying the prenyl lipid content of seeds and leaves could affect membrane 
integrity and function. A number of presently disclosed transcription factor genes 
have been shown to modify the tocopherol composition of plants. Tocopherols have
both anti-oxidant and vitamin E activity. 

Production of seed and leaf phytosterols : Presently disclosed transcription 
factor genes that modify levels of phytosterols in plants may have at least two 
utilities. First, phytosterols are an important source of precursors for the manufacture 
of human steroid hormones. Thus, regulation of transcription factor expression or 
activity could lead to elevated levels of important human steroid precursors for steroid 
semi-synthesis. For example, transcription factors that cause elevated levels of 
campesterol in leaves, or sitosterols and stigmasterols in seed crops, would be useful 
for this purpose. Second, phytosterols and their hydrogenated derivatives, phytostanols,
also have proven cholesterol-lowering properties, and transcription factor genes that
modify the levels of these compounds in plants would thus provide health
benefits.

Production of seed and leaf glucosinolates. Some glucosinolates have anti-cancer
activity; thus, increasing the levels or altering the composition of these compounds by
introducing several of the presently disclosed transcription factors might be of interest
from a nutraceutical standpoint. Glucosinolates also form part of a plant's natural
defense against insects. Modification of glucosinolate composition or quantity could
therefore afford increased protection from predators. Furthermore, in edible crops, 
tissue specific promoters might be used to ensure that these compounds accumulate 
specifically in tissues, such as the epidermis, which are not taken for consumption. 

Modified seed oil content . The composition of seeds, particularly with respect 
to seed oil amounts and/or composition, is very important for the nutritional value and 
production of various food and feed products. Several of the presently disclosed
transcription factor genes that alter seed oil content or seed lipid saturation could be
used to improve the heat stability of oils or to improve the nutritional quality of seed
oil, by, for example, reducing the number of calories in seed, increasing the number of
calories in animal feeds, or altering the ratio of saturated to unsaturated lipids
comprising the oils.

Seed and leaf fatty acid composition. A number of the presently disclosed
transcription factor genes have been shown to alter the fatty acid composition in 
plants, and seeds in particular. This modification may find particular value for 
improving the nutritional value of, for example, seeds or whole plants. Dietary fatty
acid ratios have been shown to have an effect on, for example, bone integrity and
remodeling (see, for example, Weiler, H.A., Pediatr. Res. (2000) 47(5):692-697). The
ratio of dietary fatty acids may alter the precursor pools of long-chain polyunsaturated 
fatty acids that serve as precursors for prostaglandin synthesis. In mammalian 
connective tissue, prostaglandins serve as important signals regulating the balance 
between resorption and formation in bone and cartilage. Thus dietary fatty acid ratios 
altered in seeds may affect the etiology and outcome of bone loss. 

Modified seed protein content . As with seed oils, the composition of seeds, 
particularly with respect to protein amounts and/or composition, is very important for 
the nutritional value and production of various food and feed products. A number of
the presently disclosed transcription factor genes that modify the protein concentrations
in seeds would provide nutritional benefits, and may be used to prolong storage, increase
seed pest or disease resistance, or modify germination rates.

Production of flavonoids in leaves and other plant parts. Expression of
presently disclosed transcription factor genes that increase flavonoid production in
plants, including anthocyanins and condensed tannins, may be used to alter pigment
production for horticultural purposes, and possibly to increase stress resistance.
Flavonoids have antimicrobial activity and could be used to engineer pathogen 
resistance. Several flavonoid compounds have health promoting effects such as the 
inhibition of tumor growth and cancer, prevention of bone loss and the prevention of 
the oxidation of lipids. Increasing levels of condensed tannins, whose biosynthetic 
pathway is shared with anthocyanin biosynthesis, in forage legumes is an important 
agronomic trait because they prevent pasture bloat by collapsing protein foams within 
the rumen. For a review on the utilities of flavonoids and their derivatives, refer to 
Dixon et al. (1999) Trends Plant Sci. 4:394-400. 

Production of diterpenes in leaves and other plant parts. Depending on the
plant species, varying amounts of diverse secondary biochemicals (often lipophilic
terpenes) are produced and exuded or volatilized by trichomes. These exotic
secondary biochemicals, which are relatively easy to extract because they are on the 
surface of the leaf, have been widely used in such products as flavors and aromas, 
drugs, pesticides and cosmetics. Thus, the overexpression of genes that are used to 
produce diterpenes in plants may be accomplished by introducing transcription factor 
genes that induce said overexpression. One class of secondary metabolites, the 
diterpenes, can affect several biological systems such as tumor progression,
prostaglandin synthesis and tissue inflammation. In addition, diterpenes can act as 
insect pheromones, termite allomones, and can exhibit neurotoxic, cytotoxic and 
antimitotic activities. As a result of this functional diversity, diterpenes have been the 
target of research in several pharmaceutical ventures. In most cases where the metabolic
pathways are impossible to engineer, increasing trichome density or size on leaves 
may be the only way to increase plant productivity. 

Production of anthocyanin in leaves and other plant parts. Several presently
disclosed transcription factor genes can be used to alter anthocyanin production in 
numerous plant species. The potential utilities of these genes include alterations in 
pigment production for horticultural purposes, and possibly increasing stress 
resistance in combination with another transcription factor. 

Production of miscellaneous secondary metabolites. Microarray data suggest
that flux through the aromatic amino acid biosynthetic pathways and primary and
secondary metabolite biosynthetic pathways is up-regulated. Presently disclosed
transcription factors have been shown to be involved in regulating alkaloid
biosynthesis, in part by up-regulating the enzymes indole-3-glycerol phosphatase and
strictosidine synthase. Phenylalanine ammonia lyase, chalcone synthase and
trans-cinnamate mono-oxygenase are also induced, and are involved in phenylpropanoid
biosynthesis.

Sugar, starch, hemicellulose composition. Overexpression of the presently
disclosed transcription factors that affect sugar content resulted in plants with altered 
leaf insoluble sugar content. Transcription factors that alter plant cell wall 
composition have several potential applications including altering food digestibility, 
plant tensile strength, wood quality, pathogen resistance and in pulp production. The 
potential utilities of a gene involved in glucose-specific sugar sensing are to alter
energy balance, photosynthetic rate, carbohydrate accumulation, biomass production, 
source-sink relationships, and senescence. 

Hemicellulose is not desirable in paper pulps because of its lack of strength 
compared with cellulose. Thus modulating the amounts of cellulose vs. hemicellulose 
in the plant cell wall is desirable for the paper/lumber industry. Increasing the 
insoluble carbohydrate content in various fruits, vegetables, and other edible 
consumer products will result in enhanced fiber content. Increased fiber content 
would not only provide health benefits in food products, but might also increase 
digestibility of forage crops. In addition, the hemicellulose and pectin content of fruits 
and berries affects the quality of jam and catsup made from them. Changes in 
hemicellulose and pectin content could result in a superior consumer product. 

Plant response to sugars and sugar composition . In addition to their important 
role as an energy source and structural component of the plant cell, sugars are central 
regulatory molecules that control several aspects of plant physiology, metabolism and 
development. It is thought that this control is achieved by regulating gene expression 
and, in higher plants, sugars have been shown to repress or activate plant genes 
involved in many essential processes such as photosynthesis, glyoxylate metabolism, 
respiration, starch and sucrose synthesis and degradation, pathogen response, 
wounding response, cell cycle regulation, pigmentation, flowering and senescence. 
The mechanisms by which sugars control gene expression are not understood. 

Because sugars are important signaling molecules, the ability to control either 
the concentration of a signaling sugar or how the plant perceives or responds to a 
signaling sugar could be used to control plant development, physiology or 
metabolism. For example, the flux of sucrose (a disaccharide sugar used for 
systemically transporting carbon and energy in most plants) has been shown to affect 
gene expression and alter storage compound accumulation in seeds. Manipulation of 
the sucrose signaling pathway in seeds may therefore cause seeds to have more 
protein, oil or carbohydrate, depending on the type of manipulation. Similarly, in 
tubers, sucrose is converted to starch which is used as an energy store. It is thought 
that sugar signaling pathways may partially determine the levels of starch synthesized 
in the tubers. The manipulation of sugar signaling in tubers could lead to tubers with a
higher starch content. 

Thus, the presently disclosed transcription factor genes that manipulate the 
sugar signal transduction pathway may lead to altered gene expression to produce 
plants with desirable traits. In particular, manipulation of sugar signal transduction 
pathways could be used to alter source-sink relationships in seeds, tubers, roots and 
other storage organs leading to an increase in yield.

Plant growth rate and development . A number of the presently disclosed 
transcription factor genes have been shown to have significant effects on plant growth 
rate and development. These observations have included, for example, more rapid or 
delayed growth and development of reproductive organs. This would provide utility 
for regions with short or long growing seasons, respectively. Accelerating plant 
growth would also improve early yield or increase biomass at an earlier stage, when 
such is desirable (for example, in producing forestry products). 

Embryo development. Presently disclosed transcription factor genes that alter
embryo development have been used to alter seed protein and oil amounts and/or
composition, which is very important for the nutritional value and production of
various food products. Seed shape and seed coat may also be altered by these genes, 
which may provide for improved storage stability. 

Seed germination rate . A number of the presently disclosed transcription 
factor genes have been shown to modify seed germination rate, including when the 
seeds are in conditions normally unfavorable for germination (e.g., cold, heat or salt 
stress, or in the presence of ABA), and may thus be used to modify and improve 
germination rates under adverse conditions. 

Plant, seedling vigor. Seedlings transformed with presently disclosed
transcription factors have been shown to possess larger cotyledons and to appear
somewhat more advanced than control plants. This indicates that the seedlings
developed more rapidly than the control plants. Rapid seedling development is likely
to reduce loss due to diseases particularly prevalent at the seedling stage (e.g.,
damping off) and is thus important for survivability of plants germinating in the field
or in controlled environments.



Senescence, cell death . Presently disclosed transcription factor genes may be 
used to alter senescence responses in plants. Although leaf senescence is thought to be 
an evolutionary adaptation to recycle nutrients, the ability to control senescence in an 
agricultural setting has significant value. For example, a delay in leaf senescence in 
some maize hybrids is associated with a significant increase in yields and a delay of a 
few days in the senescence of soybean plants can have a large impact on yield. 
Delayed flower senescence may also generate plants that retain their blossoms longer 
and this may be of potential interest to the ornamental horticulture industry. 

Modified fertility . Plants that overexpress a number of the presently disclosed 
transcription factor genes have been shown to possess reduced fertility. This could 
be a desirable trait, as it could be exploited to prevent or minimize the escape of the 
pollen of genetically modified organisms (GMOs) into the environment. 

Early and delayed flowering. Presently disclosed transcription factor genes
that accelerate flowering could have valuable applications in crop and tree breeding
programs, since they allow much faster generation times. In a number of species, for
example, broccoli and cauliflower, where the reproductive parts of the plants constitute
the crop and the vegetative tissues are discarded, it would be advantageous to accelerate
time to flowering. Additionally, in some instances, a faster generation time might allow
additional harvests of a crop to be made within a given growing season. A number of
Arabidopsis genes have already been shown to accelerate flowering when 
constitutively expressed. These include LEAFY, APETALA1 and CONSTANS 
(Mandel, M. et al., 1995, Nature 377, 522-524; Weigel, D. and Nilsson, O., 1995, 
Nature 377, 495-500; Simon et al., 1996, Nature 384, 59-62). 

By regulating the expression of potential flowering genes using inducible promoters,
flowering could be triggered by application of an inducer chemical. This would allow 
flowering to be synchronized across a crop and facilitate more efficient harvesting. 
Such inducible systems could also be used to time the flowering of crop varieties to 
different latitudes. At present, species such as soybean and cotton are available as a
series of maturity groups that are suitable for different latitudes on the basis of their 
flowering time (which is governed by day-length). A system in which flowering could 
be chemically controlled would allow a single high-yielding northern maturity group 
to be grown at any latitude. In southern regions such plants could be grown for longer, 
thereby increasing yields, before flowering was induced. In more northern areas, the 
induction would be used to ensure that the crop flowers prior to the first winter frosts. 

In a sizeable number of species, for example, root crops, where the vegetative 
parts of the plants constitute the crop and the reproductive tissues are discarded, it 
would be advantageous to delay or prevent flowering. Extending vegetative 
development with presently disclosed transcription factor genes could thus bring 
about large increases in yields. Prevention of flowering might help maximize
vegetative yields and prevent escape of genetically modified organism (GMO) pollen. 

Extended flowering phase . Presently disclosed transcription factors that extend 
flowering time have utility in engineering plants with longer-lasting flowers for the 
horticulture industry, and for extending the time in which the plant is fertile. 

Flower and leaf development . Presently disclosed transcription factor genes 
have been used to modify the development of flowers and leaves. This could be 
advantageous in the development of new ornamental cultivars that present unique 
configurations. In addition, some of these genes have been shown to reduce a plant's 
fertility, which is also useful for helping to prevent development of pollen of GMOs. 

Flower abscission . Presently disclosed transcription factor genes introduced 
into plants have been used to retain flowers for longer periods. This would provide a 
significant benefit to the ornamental industry, for both cut flowers and woody plant 
varieties (of, for example, maize), as well as have the potential to lengthen the fertile 
period of a plant, which could positively impact yield and breeding programs. 

A listing of specific effects and utilities that the presently disclosed
transcription factor genes have on plants, as determined by direct observation and 
assay analysis, is provided in Table 4. 

XVI. Antisense and Co-suppression 

In addition to expression of the nucleic acids of the invention as gene 
replacement or plant phenotype modification nucleic acids, the nucleic acids are also 
useful for sense and anti-sense suppression of expression, e.g., to down-regulate 
expression of a nucleic acid of the invention, e.g., as a further mechanism for 
modulating plant phenotype. That is, the nucleic acids of the invention, or 
subsequences or anti-sense sequences thereof, can be used to block expression of 
naturally occurring homologous nucleic acids. A variety of sense and anti-sense 
technologies are known in the art, e.g., as set forth in Lichtenstein and Nellen (1997)
Antisense Technology: A Practical Approach, IRL Press at Oxford University Press,
Oxford, U.K. In general, sense or anti-sense sequences are introduced into a cell,
where they are optionally amplified, e.g., by transcription. Such sequences include 
both simple oligonucleotide sequences and catalytic sequences such as ribozymes. 

For example, a reduction or elimination of expression (i.e., a "knock-out") of a 
transcription factor or transcription factor homologue polypeptide in a transgenic 
plant, e.g., to modify a plant trait, can be obtained by introducing an antisense construct 
corresponding to the polypeptide of interest as a cDNA. For antisense suppression, the
transcription factor or homologue cDNA is arranged in reverse orientation (with 
respect to the coding sequence) relative to the promoter sequence in the expression 
vector. The introduced sequence need not be the full length cDNA or gene, and need 
not be identical to the cDNA or gene found in the plant type to be transformed. 
Typically, the antisense sequence need only be capable of hybridizing to the target 
gene or RNA of interest. Thus, where the introduced sequence is of shorter length, a 
higher degree of homology to the endogenous transcription factor sequence will be 
needed for effective antisense suppression. While antisense sequences of various 
lengths can be utilized, preferably, the introduced antisense sequence in the vector 
will be at least 30 nucleotides in length, and improved antisense suppression will 
typically be observed as the length of the antisense sequence increases. Preferably, 
the length of the antisense sequence in the vector will be greater than 100 nucleotides. 
Transcription of an antisense construct as described results in the production of RNA
molecules that are the reverse complement of mRNA molecules transcribed from the 
endogenous transcription factor gene in the plant cell. 
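
The orientation and length considerations described above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the function names, the default threshold argument, and the placeholder sequence are hypothetical and are not taken from this disclosure. The sketch selects a cDNA fragment, checks it against the 30-nucleotide minimum mentioned above, and returns the reverse-orientation insert.

```python
# Illustrative sketch only. Function names and the example sequence are
# hypothetical placeholders, not sequences or tools from this disclosure.

# Translation table mapping each DNA base to its complement.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return dna.translate(COMPLEMENT)[::-1]


def antisense_insert(cdna: str, start: int, length: int,
                     min_length: int = 30) -> str:
    """Return cdna[start:start+length] in reverse orientation.

    Placing this reverse-complemented fragment downstream of a promoter
    yields a transcript complementary to the endogenous mRNA.
    min_length defaults to the 30-nucleotide lower bound given in the text.
    """
    fragment = cdna[start:start + length]
    if len(fragment) < min_length:
        raise ValueError(
            f"fragment is {len(fragment)} nt; at least {min_length} nt expected"
        )
    return reverse_complement(fragment)


if __name__ == "__main__":
    placeholder_cdna = "ATG" + "GCTGCAGGT" * 15  # invented 138-nt sequence
    insert = antisense_insert(placeholder_cdna, start=0, length=120)
    print(len(insert), insert[:12])
```

Passing min_length=100 would enforce the longer preferred length instead of the 30-nucleotide minimum.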

Suppression of endogenous transcription factor gene expression can also be 
achieved using a ribozyme. Ribozymes are RNA molecules that possess highly 
specific endoribonuclease activity. The production and use of ribozymes are 
disclosed in U.S. Patent No. 4,987,071 and U.S. Patent No. 5,543,508. Synthetic 
ribozyme sequences including antisense RNAs can be used to confer RNA cleaving 
activity on the antisense RNA, such that endogenous mRNA molecules that hybridize 
to the antisense RNA are cleaved, which in turn leads to an enhanced antisense 
inhibition of endogenous gene expression. 

Suppression of endogenous transcription factor gene expression can also be
achieved using RNA interference, or RNAi. RNAi is a post-transcriptional, targeted
gene-silencing technique that uses double-stranded RNA (dsRNA) to incite
degradation of messenger RNA (mRNA) containing the same sequence as the dsRNA
(Constans (2002) The Scientist 16:36). Small interfering RNAs, or siRNAs, are
produced in at least two steps: an endogenous ribonuclease cleaves longer dsRNA
into shorter, 21-23 nucleotide-long RNAs. The siRNA segments then mediate the
degradation of the target mRNA (Zamore (2001) Nature Struct. Biol. 8:746-750).
RNAi has been used for gene function determination in a manner similar to antisense
oligonucleotides (Constans (2002) The Scientist 16:36). Expression vectors that
continually express siRNAs in transiently and stably transfected cells have been
engineered to express small hairpin RNAs (shRNAs), which are processed in vivo into
siRNA-like molecules capable of carrying out gene-specific silencing (Brummelkamp et al.
(2002) Science 296:550-553, and Paddison et al. (2002) Genes & Dev. 16:948-958).
Post-transcriptional gene silencing by double-stranded RNA is discussed in further
detail by Hammond et al. (2001) Nature Rev. Genet. 2:110-119, Fire et al. (1998) Nature
391:806-811 and Timmons and Fire (1998) Nature 395:854.
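
As a rough illustration of the size range mentioned above, the following Python sketch enumerates every 21- to 23-nucleotide window of a target mRNA together with its complementary strand. It does not model ribonuclease processing or predict silencing efficiency; the function names and the example sequence are hypothetical.

```python
# Illustrative sketch only. Names and the example sequence are hypothetical.
from typing import Iterator, Tuple

# Translation table mapping each RNA base to its complement.
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def complementary_strand(rna: str) -> str:
    """Return the reverse complement of an RNA sequence (A/C/G/U only)."""
    return rna.translate(RNA_COMPLEMENT)[::-1]


def candidate_sirna_windows(mrna: str, size: int = 21) -> Iterator[Tuple[int, str, str]]:
    """Yield (offset, sense window, complementary strand) for each window.

    `size` must lie in the 21-23 nucleotide range given in the text.
    """
    if not 21 <= size <= 23:
        raise ValueError("size must be 21, 22 or 23 nucleotides")
    for offset in range(len(mrna) - size + 1):
        window = mrna[offset:offset + size]
        yield offset, window, complementary_strand(window)


if __name__ == "__main__":
    placeholder_mrna = "AUGGCUGCAGGUACCGAUUCGAAGCUUGG"  # invented sequence
    for offset, sense, antisense in candidate_sirna_windows(placeholder_mrna):
        print(offset, sense, antisense)
```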

Vectors in which RNA encoded by a transcription factor or transcription factor 
homologue cDNA is over-expressed can also be used to obtain co-suppression of a 
corresponding endogenous gene, e.g., in the manner described in U.S. Patent No. 
5,231,020 to Jorgensen. Such co-suppression (also termed sense suppression) does
not require that the entire transcription factor cDNA be introduced into the plant cells, 
nor does it require that the introduced sequence be exactly identical to the endogenous 
transcription factor gene of interest. However, as with antisense suppression, the 
suppressive efficiency will be enhanced as specificity of hybridization is increased, 
e.g., as the introduced sequence is lengthened, and/or as the sequence similarity 
between the introduced sequence and the endogenous transcription factor gene is 
increased. 



Vectors expressing an untranslatable form of the transcription factor mRNA
(e.g., sequences comprising one or more stop codons or nonsense mutations) can also be
used to suppress expression of an endogenous transcription factor, thereby reducing or
eliminating its activity and modifying one or more traits. Methods for producing
such constructs are described in U.S. Patent No. 5,583,021. Preferably, such
constructs are made by introducing a premature stop codon into the transcription
factor gene. Alternatively, a plant trait can be modified by gene silencing using
double-strand RNA (Sharp (1999) Genes and Development 13:139-141). Another
method for abolishing the expression of a gene is by insertion mutagenesis using the
T-DNA of Agrobacterium tumefaciens. After generating the insertion mutants, the
mutants can be screened to identify those containing the insertion in a transcription
factor or transcription factor homologue gene. Plants containing a single transgene
insertion event at the desired gene can be crossed to generate homozygous plants for
the mutation. Such methods are well known to those of skill in the art. (See for
example Koncz et al. (1992) Methods in Arabidopsis Research, World Scientific.)
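
The premature stop codon approach mentioned above can be sketched in a few lines of Python. This is a hedged illustration, not the method of the cited patent: the function names are hypothetical, the coding sequence is an invented placeholder, and the sketch simply replaces one in-frame codon with a stop codon and reports where translation would terminate.

```python
# Illustrative sketch only. Names and the example sequence are hypothetical.
from typing import Optional

STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_stop_index(cds: str) -> Optional[int]:
    """Return the 0-based index of the first in-frame stop codon, or None."""
    for pos in range(0, len(cds) - 2, 3):
        if cds[pos:pos + 3] in STOP_CODONS:
            return pos // 3
    return None


def introduce_premature_stop(cds: str, codon_index: int,
                             stop: str = "TAA") -> str:
    """Replace the codon at codon_index (0-based) with a stop codon.

    The start codon (index 0) and the final codon are left untouched so that
    the edit is a genuinely premature, internal stop.
    """
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    if stop not in STOP_CODONS:
        raise ValueError("stop must be TAA, TAG or TGA")
    n_codons = len(cds) // 3
    if not 0 < codon_index < n_codons - 1:
        raise ValueError("codon_index must be an internal codon")
    pos = 3 * codon_index
    return cds[:pos] + stop + cds[pos + 3:]


if __name__ == "__main__":
    placeholder_cds = "ATGGCTGCAGGTTAA"  # invented: ATG GCT GCA GGT TAA
    mutated = introduce_premature_stop(placeholder_cds, codon_index=2)
    print(mutated, first_stop_index(mutated))
```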

Alternatively, a plant phenotype can be altered by eliminating an endogenous 
gene, such as a transcription factor or transcription factor homologue, e.g., by 
homologous recombination (Kempin et al. (1997) Nature 389:802-803). 

A plant trait can also be modified by using the Cre-lox system (for example, as 
described in US Pat. No. 5,658,772). A plant genome can be modified to include 
first and second lox sites that are then contacted with a Cre recombinase. If the lox 
sites are in the same orientation, the intervening DNA sequence between the two sites 
is excised. If the lox sites are in the opposite orientation, the intervening sequence is
inverted. 
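
The two outcomes described above (excision for lox sites in the same orientation, inversion for lox sites in opposite orientation) can be modeled with a small Python sketch. The token representation ("lox>" and "lox<" for the two orientations, arbitrary strings for other segments) and the function name are assumptions made purely for illustration.

```python
# Illustrative sketch only. The token model and function name are hypothetical.
from typing import List

LOX_TOKENS = ("lox>", "lox<")  # the two possible orientations of a lox site


def cre_recombine(genome: List[str]) -> List[str]:
    """Return the genome after Cre acts on exactly two lox sites.

    Same orientation: the intervening segments are excised together with one
    lox site, leaving a single lox site behind.
    Opposite orientation: the order of the intervening segments is reversed.
    (A real inversion also reverse-complements each segment; this token model
    records only the change in segment order.)
    """
    sites = [i for i, token in enumerate(genome) if token in LOX_TOKENS]
    if len(sites) != 2:
        raise ValueError("expected exactly two lox sites")
    first, second = sites
    if genome[first] == genome[second]:
        # Same orientation: drop everything after the first lox site up to
        # and including the second lox site.
        return genome[:first + 1] + genome[second + 1:]
    # Opposite orientation: keep both lox sites, reverse what lies between.
    between = genome[first + 1:second]
    return genome[:first + 1] + between[::-1] + genome[second:]


if __name__ == "__main__":
    print(cre_recombine(["promoter", "lox>", "marker", "spacer", "lox>", "gene"]))
    print(cre_recombine(["promoter", "lox>", "marker", "spacer", "lox<", "gene"]))
```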

The polynucleotides and polypeptides of this invention can also be expressed 
in a plant in the absence of an expression cassette by manipulating the activity or 
expression level of the endogenous gene by other means. For example, a gene can be
ectopically expressed by T-DNA activation tagging (Ichikawa et al. (1997)
Nature 390:698-701; Kakimoto et al. (1996) Science 274:982-985). This method
entails transforming a plant with a gene tag containing multiple transcriptional
enhancers; once the tag has inserted into the genome, expression of a flanking
gene coding sequence becomes deregulated. In another example, the transcriptional
machinery in a plant can be modified so as to increase transcription levels of a
polynucleotide of the invention (see, e.g., PCT Publications WO 96/06166 and WO
98/53057, which describe the modification of the DNA-binding specificity of zinc
finger proteins by changing particular amino acids in the DNA-binding motif).

The transgenic plant can also include the machinery necessary for expressing 
or altering the activity of a polypeptide encoded by an endogenous gene, for example 
by altering the phosphorylation state of the polypeptide to maintain it in an activated 
state. 

Transgenic plants (or plant cells, or plant explants, or plant tissues) 
incorporating the polynucleotides of the invention and/or expressing the polypeptides 
of the invention can be produced by a variety of well established techniques as 
described above. Following construction of a vector, most typically an expression 
cassette, including a polynucleotide, e.g., encoding a transcription factor or 
transcription factor homologue, of the invention, standard techniques can be used to 
introduce the polynucleotide into a plant, a plant cell, a plant explant or a plant tissue 
of interest. Optionally, the plant cell, explant or tissue can be regenerated to produce 
a transgenic plant. 

The plant can be any higher plant, including gymnosperms,
monocotyledonous and dicotyledonous plants. Suitable protocols are available for
Leguminosae (alfalfa, soybean, clover, etc.), Umbelliferae (carrot, celery, parsnip),
Cruciferae (cabbage, radish, rapeseed, broccoli, etc.), Cucurbitaceae (melons and
cucumber), Gramineae (wheat, corn, rice, barley, millet, etc.), Solanaceae (potato,
tomato, tobacco, peppers, etc.), and various other crops. See protocols described in
Ammirato et al. (1984) Handbook of Plant Cell Culture - Crop Species, Macmillan
Publ. Co.; Shimamoto et al. (1989) Nature 338:274-276; Fromm et al. (1990)
Bio/Technology 8:833-839; and Vasil et al. (1990) Bio/Technology 8:429-434.

Transformation and regeneration of both monocotyledonous and 
dicotyledonous plant cells is now routine, and the selection of the most appropriate 
transformation technique will be determined by the practitioner. The choice of 
method will vary with the type of plant to be transformed; those skilled in the art will 
recognize the suitability of particular methods for given plant types. Suitable methods 
can include, but are not limited to: electroporation of plant protoplasts; liposome- 
mediated transformation; polyethylene glycol (PEG) mediated transformation; 
transformation using viruses; micro-injection of plant cells; micro-projectile 
bombardment of plant cells; vacuum infiltration; and Agrobacterium tumefaciens 
mediated transformation. Transformation means introducing a nucleotide sequence 
into a plant in a manner to cause stable or transient expression of the sequence. 

Successful examples of the modification of plant characteristics by 
transformation with cloned sequences which serve to illustrate the current knowledge 
in this field of technology, and which are herein incorporated by reference, include: 
U.S. Patent Nos. 5,571,706; 5,677,175; 5,510,471; 5,750,386; 5,597,945; 5,589,615;
5,750,871; 5,268,526; 5,780,708; 5,538,880; 5,773,269; 5,736,369 and 5,610,042. 

Following transformation, plants are preferably selected using a dominant 
selectable marker incorporated into the transformation vector. Typically, such a 
marker will confer antibiotic or herbicide resistance on the transformed plants, and 
selection of transformants can be accomplished by exposing the plants to appropriate 
concentrations of the antibiotic or herbicide. 

After transformed plants are selected and grown to maturity, those plants 
showing a modified trait are identified. The modified trait can be any of those traits 
described above. Additionally, to confirm that the modified trait is due to changes in
expression levels or activity of the polypeptide or polynucleotide of the invention, such
changes can be determined by analyzing mRNA expression using Northern blots, RT-PCR or
microarrays, or protein expression using immunoblots, Western blots or gel shift
assays.

XVII. Integrated Systems - Sequence Identity 

Additionally, the present invention may be an integrated system, computer or 
computer readable medium that comprises an instruction set for determining the 
identity of one or more sequences in a database. In addition, the instruction set can be 
used to generate or identify sequences that meet any specified criteria. Furthermore, 
the instruction set may be used to associate or link certain functional benefits, such
as improved characteristics, with one or more identified sequences.

For example, the instruction set can include, e.g., a sequence comparison or 
other alignment program, e.g., an available program such as, for example, the 
Wisconsin Package Version 10.0, such as BLAST, FASTA, PILEUP, 
FINDPATTERNS or the like (GCG, Madison, WI). Public sequence databases such 
as GenBank, EMBL, Swiss-Prot and PIR or private sequence databases such as 
PHYTOSEQ sequence database (Incyte Genomics, Palo Alto, CA) can be searched. 

Alignment of sequences for comparison can be conducted by the local 
homology algorithm of Smith and Waterman (1981) Adv. Appl. Math. 2:482, by the 
homology alignment algorithm of Needleman and Wunsch (1970) J. Mol. Biol. 
48:443-453, by the search for similarity method of Pearson and Lipman (1988) Proc. 
Natl. Acad. Sci. U.S.A. 85:2444-2448, or by computerized implementations of these
algorithms. After alignment, sequence comparisons between two (or more)
polynucleotides or polypeptides are typically performed by comparing the
sequences over a comparison window to identify and compare local regions
of sequence similarity. The comparison window can be a segment of at least about 20 
contiguous positions, usually about 50 to about 200, more usually about 100 to about 
150 contiguous positions. A description of the method is provided in Ausubel et al., 
supra.

A variety of methods for determining sequence relationships can be used,
including manual alignment and computer assisted sequence alignment and analysis.
This latter approach is a preferred approach in the present invention, due to the
increased throughput afforded by computer assisted methods. As noted above, a 
variety of computer programs for performing sequence alignment are available, or can 
be produced by one of skill. 
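As an illustration of the computer assisted approach, the local homology algorithm of Smith and Waterman cited above can be sketched in a few lines of Python. This is a minimal sketch only: the match, mismatch and gap values are assumptions chosen for the example and are not taken from this disclosure, and the function returns only the best local alignment score rather than the alignment itself.

```python
def smith_waterman_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b.

    The scoring values are illustrative assumptions, not values from the text.
    """
    prev = [0] * (len(b) + 1)   # previous row of the dynamic programming table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Score for aligning a[i-1] with b[j-1]
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best
```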



One example algorithm that is suitable for determining percent sequence 
identity and sequence similarity is the BLAST algorithm, which is described in 
Altschul et al. (1990) J. Mol. Biol. 215:403-410. Software for performing BLAST
analyses is publicly available, e.g., through the National Center for Biotechnology 
Information (see internet website at ncbi.nlm.nih.gov). This algorithm involves first 
identifying high scoring sequence pairs (HSPs) by identifying short words of length 
W in the query sequence, which either match or satisfy some positive-valued 
threshold score T when aligned with a word of the same length in a database 
sequence. T is referred to as the neighborhood word score threshold (Altschul et al., 
supra). These initial neighborhood word hits act as seeds for initiating searches to 
find longer HSPs containing them. The word hits are then extended in both directions 
along each sequence for as far as the cumulative alignment score can be increased. 
Cumulative scores are calculated using, for nucleotide sequences, the parameters M 
(reward score for a pair of matching residues; always > 0) and N (penalty score for 
mismatching residues; always < 0). For amino acid sequences, a scoring matrix is 
used to calculate the cumulative score. Extension of the word hits in each direction
is halted when: the cumulative alignment score falls off by the quantity X from its
maximum achieved value; the cumulative score goes to zero or below, due to the 
accumulation of one or more negative-scoring residue alignments; or the end of either 
sequence is reached. The BLAST algorithm parameters W, T, and X determine the 
sensitivity and speed of the alignment. The BLASTN program (for nucleotide 
sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10, a cutoff
of 100, M=5, N=-4, and a comparison of both strands. For amino acid sequences, the
BLASTP program uses as defaults a wordlength (W) of 3, an expectation (E) of 10,
and the BLOSUM62 scoring matrix (see Henikoff & Henikoff (1992) Proc. Natl.
Acad. Sci. USA 89:10915). Unless otherwise indicated, "sequence identity" here
refers to the % sequence identity generated from a tblastx using the NCBI version of
the algorithm at the default settings using gapped alignments with the filter "off" (see,
for example, internet website at ncbi.nlm.nih.gov). 
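The extension step described above can be illustrated with a short Python sketch. It extends an exact nucleotide word hit to the right, without gaps, using the M=5 and N=-4 values quoted above, and stops on any of the three halting conditions. The function name, the X value of 20 and the restriction to one direction are assumptions made for the example; this is not the NCBI implementation.

```python
def extend_right(query, subject, q_start, s_start, word_len,
                 match=5, mismatch=-4, x_drop=20):
    """Extend an exact word hit to the right without gaps.

    Returns (best_score, extension_length), where extension_length counts
    the residues added beyond the seed word at the best-scoring point.
    x_drop is an assumed value for the quantity X.
    """
    score = best = word_len * match       # score of the seed word itself
    best_len = 0
    steps = 0
    i, j = q_start + word_len, s_start + word_len
    while i < len(query) and j < len(subject):   # stop at the end of either sequence
        score += match if query[i] == subject[j] else mismatch
        steps += 1
        if score > best:
            best, best_len = score, steps
        # Stop when the score falls X below its maximum or reaches zero or below
        if best - score >= x_drop or score <= 0:
            break
        i += 1
        j += 1
    return best, best_len
```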

In addition to calculating percent sequence identity, the BLAST algorithm also 
performs a statistical analysis of the similarity between two sequences (see, e.g., 
Karlin & Altschul (1993) Proc. Natl. Acad. Sci. USA 90:5873-5877). One measure
of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), 
which provides an indication of the probability by which a match between two 
nucleotide or amino acid sequences would occur by chance. For example, a nucleic 
acid is considered similar to a reference sequence (and, therefore, in this context, 
homologous) if the smallest sum probability in a comparison of the test nucleic acid to 
the reference nucleic acid is less than about 0.1, or less than about 0.01, or even
less than about 0.001. An additional example of a useful sequence alignment 
algorithm is PILEUP. PILEUP creates a multiple sequence alignment from a group of 
related sequences using progressive, pairwise alignments. The program can align, e.g., 
up to 300 sequences of a maximum length of 5,000 letters. 
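The smallest sum probability thresholds mentioned above (about 0.1, 0.01 and 0.001) can be applied as a simple filter. The following Python sketch is illustrative only; the function name is hypothetical, and the P(N) value is assumed to have been obtained from a BLAST report.

```python
def similarity_level(p_n, thresholds=(0.1, 0.01, 0.001)):
    """Return the most stringent threshold that p_n falls below, or None.

    p_n is the smallest sum probability P(N) reported for a comparison.
    """
    passed = [t for t in thresholds if p_n < t]
    return min(passed) if passed else None
```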

The integrated system or computer typically includes a user input interface
allowing a user to selectively view one or more sequence records corresponding to the
one or more character strings, as well as an instruction set which aligns the one or
more character strings with each other or with an additional character string to
identify one or more regions of sequence similarity. The system may include a link of
one or more character strings with a particular phenotype or gene function. Typically,
the system includes a user readable output element that displays an alignment 
produced by the alignment instruction set. 

The methods of this invention can be implemented in a localized or distributed 
computing environment. In a distributed environment, the methods may be implemented
on a single computer comprising multiple processors or on a multiplicity of 
computers. The computers can be linked, e.g. through a common bus, but more 
preferably the computer(s) are nodes on a network. The network can be a generalized 
or a dedicated local or wide-area network and, in certain preferred embodiments, the 
computers may be components of an intra-net or an internet.

Thus, the invention provides methods for identifying a sequence similar or
homologous to one or more polynucleotides as noted herein, or one or more target
polypeptides encoded by the polynucleotides, or otherwise noted herein and may
include linking or associating a given plant phenotype or gene function with a
sequence. In the methods, a sequence database is provided (locally or across an
internet or intranet) and a query is made against the sequence database using the relevant
sequences herein and associated plant phenotypes or gene functions. 

Any sequence herein can be entered into the database, before or after querying 
the database. This provides for both expansion of the database and, if done before the 
querying step, for insertion of control sequences into the database. The control 
sequences can be detected by the query to ensure the general integrity of both the 
database and the query. As noted, the query can be performed using a web browser 
based interface. For example, the database can be a centralized public database such 
as those noted herein, and the querying can be done from a remote terminal or 
computer across an internet or intranet. 

XVIII. Examples 

The following examples are intended to illustrate but not limit the present 
invention. The complete descriptions of the traits associated with each polynucleotide
of the invention are fully disclosed in Table 4 and Table 6.

Example I: Full Length Gene Identification and Cloning 

Putative transcription factor sequences (genomic or ESTs) related to known 
transcription factors were identified in the Arabidopsis thaliana GenBank database 
using the tblastn sequence analysis program using default parameters and a P-value 
cutoff threshold of -4 or -5 or lower, depending on the length of the query sequence. 
Putative transcription factor sequence hits were then screened to identify those 
containing particular sequence strings. If the sequence hits contained such sequence 
strings, the sequences were confirmed as transcription factors. 
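The two-step screen described above (a P-value cutoff followed by a check for particular sequence strings) can be sketched as follows. This is a hedged illustration: the cutoff of -4 is interpreted here as a log10 exponent (P of 1e-4 or lower), which is an assumption, and the motif used in the usage note is hypothetical rather than a sequence string from this disclosure.

```python
import math
import re


def screen_hits(hits, motif, max_log10_p=-4):
    """Keep hits that pass the P-value cutoff and contain the motif.

    hits: iterable of (name, p_value, protein_sequence) tuples.
    motif: regular expression standing in for a family-specific sequence string.
    max_log10_p: cutoff expressed as a log10 exponent (assumed interpretation).
    """
    pattern = re.compile(motif)
    kept = []
    for name, p_value, seq in hits:
        # A reported P-value of zero is treated as passing any cutoff
        passes_cutoff = p_value <= 0 or math.log10(p_value) <= max_log10_p
        if passes_cutoff and pattern.search(seq):
            kept.append(name)
    return kept
```

For example, given three invented hits with P-values of 1e-6, 1e-2 and 1e-8, of which only the first two contain the hypothetical motif W.PEED, only the first hit is retained.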

Alternatively, Arabidopsis thaliana cDNA libraries derived from different 
tissues or treatments, or genomic libraries were screened to identify novel members of 
a transcription family using a low stringency hybridization approach. Probes were
synthesized using gene specific primers in a standard PCR reaction (annealing
temperature 60° C) and labeled with 32P-dCTP using the High Prime DNA Labeling
Kit (Boehringer Mannheim). Purified radiolabelled probes were added to filters
immersed in Church hybridization medium (0.5 M NaPO4 pH 7.0, 7% SDS, 1% w/v
bovine serum albumin) and hybridized overnight at 60°C with shaking. Filters were
washed two times for 45 to 60 minutes with 1x SSC, 1% SDS at 60° C.

To identify additional sequence 5' or 3' of a partial cDNA sequence in a cDNA
library, 5' and 3' rapid amplification of cDNA ends (RACE) was performed using the
U.C. Marathon cDNA amplification kit (Clontech, Palo Alto, CA). Generally, the 
method entailed first isolating poly(A) mRNA, performing first and second strand 
cDNA synthesis to generate double stranded cDNA, blunting cDNA ends, followed 
by ligation of the U.C. Marathon Adaptor to the cDNA to form a library of adaptor- 
ligated ds cDNA. 

Gene-specific primers were designed to be used along with adaptor specific 
primers for both 5' and 3' RACE reactions. Nested primers, rather than single 
primers, were used to increase PCR specificity. Using 5' and 3' RACE reactions, 5'
and 3' RACE fragments were obtained, sequenced and cloned. The process can be
repeated until the 5' and 3' ends of the full-length gene are identified. The
full-length cDNA was then generated by end-to-end PCR using primers specific to the
5' and 3' ends of the gene.

Example II: Construction of Expression Vectors 

The sequence was amplified from a genomic or cDNA library using primers 
specific to sequences upstream and downstream of the coding region. The expression 
vector was pMEN20 or pMEN65, which are both derived from pMON316 (Sanders et
al. (1987) Nucleic Acids Research 15:1543-1558) and contain the CaMV 35S
promoter to express transgenes. To clone the sequence into the vector, both pMEN20
and the amplified DNA fragment were digested separately with SalI and NotI
restriction enzymes at 37° C for 2 hours. The digestion products were subject to 
electrophoresis in a 0.8% agarose gel and visualized by ethidium bromide staining. 
The DNA fragments containing the sequence and the linearized plasmid were excised 
and purified by using a Qiaquick gel extraction kit (Qiagen, Valencia CA). The
fragments of interest were ligated at a ratio of 3:1 (vector to insert). Ligation
reactions using T4 DNA ligase (New England Biolabs, Beverly MA) were carried out
at 16° C for 16 hours. The ligated DNAs were transformed into competent cells of the
E. coli strain DH5alpha by using the heat shock method. The transformations were
plated on LB plates containing 50 mg/l kanamycin (Sigma, St. Louis, MO).
Individual colonies were grown overnight in five milliliters of LB broth containing 50
mg/l kanamycin at 37° C. Plasmid DNA was purified by using Qiaquick Mini Prep
kits (Qiagen). 

Example III: Transformation of Agrobacterium with the Expression Vector 

After the plasmid vector containing the gene was constructed, the vector was 
used to transform Agrobacterium tumefaciens cells expressing the gene products. The 
stock of Agrobacterium tumefaciens cells for transformation was made as described
by Nagel et al. (1990) FEMS Microbiol. Letts. 67:325-328. Agrobacterium strain
ABI was grown in 250 ml LB medium (Sigma) overnight at 28°C with shaking until
an absorbance (A600) of 0.5 - 1.0 was reached. Cells were harvested by centrifugation
at 4,000 x g for 15 min at 4° C. Cells were then resuspended in 250 µl chilled buffer
(1 mM HEPES, pH adjusted to 7.0 with KOH). Cells were centrifuged again as
described above and resuspended in 125 µl chilled buffer. Cells were then
centrifuged and resuspended two more times in the same HEPES buffer as described
above at a volume of 100 µl and 750 µl, respectively. Resuspended cells were then
distributed into 40 µl aliquots, quickly frozen in liquid nitrogen, and stored at -80° C.

Agrobacterium cells were transformed with plasmids prepared as described 
above following the protocol described by Nagel et al. For each DNA construct to be 
transformed, 50-100 ng DNA (generally resuspended in 10 mM Tris-HCl, 1 mM 
EDTA, pH 8.0) was mixed with 40 µl of Agrobacterium cells. The DNA/cell mixture
was then transferred to a chilled cuvette with a 2 mm electrode gap and subject to a 2.5
kV charge dissipated at 25 µF and 200 µF using a Gene Pulser II apparatus (Bio-Rad,
Hercules, CA). After electroporation, cells were immediately resuspended in 1.0 ml
LB and allowed to recover without antibiotic selection for 2 - 4 hours at 28° C in a
shaking incubator. After recovery, cells were plated onto selective medium of LB
broth containing 100 µg/ml spectinomycin (Sigma) and incubated for 24-48 hours at
28° C. Single colonies were then picked and inoculated in fresh medium. The
presence of the plasmid construct was verified by PCR amplification and sequence
analysis. 

Example IV: Transformation of Arabidopsis Plants with Agrobacterium
tumefaciens with Expression Vector

After transformation of Agrobacterium tumefaciens with plasmid vectors 
containing the gene, single Agrobacterium colonies were identified, propagated, and 
used to transform Arabidopsis plants. Briefly, 500 ml cultures of LB medium 
containing 50 mg/l kanamycin were inoculated with the colonies and grown at 28° C
with shaking for 2 days until an optical absorbance at 600 nm wavelength over 1 cm
(A600) of > 2.0 was reached. Cells were then harvested by centrifugation at 4,000 x g for
10 min, and resuspended in infiltration medium (1/2 X Murashige and Skoog salts
(Sigma), 1 X Gamborg's B-5 vitamins (Sigma), 5.0% (w/v) sucrose (Sigma), 0.044
µM benzylamino purine (Sigma), 200 µl/l Silwet L-77 (Lehle Seeds)) until an A600 of
0.8 was reached. 
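As a rough planning aid for the resuspension step above, the volume of infiltration medium needed to reach the target A600 can be estimated from the absorbance of the original culture. The Python sketch below assumes that absorbance scales linearly with cell density and that no cells are lost during centrifugation; both are simplifying assumptions (absorbance readings above about 1 are often non-linear), so in practice the A600 would still be checked and adjusted. The function name is hypothetical.

```python
def resuspension_volume_ml(culture_ml, culture_a600, target_a600=0.8):
    """Estimate the medium volume (ml) that brings harvested cells to target_a600.

    Assumes absorbance is proportional to cell density and no cell loss.
    """
    if culture_a600 <= 0 or target_a600 <= 0:
        raise ValueError("absorbance values must be positive")
    return culture_ml * culture_a600 / target_a600
```

Under these assumptions, a 500 ml culture at an A600 of 2.0 would be resuspended in about 1250 ml of infiltration medium.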

Prior to transformation, Arabidopsis thaliana seeds (ecotype Columbia) were 
sown at a density of ~10 plants per 4" pot onto Pro-Mix BX potting medium
(Hummert International) covered with fiberglass mesh (18 mm X 16 mm). Plants
were grown under continuous illumination (50-75 µE/m2/sec) at 22-23° C with
65-70% relative humidity. After about 4 weeks, primary inflorescence stems (bolts) were
cut off to encourage growth of multiple secondary bolts. After flowering of the
mature secondary bolts, plants were prepared for transformation by removal of all 
siliques and opened flowers. 

The pots were then immersed upside down in the mixture of Agrobacterium 
infiltration medium as described above for 30 sec, and placed on their sides to allow 
draining into a 1' x 2' flat surface covered with plastic wrap. After 24 h, the plastic
wrap was removed and pots were turned upright. The immersion procedure was
repeated one week later, for a total of two immersions per pot. Seeds were then 
collected from each transformation pot and analyzed following the protocol described 
below.

Example V: Identification of Arabidopsis Primary Transformants

Seeds collected from the transformation pots were sterilized essentially as 
follows. Seeds were dispersed into a solution containing 0.1% (v/v) Triton X-100
(Sigma) and sterile H2O and washed by shaking the suspension for 20 min. The wash
solution was then drained and replaced with fresh wash solution to wash the seeds for 
20 min with shaking. After removal of the second wash solution, a solution 
containing 0.1% (v/v) Triton X-100 and 70% ethanol (Equistar) was added to the 
seeds and the suspension was shaken for 5 min. After removal of the 
ethanol/detergent solution, a solution containing 0.1% (v/v) Triton X-100 and 30% 
(v/v) bleach (Clorox) was added to the seeds, and the suspension was shaken for 10 
min. After removal of the bleach/detergent solution, seeds were then washed five 
times in sterile distilled H2O. The seeds were stored in the last wash water at 4° C for
2 days in the dark before being plated onto antibiotic selection medium (1 X
Murashige and Skoog salts (pH adjusted to 5.7 with 1M KOH), 1 X Gamborg's B-5
vitamins, 0.9% phytagar (Life Technologies), and 50 mg/l kanamycin). Seeds were
germinated under continuous illumination (50-75 µE/m2/sec) at 22-23° C. After 7-10
days of growth under these conditions, kanamycin resistant primary transformants (T1
generation) were visible and obtained. These seedlings were transferred first to fresh
selection plates where the seedlings continued to grow for 3-5 more days, and then to 
soil (Pro-Mix BX potting medium). 

Primary transformants were crossed and progeny seeds (T2) collected;
kanamycin resistant seedlings were selected and analyzed. The expression levels of
the recombinant polynucleotides in the transformants vary from about a 5%
expression level increase to at least a 100% expression level increase. Similar
observations are made with respect to polypeptide level expression. 

Example VI: Identification of Arabidopsis Plants with Transcription Factor Gene 
Knockouts 

The screening of insertion mutagenized Arabidopsis collections for null 
mutants in a known target gene was essentially as described in Krysan et al. (1999)
Plant Cell 11:2283-2290. Briefly, gene-specific primers, nested by 5-250 base pairs
to each other, were designed from the 5' and 3' regions of a known target gene.
Similarly, nested sets of primers were also created specific to each of the T-DNA or
transposon ends (the "right" and "left" borders). All possible combinations of gene
specific and T-DNA/transposon primers were used to detect by PCR an insertion
event within or close to the target gene. The amplified DNA fragments were then
sequenced, which allows the precise determination of the T-DNA/transposon insertion
point relative to the target gene. Insertion events within the coding or intervening 
sequence of the genes were deconvoluted from a pool comprising a plurality of 
insertion events to a single unique mutant plant for functional characterization. The 
method is described in more detail in Yu and Adam, US Application Serial No. 
09/177,733 filed October 23, 1998. 
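The pairing of every gene-specific primer with every T-DNA/transposon border primer, as described above, amounts to a Cartesian product of the two primer sets. The short Python sketch below enumerates the PCR reactions to be set up; the primer names in the usage note are hypothetical placeholders, not primers from this disclosure.

```python
from itertools import product


def primer_pairs(gene_primers, insertion_primers):
    """Return every (gene-specific, T-DNA/transposon) primer combination."""
    return list(product(gene_primers, insertion_primers))
```

For example, two gene-specific primers (hypothetically named "gene_5p" and "gene_3p") combined with left and right border primers ("LB" and "RB") give four reactions per template pool.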

Example VII: Identification of Modified Phenotypes in Overexpression or Gene 
Knockout Plants 

Experiments were performed to identify those transformants or knockouts that 
exhibited modified biochemical characteristics. Among the biochemicals that were 
assayed were insoluble sugars, such as arabinose, fucose, galactose, mannose, 
rhamnose or xylose or the like; prenyl lipids, such as lutein, beta-carotene, 
xanthophyll-1, xanthophyll-2, chlorophylls A or B, or alpha-, delta- or gamma-
tocopherol or the like; fatty acids, such as 16:0 (palmitic acid), 16:1 (palmitoleic
acid), 18:0 (stearic acid), 18:1 (oleic acid), 18:2 (linoleic acid), 20:0, 18:3 (linolenic
acid), 20:1 (eicosenoic acid), 20:2, 22:1 (erucic acid) or the like; waxes, such as by
altering the levels of C29, C31, or C33 alkanes; sterols, such as brassicasterol,
campesterol, stigmasterol, sitosterol or stigmastanol or the like; glucosinolates; and
protein or oil levels.

Fatty acids were measured using two methods depending on whether the tissue 
was from leaves or seeds. For leaves, lipids were extracted and esterified with hot 
methanolic H2SO4 and partitioned into hexane from methanolic brine. For seed fatty
acids, seeds were pulverized and extracted in methanol:heptane:toluene:2,2-
dimethoxypropane:H2SO4 (39:34:20:5:2) for 90 minutes at 80°C. After cooling to
room temperature the upper phase, containing the seed fatty acid esters, was subjected 
to GC analysis. Fatty acid esters from both seed and leaf tissues were analyzed with a 
Supelco SP-2330 column.

Glucosinolates were purified from seeds or leaves by first heating the tissue at
95°C for 10 minutes. Preheated ethanol:water (50:50) was added and, after heating at
95°C for a further 10 minutes, the extraction solvent was applied to a DEAE Sephadex
column which had been previously equilibrated with 0.5 M pyridine acetate.
Desulfoglucosinolates were eluted with 300 µl water and analyzed by reverse phase
HPLC monitoring at 226 nm. 



For wax alkanes, samples were extracted using an identical method as fatty 
acids and extracts were analyzed on a HP 5890 GC coupled with a 5973 MSD. 
Samples were chromatographically isolated on a J&W DB35 mass spectrometer 
(J&W Scientific). 



To measure prenyl lipids levels, seeds or leaves were pulverized with 1 to 2% 
pyrogallol as an antioxidant. For seeds, extracted samples were filtered and a portion 
removed for tocopherol and carotenoid/chlorophyll analysis by HPLC. The 
remaining material was saponified for sterol determination. For leaves, an aliquot 
was removed and diluted with methanol and chlorophyll A, chlorophyll B, and total 
carotenoids measured by spectrophotometry by determining optical absorbance at 
665.2 nm, 652.5 nm, and 470 nm. An aliquot was removed for tocopherol and 
carotenoid/chlorophyll composition by HPLC using a Waters µBondapak C18 column
(4.6 mm x 150 mm). The remaining methanolic solution was saponified with 10% 
KOH at 80°C for one hour. The samples were cooled and diluted with a mixture of 
methanol and water. A solution of 2% methylene chloride in hexane was mixed in 
and the samples were centrifuged. The aqueous methanol phase was again
re-extracted with 2% methylene chloride in hexane and, after centrifugation, the two upper
phases were combined and evaporated. 2% methylene chloride in hexane was added
to the tubes and the samples were then extracted with one ml of water. The upper
phase was removed, dried, and resuspended in 400 µl of 2% methylene chloride in
hexane and analyzed by gas chromatography using a 50 m DB-5ms (0.25 mm ID,
0.25 µm phase, J&W Scientific).

Insoluble sugar levels were measured by the method essentially described by 
Reiter et al. (1997) Plant Journal 12:335-345. This method analyzes the neutral sugar
composition of cell wall polymers found in Arabidopsis leaves. Soluble sugars were
separated from sugar polymers by extracting leaves with hot 70% ethanol. The
remaining residue containing the insoluble polysaccharides was then acid hydrolyzed 
with allose added as an internal standard. Sugar monomers generated by the 
hydrolysis were then reduced to the corresponding alditols by treatment with NaBH4, 
then were acetylated to generate the volatile alditol acetates which were then analyzed 
by GC-FID. Identity of the peaks was determined by comparing the retention times 
of known sugars converted to the corresponding alditol acetates with the retention 
times of peaks from wild-type plant extracts. Alditol acetates were analyzed on a 
Supelco SP-2330 capillary column (30 m x 250 µm x 0.2 µm) using a temperature
program beginning at 180° C for 2 minutes followed by an increase to 220° C in 4
minutes. After holding at 220° C for 10 minutes, the oven temperature was increased to
240° C in 2 minutes and held at this temperature for 10 minutes and brought back to 
room temperature. 
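The oven temperature program described above can be written out as a table of segments, which makes the total run time and the set point at any moment easy to check. The following Python sketch assumes linear ramps between the stated temperatures and leaves out the final return to room temperature, whose duration is not given; the names used are illustrative.

```python
# (start temperature in deg C, end temperature in deg C, duration in minutes)
PROGRAM = [
    (180, 180, 2),    # initial hold
    (180, 220, 4),    # ramp to 220
    (220, 220, 10),   # hold
    (220, 240, 2),    # ramp to 240
    (240, 240, 10),   # final hold
]


def total_minutes(program=PROGRAM):
    """Total programmed run time, excluding cool-down."""
    return sum(duration for _, _, duration in program)


def oven_temperature(t_min, program=PROGRAM):
    """Return the programmed oven temperature (deg C) at t_min minutes."""
    if t_min < 0:
        raise ValueError("time must be non-negative")
    elapsed = 0
    for start, end, duration in program:
        if t_min <= elapsed + duration:
            # Linear interpolation within the current segment (assumed ramp shape)
            fraction = (t_min - elapsed) / duration
            return start + (end - start) * fraction
        elapsed += duration
    return program[-1][1]   # after the program ends, report the final set point
```

With these segments the programmed run lasts 28 minutes, and both ramps correspond to 10° C per minute.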

To identify plants with alterations in total seed oil or protein content, 150mg 
of seeds from T2 progeny plants were subjected to analysis by Near Infrared 
Reflectance Spectroscopy (NIRS) using a Foss NirSystems Model 6500 with a 
spinning cup transport system. NIRS is a non-destructive analytical method used to 
determine seed oil and protein composition. Infrared is the region of the 
electromagnetic spectrum located after the visible region in the direction of longer 
wavelengths. 'Near infrared' owes its name to being the infrared region near to the
visible region of the electromagnetic spectrum. For practical purposes, near infrared 
comprises wavelengths between 800 and 2500 nm. NIRS is applied to organic 
compounds rich in O-H bonds (such as moisture, carbohydrates, and fats), C-H bonds 
(such as organic compounds and petroleum derivatives), and N-H bonds (such as 
proteins and amino acids). The NIRS analytical instruments operate by statistically 
correlating NIRS signals at several wavelengths with the characteristic or property 
intended to be measured. All biological substances contain thousands of C-H, O-H, 
and N-H bonds. Therefore, the exposure to near infrared radiation of a biological 
sample, such as a seed, results in a complex spectrum which contains qualitative and 
quantitative information about the physical and chemical composition of that sample. 

The numerical value of a specific analyte in the sample, such as protein 
content or oil content, is mediated by a calibration approach known as chemometrics.
Chemometrics applies statistical methods such as multiple linear regression (MLR),
partial least squares (PLS), and principal component analysis (PCA) to the spectral
data and correlates them with a physical property or other factor; that property or
factor is directly determined rather than the analyte concentration itself. The method
first provides "wet chemistry" data of the samples required to develop the calibration. 
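The calibration idea can be illustrated with the simplest possible case: a least squares line relating the NIRS signal at a single wavelength to a wet chemistry reference value. The Python sketch below is an assumption-laden illustration, not the calibration used with the instrument; real chemometric calibrations use many wavelengths and methods such as MLR or PLS, and the numbers in the usage note are invented for the example.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict(signal, slope, intercept):
    """Predict the analyte value for a new NIRS signal."""
    return slope * signal + intercept
```

For example, with invented calibration signals of 0.10, 0.20, 0.30 and 0.40 paired with reference oil contents of 30, 32, 34 and 36 percent, the fitted line has a slope of 20 and an intercept of 28, so a new signal of 0.25 is predicted to correspond to 33 percent oil.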

Calibration for Arabidopsis seed oil composition was performed using 
accelerated solvent extraction using 1 g seed sample size and was validated against 
certified canola seed. A similar wet chemistry approach was performed for seed 
protein composition calibration. 

Data obtained from NIRS analysis was analyzed statistically using a nearest-
neighbor (N-N) analysis. The N-N analysis allows removal of within-block spatial
variability in a fairly flexible fashion which does not require prior knowledge of the 
pattern of variability in the chamber. Ideally, all hybrids are grown under identical 
experimental conditions within a block (rep). In reality, even in many block designs, 
significant within-block variability exists. Nearest-neighbor procedures are based on
the assumption that the environmental effect of a plot is closely related to that of its
neighbors. Nearest-neighbor methods use information from adjacent plots to adjust 
for within-block heterogeneity and so provide more precise estimates of treatment 
means and differences. If there is within-plot heterogeneity on a spatial scale that is 
larger than a single plot and smaller than the entire block, then yields from adjacent 
plots will be positively correlated. Information from neighboring plots can be used to 
reduce or remove the unwanted effect of the spatial heterogeneity, and hence improve 
the estimate of the treatment effect. Data from neighboring plots can also be used to 
reduce the influence of competition between adjacent plots. The Papadakis N-N 
analysis can be used with designs to remove within-block variability that would not 
be removed with the standard split plot analysis (Papadakis, 1973, Inst. d'Amelior. 
Plantes Thessaloniki (Greece) Bull. Scientif., No. 23; Papadakis, 1984, Proc. Acad. 
Athens, 59, 326-342). 
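The nearest-neighbor adjustment described above can be sketched for a single row of plots. In this simplified, Papadakis-style Python illustration, each plot's residual from its treatment mean is computed, the mean residual of the adjacent plots is used as a covariate, and each observation is corrected by the fitted slope times that covariate. The single-row layout, the single pass and the function name are assumptions made for the example; the published method and the analysis actually used may differ in detail.

```python
def papadakis_adjust(treatments, values):
    """One pass of a Papadakis-style nearest-neighbor adjustment for a row of plots.

    treatments: treatment label for each plot, in field order.
    values: observed value for each plot, in the same order.
    Returns the list of adjusted values.
    """
    n = len(values)
    if n < 2:
        raise ValueError("at least two plots are needed")
    # Treatment means and residuals from them
    sums, counts = {}, {}
    for t, v in zip(treatments, values):
        sums[t] = sums.get(t, 0.0) + v
        counts[t] = counts.get(t, 0) + 1
    means = {t: sums[t] / counts[t] for t in sums}
    resid = [v - means[t] for t, v in zip(treatments, values)]
    # Covariate: mean residual of the adjacent plots (one neighbor at the row ends)
    covariate = []
    for i in range(n):
        neighbors = [resid[j] for j in (i - 1, i + 1) if 0 <= j < n]
        covariate.append(sum(neighbors) / len(neighbors))
    # Slope of the residuals regressed on the centered covariate
    mean_c = sum(covariate) / n
    sxx = sum((c - mean_c) ** 2 for c in covariate)
    sxy = sum((c - mean_c) * r for c, r in zip(covariate, resid))
    slope = sxy / sxx if sxx else 0.0
    return [v - slope * (c - mean_c) for v, c in zip(values, covariate)]
```

For example, with alternating treatments A and B and values 10, 13, 12, 15, 14, 17 (a treatment effect superimposed on a steady trend along the row), the adjusted values are 12, 14, 13, 14, 13, 15, so the spread within each treatment is reduced.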

Experiments were performed to identify those transformants or knockouts that 
exhibited an improved pathogen tolerance. For such studies, the transformants were 
exposed to biotrophic fungal pathogens, such as Erysiphe orontii, and necrotrophic
fungal pathogens, such as Fusarium oxysporum. Fusarium oxysporum isolates cause
vascular wilts and damping off of various annual vegetables, perennials and weeds 
(Mauch-Mani and Slusarenko (1994) Molecular Plant-Microbe Interactions 7: 378- 
383). For Fusarium oxysporum experiments, plants grown on Petri dishes were 
sprayed with a fresh spore suspension of F. oxysporum. The spore suspension was 
prepared as follows: A plug of fungal hyphae from a plate culture was placed on a 
fresh potato dextrose agar plate and allowed to spread for one week. 5 ml sterile 
water was then added to the plate, swirled, and pipetted into 50 ml Armstrong 
Fusarium medium. Spores were grown overnight in Fusarium medium and then 
sprayed onto plants using a Preval paint sprayer. Plant tissue was harvested and 
frozen in liquid nitrogen 48 hours post infection. 

Erysiphe orontii is a causal agent of powdery mildew. For Erysiphe orontii
experiments, plants were grown approximately 4 weeks in a greenhouse under 12
hour light (20°C, ~30% relative humidity (rh)). Individual leaves were infected with
E. orontii spores from infected plants using a camel's hair brush, and the plants were
transferred to a Percival growth chamber (20°C, 80% rh.). Plant tissue was harvested 
and frozen in liquid nitrogen 7 days post infection. 

Botrytis cinerea is a necrotrophic pathogen. Botrytis cinerea was grown on 
potato dextrose agar in the light. A spore culture was made by spreading 10 ml of 
sterile water on the fungus plate, swirling and transferring spores to 10 ml of sterile 
water. The spore inoculum (approx. 10^5 spores/ml) was used to spray 10 day-old
seedlings grown under sterile conditions on MS (minus sucrose) media. Symptoms 
were evaluated every day up to approximately 1 week. 

Infection with bacterial pathogens Pseudomonas syringae pv maculicola (Psm) 
strain 4326 and pv maculicola strain 4326 was performed by hand inoculation at two 
doses. Two inoculation doses allow the differentiation between plants with enhanced
susceptibility and plants with enhanced resistance to the pathogen. Plants were grown 
for 3 weeks in the greenhouse, then transferred to the growth chamber for the 
remainder of their growth. Psm ES4326 was hand inoculated with 1 ml syringe on 3 
fully-expanded leaves per plant (4 1/2 wk old), using at least 9 plants per 
overexpressing line at two inoculation doses, OD=0.005 and OD=0.0005. Disease
scoring occurred at day 3 post-inoculation with pictures of the plants and leaves taken
in parallel. 



In some instances, expression patterns of the pathogen-induced genes (such as
defense genes) were monitored by microarray experiments. cDNAs were generated by
PCR and resuspended at a final concentration of ~100 ng/µl in 3X SSC or 150 mM
Na-phosphate (Eisen and Brown (1999) Methods Enzymol. 303:179-205). The 
cDNAs were spotted on microscope glass slides coated with polylysine. The prepared 
cDNAs were aliquoted into 384 well plates and spotted on the slides using an x-y-z 
gantry (OmniGrid) purchased from GeneMachines (Menlo Park, CA) outfitted with 
quill type pins purchased from Telechem International (Sunnyvale, CA). After 
spotting, the arrays were cured for a minimum of one week at room temperature, 
rehydrated and blocked following the protocol recommended by Eisen and Brown 
(1999; supra). 

Total RNA samples (10 ug) were labeled using fluorescent Cy3 and
Cy5 dyes. Labeled samples were resuspended in 4X SSC/0.03% SDS/4 ug salmon
sperm DNA/2 ug tRNA/50 mM Na-pyrophosphate, heated at 95°C for 2.5 minutes,
spun down and placed on the array. The array was then covered with a glass
coverslip and placed in a sealed chamber. The chamber was then kept in a water bath
at 62°C overnight. The arrays were washed as described in Eisen and Brown (1999)
and scanned on a General Scanning 3000 laser scanner. The resulting files were
subsequently quantified using Imagene software purchased from BioDiscovery
(Los Angeles, CA).
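
As an illustration only, two-channel array data of this kind are commonly summarized per spot as a log2 ratio of the Cy5 signal to the Cy3 signal. The sketch below assumes simple background subtraction and an intensity floor; the function name, the floor value, and the spot intensities are hypothetical and are not taken from the Imagene software or from the protocol described above.

```python
import math


def log2_ratio(cy5, cy3, cy5_bg=0.0, cy3_bg=0.0, floor=1.0):
    """Return log2 of background-corrected Cy5 over Cy3 for one spot.

    Corrected intensities are floored at `floor` (an assumed value) so that
    signals at or below background do not produce a math domain error.
    """
    red = max(cy5 - cy5_bg, floor)
    green = max(cy3 - cy3_bg, floor)
    return math.log2(red / green)


# Hypothetical spot intensities as (Cy5, Cy3) pairs.
spots = {"spot_A": (4000.0, 1000.0), "spot_B": (500.0, 2000.0)}
ratios = {name: log2_ratio(cy5, cy3) for name, (cy5, cy3) in spots.items()}
print(ratios)
```

A positive value indicates more signal in the Cy5-labeled sample and a negative value more signal in the Cy3-labeled sample; which sample carries which dye is an experimental choice not specified here.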

Experiments were performed to identify those transformants or knockouts that 
exhibited an improved environmental stress tolerance. For such studies, the 
transformants were exposed to a variety of environmental stresses. Plants were 
exposed to chilling stress (6 hour exposure to 4-8°C), heat stress (6 hour exposure to
32-37°C), high salt stress (6 hour exposure to 200 mM NaCl), drought stress (168
hours after removing water from trays), osmotic stress (6 hour exposure to 3 M
mannitol), or nutrient limitation (nitrogen, phosphate, and potassium). For nitrogen
limitation, all components of MS medium remained constant except N, which was
reduced to 20 mg/l of NH4NO3. For phosphate limitation, all components of MS
medium remained constant except KH2PO4, which was replaced by K2SO4. For
potassium limitation, all components of MS medium remained constant except KNO3
and KH2PO4, which were removed and replaced by NaH4PO4.

Experiments were performed to identify those transformants or knockouts that 
exhibited modified structural and developmental characteristics. For such studies, the
transformants were observed by eye to identify novel structural or developmental 
characteristics associated with the ectopic expression of the polynucleotides or 
polypeptides of the invention. 

Experiments were performed to identify those transformants or knockouts that 
exhibited modified sugar-sensing. For such studies, seeds from transformants were 
germinated on media containing 5% glucose or 9.4% sucrose which normally partially 
restrict hypocotyl elongation. Plants with altered sugar sensing may have either 
longer or shorter hypocotyls than normal plants when grown on these media.
Additionally, other plant traits may be varied such as root mass. 

Flowering time was measured by the number of rosette leaves present when a
visible inflorescence of approximately 3 cm was apparent. Rosette and total leaf number
on the progeny stem are tightly correlated with the timing of flowering (Koornneef et
al. (1991) Mol. Gen. Genet. 229:57-66). The vernalization response was measured. For
vernalization treatments, seeds were sown to MS agar plates, sealed with micropore 
tape, and placed in a 4°C cold room with low light levels for 6-8 weeks. The plates 
were then transferred to the growth rooms alongside plates containing freshly sown 
non-vernalized controls. Rosette leaves were counted when a visible inflorescence of 
approximately 3 cm was apparent. 
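
As an illustration only, leaf-count data of this kind can be compared between a transgenic line and its controls as a difference in mean rosette leaf number at the scoring point. The sketch below is a minimal example under stated assumptions: the function names, the two-leaf threshold, and the counts are hypothetical and are not the criteria used in the experiments described here.

```python
from statistics import mean


def leaf_count_shift(transgenic_counts, control_counts):
    """Mean rosette leaf number of transgenic plants minus that of controls.

    More leaves at bolting corresponds to later flowering, so a positive
    shift suggests a delay and a negative shift suggests acceleration.
    """
    return mean(transgenic_counts) - mean(control_counts)


def classify(shift, threshold=2.0):
    """Label a shift as 'late', 'early', or 'similar' (threshold is assumed)."""
    if shift >= threshold:
        return "late"
    if shift <= -threshold:
        return "early"
    return "similar"


# Hypothetical rosette leaf counts scored at a ~3 cm inflorescence.
shift = leaf_count_shift([14, 15, 16], [10, 11, 12])
print(shift, classify(shift))
```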

Modified phenotypes observed for particular overexpressor or knockout plants 
are provided in Table 4. For a particular overexpressor that shows a less beneficial 
characteristic, it may be more useful to select a plant with a decreased expression of 
the particular transcription factor. For a particular knockout that shows a less 
beneficial characteristic, it may be more useful to select a plant with an increased 
expression of the particular transcription factor. 



WO 03/013227 



PCT/US02/25805 



The sequences of the Sequence Listing or those in Tables 4 and 5, or those
disclosed here, can be used to prepare transgenic plants and plants with altered traits.
The specific transgenic plants listed below are produced from the sequences of the 
Sequence Listing, as noted. Table 4 provides exemplary polynucleotide and 
polypeptide sequences of the invention. Table 4 includes, from left to right for each 
sequence: the first column shows the polynucleotide SEQ ID NO; the second column 
shows the Mendel Gene ID No., GID; the third column shows the trait(s) resulting 
from the knock out or overexpression of the polynucleotide in the transgenic plant; 
the fourth column shows the category of the trait; the fifth column shows the 
transcription factor family to which the polynucleotide belongs; the sixth column 
("Comment"), includes specific effects and utilities conferred by the polynucleotide 
of the first column; the seventh column shows the SEQ ID NO of the polypeptide 
encoded by the polynucleotide; and the eighth column shows the amino acid residue 
positions of the conserved domain in amino acid (AA) co-ordinates. 
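
As an illustration only, the eight-column layout described for Table 4 can be represented as one record per sequence. The class and field names below are hypothetical. The example row uses the G192 identifier, SEQ ID NO, and WRKY family mentioned in this disclosure; the trait category and comment strings are placeholders, and fields not stated in the text are left unset rather than filled in.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Table4Row:
    """One row of Table 4, with fields in left-to-right column order."""
    polynucleotide_seq_id: int              # column 1: polynucleotide SEQ ID NO
    gid: str                                # column 2: Mendel Gene ID No. (GID)
    traits: Tuple[str, ...]                 # column 3: trait(s) observed
    category: str                           # column 4: trait category
    family: str                             # column 5: transcription factor family
    comment: str                            # column 6: effects and utilities
    polypeptide_seq_id: Optional[int] = None  # column 7: polypeptide SEQ ID NO
    conserved_domain: Optional[str] = None    # column 8: AA coordinates


# Example row; category and comment are placeholders for illustration.
row = Table4Row(
    polynucleotide_seq_id=859,
    gid="G192",
    traits=("late flowering",),
    category="placeholder category",
    family="WRKY",
    comment="placeholder comment",
)
print(row.gid, row.polynucleotide_seq_id, row.family)
```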

Seed of plants overexpressing sequences G265 (SEQ ID NOs:871 and 872), 
G715 (SEQ ID NOs:925 and 926), G1471 (SEQ ID NOs:311 and 312), G1793 (SEQ
ID NOs:365 and 366), G1838 (SEQ ID NOs:381 and 382), G1902 (SEQ ID NOs:405 
and 406), G286 (SEQ ID NOs:877 and 878), G2138 (SEQ ID NOs:865 and 866) and 
G2830 (SEQ ID NOs:875 and 876) was subjected to NIR analysis and a significant 
increase in seed oil content compared with seed from control plants was identified. 

G192: G192 (SEQ ID NO: 859) was expressed in all plant tissues and under 
all conditions examined. Its expression was slightly induced upon infection by 
Fusarium. G192 was analyzed using transgenic plants in which this gene was 
expressed under the control of the 35S promoter. G192 overexpressors were late 
flowering under 12 hour light and had more leaves than control plants. This 
phenotype was manifested in the three T2 lines analyzed. Results of one experiment
suggest that the G192 overexpressor was more susceptible to infection with a moderate
dose of the fungal pathogen Erysiphe orontii. The decrease in seed oil observed for
one line was replicated in an independent experiment. G192 overexpression delayed 
flowering. A wide variety of applications exist for systems that either lengthen or 
shorten the time to flowering, or for systems of inducible flowering time control. In 
particular, in species where the vegetative parts of the plants constitute the crop and
the reproductive tissues are discarded, it will be advantageous to delay or prevent
flowering. Extending vegetative development can bring about large increases in
yields. G192 can be used to manipulate the defense response in order to generate 
pathogen-resistant plants. G192 can be used to manipulate seed oil content, which 
can be of nutritional value. 

Closely Related Genes from Other Species 

G192 had some similarity within the conserved WRKY domain to
non-Arabidopsis plant proteins.

G1946: G1946 (SEQ ID NO: 801) was studied using transgenic plants in 
which the gene was expressed under the control of the 35S promoter. 
Overexpression of G1946 resulted in accelerated flowering, with 35S::G1946 
transformants producing flower buds up to a week earlier than wild-type controls
(24-hour light conditions). These effects were seen in 12/20 primary transformants and in
two independent plantings of each of the three T2 lines. Unlike many early flowering
Arabidopsis transgenic lines, which are dwarfed, 35S::G1946 transformants often
reached full size at maturity, and produced large quantities of seeds, although the
plants were slightly pale in coloration and had slightly flat leaves compared to
wild-type. In addition, 35S::G1946 plants showed an altered response to phosphate
deprivation. Seedlings of G1946 overexpressor plants showed more secondary root
growth on phosphate-free media, when compared to wild-type control. In a repeat
experiment, all three lines showed the phenotype. Overexpression of G1946 in
Arabidopsis also resulted in an increase in seed glucosinolate M39501 in T2 lines
1 and 3. An increase in seed oil and a decrease in seed protein were also observed in
these two lines. G1946 was ubiquitously expressed, and does not appear to be 
significantly induced or repressed by any of the biotic and abiotic stress conditions 
tested at this time, with the exception of cold, which repressed G1946 expression. 
G1946 can be used to modify flowering time, as well as to improve the plant's 
performance in conditions of limited phosphate, and to alter seed oil, protein, and 
glucosinolate composition. 



Closely Related Genes from Other Species 

A comparison of the amino acid sequence of G1946 with sequences available 
from GenBank showed strong similarity with plant HSFs of several species 
(Lycopersicon peruvianum, Medicago truncatula, Lycopersicon esculentum, Glycine 
max, Solanum tuberosum, Oryza sativa and Hordeum vulgare subsp. vulgare). 

G375: The sequence of G375 (SEQ ID NO: 239) was experimentally
determined and G375 was analyzed using transgenic plants in which G375 was 
expressed under the control of the 35S promoter. Overexpression of G375 produced 
marked effects on leaf development. At early stages of growth, 35S::G375 seedlings 
developed narrow, upward pointing leaves with long petioles (possibly indicating a 
disruption in circadian-clock controlled processes or nyctinastic movements). 
Additionally, some seedlings were noted to have elongated hypocotyls, and some 
were rather small compared to wild-type controls. Comparable phenotypes were 
obtained by overexpression of an AP2 family gene, G2113 (SEQ ID NO: 85).
Following the switch to flowering, 35S::G375 plants showed reduced fertility, which
possibly arose from a failure of stamens to fully elongate. One of the three T2 lines
(#41) was later flowering than wild-type controls, and also developed large numbers
of small secondary rosette leaves in the axils of the primary rosette. Although these
effects were not noted in the other two lines, the phenotypes obtained in line 41 were
somewhat similar to those produced by overexpression of another Z-dof gene, G736
(SEQ ID NO: 211). G375 was expressed in all tissues, although at different levels. It
was expressed at low levels in the root and germinating seed, and expressed at high 
levels in the embryo. The effects of G375 on leaf architecture are of potential interest 
to the ornamental horticulture industry. 

Closely Related Genes from Other Species 

G375 showed some homology to non-Arabidopsis plant proteins within the 
conserved Dof domain. 

G1255: The sequence of G1255 (SEQ ID NO: 273) was experimentally 
determined and G1255 was analyzed using transgenic plants in which G1255 was 
expressed under the control of the 35S promoter. Plants overexpressing G1255 had
alterations in leaf architecture, a reduction in apical dominance, an increase in seed
size, and showed more disease symptoms following inoculation with a low dose of the
fungal pathogen Botrytis cinerea. G1255 was constitutively expressed and not 
significantly induced by any conditions tested. On the basis of the phenotypes 
produced by overexpression of G1255, G1255 can be used to manipulate the plant's 
defense response to produce pathogen resistance, alter plant architecture, or alter seed 
size. 

Closely Related Genes from Other Species 

G1255 showed strong homology to a putative rice zinc finger protein
represented by sequence AC087181_3. Sequence identity between these two proteins
extended beyond the conserved domain, and therefore, these genes can be orthologs.

G865: The complete cDNA sequence of G865 (SEQ ID NO: 557) was 
determined. G865 was ubiquitously expressed in Arabidopsis tissues. G865 was 
analyzed using transgenic plants in which G865 was expressed under the control of 
the 35S promoter. Plants overexpressing G865 were early flowering, with numerous 
secondary inflorescence meristems giving them a bushy appearance. G865 
overexpressors were more susceptible to infection with a moderate dose of the fungal 
pathogens Erysiphe orontii and Botrytis cinerea. In addition, seeds from G865 
overexpressing plants showed a trend of increased protein and reduced oil content, 
although the observed changes were not beyond the criteria used for judging
significance except in one line. G865 can be used to control flowering time. G865 
can be used to manipulate the defense response in order to generate pathogen-resistant 
plants. G865 can be used to alter seed oil and protein content of a plant. 

Closely Related Genes from Other Species 

G865 and other non-Arabidopsis AP2/EREBP proteins were similar within the
conserved AP2 domain. 

G2509: G2509 (SEQ ID NO: 23) was studied using transgenic plants in 
which the gene was expressed under the control of the 35S promoter. Overexpression 
of G2509 caused multiple alterations in plant growth and development, most notably, 
altered branching patterns, and a reduction in apical dominance, giving the plants a
shorter, more bushy stature than wild type. Twenty 35S::G2509 primary
transformants were examined; at early stages of rosette development, these plants
displayed a wild-type phenotype. However, at the switch to flowering, almost all T1
lines showed a marked loss of apical dominance and large numbers of secondary
shoots developed from axils of primary rosette leaves. In the most extreme cases, the
shoots had very short internodes, giving the inflorescence a very bushy appearance.
Such shoots were often very thin and flowers were relatively small and poorly fertile. 
At later stages, many plants appeared very small and had a low seed yield compared 
to wild type. In addition to the effects on branching, a substantial number of 
35S::G2509 primary transformants also flowered early and had buds visible several 
days prior to wild type. Similar effects on inflorescence development were noted in 
each of three T2 populations examined. The branching and plant architecture 
phenotypes observed in 35S::G2509 lines resemble phenotypes observed for three 
other AP2/EREBP genes: G865 (SEQ ID NO: 557), G1411 (SEQ ID NO: 3), and
G1794 (SEQ ID NO: 13). G2509, G865, and G1411 form a small clade within the
large AP2/EREBP family, and G1794, although not belonging to the clade, is one of 
the AP2/EREBP genes closest to it in the phylogenetic tree. It is thus likely that all 
these genes share a related function, such as affecting hormone balance. 
Overexpression of G2509 in Arabidopsis resulted in an increase in alpha-tocopherol 
in seeds in T2 lines 5 and 11. G2509 was ubiquitously expressed in Arabidopsis plant
tissue. G2509 expression levels were altered by a variety of environmental or 
physiological conditions. G2509 can be used to manipulate plant architecture and 
development. G2509 can be used to alter tocopherol composition. Tocopherols have 
antioxidant and vitamin E activity. G2509 can be useful in altering flowering time. 
A wide variety of applications exist for systems that either lengthen or shorten the 
time to flowering. 

Closely Related Genes from Other Species

G2509 showed some sequence similarity with known genes from other plant 
species within the conserved AP2/EREBP domain. 

G2347: G2347 (SEQ ID NO: 1119) was analyzed using transgenic plants in
which G2347 was expressed under the control of the 35S promoter. Overexpression
of G2347 markedly reduced the time to flowering in Arabidopsis. This phenotype
was apparent in the majority of primary transformants and in all plants from two out
of the three T2 lines examined. Under continuous light conditions, 35S::G2347 plants
formed flower buds up to a week earlier than wild type. Many of the plants were rather
small and spindly compared to controls. To demonstrate that overexpression of 
G2347 could induce flowering under less inductive photoperiods, two T2 lines were 
re-grown in 12 hour conditions; again, all plants from both lines bolted early, with 
some initiating flower buds up to two weeks sooner than wild-type. As determined by 
RT-PCR, G2347 was highly expressed in rosette leaves and flowers, and to much 
lower levels in embryos and siliques. No expression of G2347 was detected in the 
other tissues tested. G2347 expression was repressed by cold, by auxin
treatments, and by infection by Erysiphe. G2347 is also highly similar to the
Arabidopsis protein G2010 (SEQ ID NO: 1121). The level of homology between
these two proteins suggested they could have similar, overlapping, or redundant 
functions in Arabidopsis. In support of this hypothesis, overexpression of both G2010 
and G2347 resulted in early flowering phenotypes in transgenic plants. 

Closely Related Genes from Other Species 

The closest relative to G2347 is the Antirrhinum protein, SBP2 (CAA63061). 
The similarity between these two proteins is extensive enough to suggest they might 
have similar functions in a plant. 

G988: G988 (SEQ ID NO: 43) was analyzed using transgenic plants in which 
G988 was expressed under the control of the 35S promoter. Plants overexpressing 
G988 had multiple morphological phenotypes. The transgenic plants were generally 
smaller than wild-type plants, had altered leaf, inflorescence and flower development, 
altered plant architecture, and altered vasculature. In one transgenic line 
overexpressing G988 (line 23), an increase in the seed glucosinolate M39489 was 
observed. The phenotype of plants overexpressing G988 was wild-type in all other 
assays performed. In wild-type plants, G988 was expressed primarily in flower and 
silique tissue, but was also present at detectable levels in all other tissues tested. 
Expression of G988 was induced in response to heat treatment, and repressed in 
response to infection with Erysiphe. Based on the observed morphological 
phenotypes of the transgenic plants, G988 can be used to create plants with larger 
flowers. This can have value in the ornamental horticulture industry. The reduction
in the formation of lateral branches suggests that G988 can have utility in the forestry
industry. The Arabidopsis plants overexpressing G988 also had reduced fertility.
This can be a desirable trait in some instances, as it can be exploited to prevent or 
minimize the escape of GMO (genetically modified organism) pollen into the 
environment. 

Closely Related Genes from Other Species 

The amino acid sequence for the Capsella rubella hypothetical protein 
represented by GenBank accession number CRU303349 was significantly identical to 
G988 outside of the SCR conserved domains. The Capsella rubella hypothetical 
protein is 90% identical to G988 over a stretch of roughly 450 amino acids. 
Therefore, it is likely that the Capsella rubella gene is an ortholog of G988. 
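
As an illustration only, a percent identity figure over an aligned stretch can be computed by counting identical residues at aligned positions. Conventions differ in how gaps and the denominator are handled, and the convention used for the figure above is not stated; the sketch below assumes gapped columns are excluded. The function name and the short sequences are hypothetical and are not the G988 or Capsella rubella sequences.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of identical residues over aligned, ungapped columns.

    Both inputs must be rows of the same pairwise alignment, with '-' for
    gaps. Columns containing a gap in either row are skipped (an assumed
    convention; other definitions use a different denominator).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


# Hypothetical 10-residue aligned fragments differing at one position.
identity = percent_identity("MKTAYIAKQR", "MKTAYLAKQR")
print(identity)
```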

G2346: G2346 (SEQ ID NO: 459) was analyzed using transgenic plants in 
which the gene was expressed under the control of the 35S promoter. 35S::G2346 
seedlings from all three T2 populations had slightly larger cotyledons and appeared 
somewhat more advanced than controls. This indicated that the seedlings developed 
more rapidly than the control plants. At later stages, however, G2346 overexpressing
plants showed no consistent differences from control plants. The phenotype of these 
transgenic plants was wild-type in all other assays performed. According to RT-PCR 
analysis, G2346 is expressed ubiquitously. 

Closely Related Genes from Other Species 

G2346 shows some sequence similarity with known genes from other plant 
species within the conserved SBP domain. 

G1354: The complete sequence of G1354 (SEQ ID NO: 285) was determined.
G1354 was analyzed using transgenic plants in which G1354 was expressed under the 
control of the 35S promoter. Overexpression of G1354 produced highly deleterious 
effects on growth and development. Only three 35S::G1354 T1 plants were obtained;
all were extremely tiny and slow developing. After three weeks of growth, each of
the plants comprised a completely disorganized mass of leaves and root that had no
clear axis of growth. Since these individuals would not have survived transplantation
to soil, they were harvested for RT-PCR analysis; all three plants showed moderate
levels of G1354 overexpression compared to whole wild-type seedlings of an
equivalent size. Only a very small number of transformants were obtained from two
selection attempts on separate batches of T0 seed. Usually between 15 and 120
transformants are obtained from each aliquot of 300 mg T0 seed from wild-type
plants. The low transformation frequency obtained in this experiment suggests that
high levels of G1354 overexpression might have completely lethal effects and prevent
transformed seeds from germinating. As determined by RT-PCR, G1354 was
uniformly expressed in all tissues and under all conditions tested.
However, the gene was repressed in leaf tissue in response to Erysiphe infection. 

Closely Related Genes from Other Species 

G1354 is closely related to a NAM protein encoded by a polynucleotide from
rice (AC005310). Similarity between G1354 and this rice protein extends beyond the 
signature motif of the family to a level that would suggest the genes are orthologs. 

G1063: G1063 (SEQ ID NO: 119) is a member of a clade of highly related
HLH/MYC proteins that also includes G779 (SEQ ID NO: 113), G1499 (SEQ ID NO:
7), G2143 (SEQ ID NO: 129), and G2557 (SEQ ID NO: 133). All of these genes 
caused similar pleiotropic phenotypic effects when overexpressed, the most striking 
of which was the production of ectopic carpelloid tissue. These genes can be 
considered key regulators of carpel development. A spectrum of developmental 
alterations was observed amongst 35S::G1063 primary transformants and the majority 
were markedly small, dark green, and had narrow curled leaves. The most severely 
affected individuals were completely sterile and formed highly abnormal 
inflorescences; shoots often terminated in pin-like structures, and flowers were 
replaced by filamentous carpelloid structures. In other cases, flowers showed 
internode elongation between floral whorls, with a central carpel protruding on a 
pedicel-like organ. Additionally, lateral branches sometimes failed to develop and 
tiny patches of carpelloid tissue formed at axillary nodes of the inflorescence. In lines 
with an intermediate phenotype, flowers contained defined whorls of organs, but 
sepals were converted to carpelloid structures or displayed patches of carpelloid 
tissue. In contrast, lines with a weak phenotype developed relatively normal flowers 
and produced a reasonable quantity of seed. Such plants were still distinctly smaller 
than wild-type controls. Since the strongest 35S::G1063 lines were sterile, three lines
with a relatively weak phenotype, that had produced sufficient seed for biochemical
and physiological analysis, were selected for further study. Two of the T2
populations (T2-28 and T2-37) were clearly small, darker green and possessed narrow leaves
compared to wild type. Plants from one of these populations (T2-28) also produced
occasional branches with abnormal flowers like those seen in the T1. The final T2
population (T2-30) displayed a very mild phenotype. Overexpression of G1063 in 
Arabidopsis resulted in a decrease in seed oil content in T2 lines 28 and 37. No 
altered phenotypes were detected in any of the physiological assays, except that the 
plants were noted to be somewhat small and produce anthocyanin when grown in 
Petri plates. G1063 was expressed at low to moderate levels in roots, flowers, rosette 
leaves, embryos, and germinating seeds, but was not detected in shoots or siliques. It 
was induced by auxin. G1063 can be used to manipulate flower form and structure or 
plant fertility. One application for manipulation of flower structure can be in the 
production of saffron, which is derived from the stigmas of Crocus sativus. G1063 
has utility in manipulating seed oil and protein content. 

Closely Related Genes from Other Species 

G1063 protein shared extensive homology in the basic helix loop helix region
with a protein sequence encoded by a Glycine max cDNA clone (AW832545) as well
as a tomato root, plants pre-anthesis Lycopersicon esculentum cDNA (BE451174).

G2143: G2143 (SEQ ID NO: 129) is a member of a clade of highly related 
HLH/MYC proteins that also includes G779 (SEQ ID NO: 113), G1063 (SEQ ID NO: 
119), G1499 (SEQ ID NO: 7), and G2557 (SEQ ID NO: 133). All of these genes 
caused similar pleiotropic phenotypic effects when overexpressed, the most striking 
of which was the production of ectopic carpelloid tissue. These genes can be 
considered key regulators of carpel development. Twelve out of twenty 35S::G2143
T1 lines showed a very severe phenotype; these plants were markedly small and had
narrow, curled, dark-green leaves. Such individuals were completely sterile and 
formed highly abnormal inflorescences; shoots often terminated in pin-like structures, 
and flowers were replaced by filamentous carpelloid structures, or a fused mass of 
carpelloid tissue. Furthermore, lateral branches usually failed to develop, and tiny 
patches of stigmatic tissue often formed at axillary nodes of the inflorescence. 
Strongly affected plants displayed the highest levels of transgene expression
(determined by RT-PCR). The remaining T1 lines showed lower levels of G2143
overexpression; these plants were still distinctly smaller than wild type, but had
relatively normal inflorescences and produced seed. Since the strongest 35S::G2143
lines were sterile, three lines with a relatively weak phenotype, that had produced
sufficient seed for biochemical analysis, were selected for further study. T2-11 plants
displayed a very mild phenotype and had somewhat small, narrow, dark green leaves. 
The other two T2 populations, however, appeared wild-type, suggesting that 
transgene activity might have been reduced between the generations. Reduced 
seedling vigor was noted in the physiological assays. G2143 expression was detected 
at low levels in flowers and siliques, and at higher levels in germinating seed. G2143 
can be used to manipulate flower form and structure or plant fertility. One application 
for manipulation of flower structure can be in the production of saffron, which is 
derived from the stigmas of Crocus sativus. 

Closely Related Genes from Other Species 

G2143 protein shared extensive homology in the basic helix loop helix region 
with a protein encoded by Glycine max cDNA clones (AW832545, BG726819 and 
BG154493) and a Lycopersicon esculentum cDNA clone (BE451174). There was
lower homology outside of the region. 

G2557: G2557 (SEQ ID NO: 133) is a member of a clade of highly related
HLH/MYC proteins that also includes G779 (SEQ ID NO: 113), G1063 (SEQ ID NO: 
119), G1499 (SEQ ID NO: 7), and G2143 (SEQ ID NO: 129). All of these genes 
caused similar pleiotropic phenotypic effects when overexpressed, the most striking 
of which was the production of ectopic carpelloid tissue. These genes can be 
considered key regulators of carpel development. The flowers of 35S::G2557 primary 
transformants displayed patches of stigmatic papillae on the sepals, and often had 
rather narrow petals and poorly developed stamens. Additionally, carpels were
occasionally held outside of the flower at the end of an elongated pedicel-like
structure. As a result of such defects, 35S::G2557 plants often showed very poor
fertility and formed small wrinkled siliques. In addition to such floral abnormalities,
the majority of primary transformants were also small and darker green in coloration
than wild type. Approximately one third of the T1 plants were extremely tiny and
completely sterile. Three T1 lines (#7, 9, 12) that had produced some seeds and
showed a relatively weak phenotype were chosen for further study. All three of the
T2 populations from these lines contained plants that were distinctly small, had
abnormal flowers, and were poorly fertile compared to controls. Stigmatic tissue was
not noted on the sepals of plants from these three T2 lines. Another line (#4) that had
shown a moderately strong phenotype in the T1 was sown for only morphological
analysis in the T2 generation. T2-4 plants were small, dark green, and produced 
abnormal flowers with ectopic stigmatic tissue on the sepals, as had been seen in the 
parental plant. G2557 expression was detected at low to moderate levels in all tissues 
tested except shoots. It was induced by cold, heat, and salt, and repressed by 
pathogen infection. G2557 can be used to manipulate flower form and structure or
plant fertility. One application for manipulation of flower structure can be in the 
production of saffron, which is derived from the stigmas of Crocus sativus. 

Closely Related Genes from Other Species 

G2557 protein shows extensive sequence similarity in the basic helix loop helix
region with a protein encoded by a Glycine max cDNA clone (BE347811).

G2430: The complete sequence of G2430 (SEQ ID NO: 697) was 
determined. G2430 is a member of the response regulator class of GARP proteins 
(ARR genes), although one of the two conserved aspartate residues characteristic of 
response regulators is not present. The second aspartate, the putative phosphorylated 
site, is retained so G2430 can have response regulator function. G2430 is specifically 
expressed in embryo and silique tissue. In morphological analyses, plants 
overexpressing G2430 showed more rapid growth than control plants at early stages, 
and in two of three lines examined produced large, flat leaves. Early flowering was 
observed for some lines, but this effect was inconsistent between plantings. G2430 
can regulate plant growth. Overexpression of G2430 in Arabidopsis also resulted in 
seedlings that are slightly more tolerant to heat in a germination assay. Seedlings 
from G2430 overexpressing transgenic plants were slightly greener than the control 
seedlings under high temperature conditions. In a repeat experiment on individual 
lines, G2430 line 15 showed the strongest heat tolerant phenotype. G2430 can be 
useful to promote faster development and reproduction in plants.

Closely Related Genes from Other Species

G2430 had some similarity within the conserved GARP and
response-regulator domains to non-Arabidopsis proteins.

G1478: The sequence of G1478 (SEQ ID NO: 831) was determined and 
G1478 was analyzed using transgenic plants in which G1478 was expressed under the 
control of the 35S promoter. Plants overexpressing G1478 had a general delay in 
progression through the life cycle, in particular a delay in flowering time. G1478 is 
expressed at higher levels in flowers, rosettes and embryos but otherwise expression 
is constitutive. Based on the phenotypes produced through G1478 overexpression, 
G1478 can be used to manipulate the rate at which plants grow, and flowering time. 

Closely Related Genes from Other Species 

G1478 shows some homology to non-Arabidopsis proteins within the 
conserved domain. 

G681: G681 (SEQ ID NO: 579) was analyzed using transgenic plants in
which the gene was expressed under the control of the 35S promoter. Approximately 
half of the 35S::G681 primary transformants were markedly small and formed narrow 
leaves compared to controls. These plants often produced thin inflorescence stems, 
had rather poorly formed flowers with low pollen production, and set few seeds. 
Three T1 lines with relatively weak phenotypes, which had produced reasonable 
quantities of seed, were selected for further study. Plants from one of the T2 
populations were noted to be slightly small, but otherwise the T2 lines displayed no 
consistent differences in morphology from controls. In leaves of two of the T2 lines, 
overexpression of G681 resulted in an increase in the percentage of the glucosinolate 
M39480. According to RT-PCR analysis, G681 expression was detected at very low 
levels in flower and rosette leaf tissues. G681 was induced by drought stress. G681 
can be used to alter glucosinolate composition in plants. Increases or decreases in 
specific glucosinolates or total glucosinolate content are desirable depending upon the 
particular application. For example: (1) Glucosinolates are undesirable components 
of the oilseeds used in animal feed, since they produce toxic effects. Low- 
glucosinolate varieties of canola have been developed to combat this problem. (2) 
Some glucosinolates have anti-cancer activity; thus, increasing the levels or 
composition of these compounds might be of interest from a nutraceutical standpoint. 
(3) Glucosinolates form part of a plant's natural defense against insects. Modification 
of glucosinolate composition or quantity could therefore afford increased protection 
from predators. Furthermore, in edible crops, tissue specific promoters can be used to 
ensure that these compounds accumulate specifically in tissues, such as the epidermis, 
which are not taken for consumption. 

Closely Related Genes from Other Species 

G681 shows some sequence similarity with known genes from other plant 
species within the conserved Myb domain. 

G878: G878 (SEQ ID NO: 611) was studied using transgenic plants in which 
the gene was expressed under the control of the 35S promoter. Analysis of primary 
transformants revealed that overexpression of G878 delays the onset of flowering in 
Arabidopsis. 11/20 of the 35S::G878 T1 plants flowered approximately one week 
later than wild type under continuous light conditions. These plants were also darker 
green, had shorter stems, and senesced later than controls. G878 was ubiquitously 
expressed. G878 can be used to modify flowering time and senescence, and a wide 
variety of applications exist for systems that either lengthen or shorten the time to 
flowering. 

Closely Related Genes from Other Species 

G878 was highly related to other WRKY proteins from a variety of plant 
species, such as the Nicotiana tabacum DNA-binding protein 2 (WRKY2) 
(AF096299), and a Cucumis sativus SPFl-like DNA-binding protein (L44134). 

G374: G374 (SEQ ID NO: 47) was expressed at low levels throughout the 
plant and was induced by salicylic acid. G374 was investigated using lines carrying a 
T-DNA insertion in this gene. The T-DNA insertion was approximately three 
quarters of the way into the protein coding sequence and should result in a null 
mutation. Homozygosity for a T-DNA insertion within G374 caused lethality at early 
stages of embryo development. In an initial screen for G374 knockouts, heterozygous 
plants were identified. Seed from those individuals was sown to soil and eleven 
plants were PCR-screened to identify homozygotes. No homozygotes were obtained; 
6 of the progeny were heterozygous whilst the other 5 were wild type. This raised the 
prospect that homozygosity for the G374 insertion was lethal. To examine this 
possibility further, heterozygous KO.G374 plants were re-grown. These individuals 
looked wild type, but their siliques were examined for seed abnormalities. When 
green siliques were dissected, around 25% of developing seeds were white or aborted. 
Embryos from these siliques were cleared using Hoyer's solution, and examined under 
the microscope. It was apparent that embryos from the white seeds had arrested at 
early (globular or heart) stages of development, whilst embryos from the normal seeds 
were fully developed. Such arrested or aborted seeds most likely represented 
homozygotes for the G374 insertion. To support this conclusion, seed was collected 
from heterozygous plants and sown to kanamycin plates (the T-DNA insertion carried 
the NPT marker gene). Of the seedlings that germinated, 160 were kanamycin 
resistant and 107 were kanamycin sensitive. These data more closely fitted a 2:1 (chi-sq., 
1 df, = 5.5, 0.05>P>0.01) than a 3:1 (chi-sq., 1 df, = 32, P<0.001) ratio. Such a 
segregation ratio suggested that a homozygous class of kanamycin resistant seedlings 
was absent from the progeny of KO.G374 plants. G374 can be a herbicide target. 
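
By way of illustration only, the segregation test described above can be reproduced with a short calculation. The following Python sketch is not part of the original analysis: the function names chi_square_two_class and p_value_1df are hypothetical, and the only data used are the 160 kanamycin resistant and 107 kanamycin sensitive seedlings reported above. It computes the Pearson chi-square statistic against 2:1 and 3:1 expectations, together with the corresponding probability for one degree of freedom.

```python
import math


def chi_square_two_class(observed, ratio):
    """Pearson chi-square statistic for two observed counts against an expected ratio."""
    total = sum(observed)
    ratio_total = sum(ratio)
    # Expected counts are the total number of seedlings split according to the ratio.
    expected = [total * part / ratio_total for part in ratio]
    return sum((obs - exp) ** 2 / exp for obs, exp in zip(observed, expected))


def p_value_1df(chi_sq):
    """Upper-tail probability of a chi-square statistic with one degree of freedom."""
    # For one degree of freedom the survival function reduces to erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi_sq / 2.0))


# Counts taken from the text: kanamycin resistant, kanamycin sensitive.
observed_counts = (160, 107)

for expected_ratio in ((2, 1), (3, 1)):
    statistic = chi_square_two_class(observed_counts, expected_ratio)
    print(expected_ratio, round(statistic, 1), p_value_1df(statistic))
```

A smaller statistic indicates a closer fit, so the 2:1 ratio expected when the homozygous insertion class is missing fits these counts better than the 3:1 ratio expected when all genotypes are viable.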

Closely Related Genes from Other Species 

Similar sequences to G374 are present in tomato and Medicago truncatula, and 
these sequences can be orthologs. 

Example VIII: Identification of Homologous Sequences 

Homologous sequences from Arabidopsis and plant species other than 
Arabidopsis were identified using database sequence search tools, such as the Basic 
Local Alignment Search Tool (BLAST) (Altschul et al. (1990) J. Mol. Biol. 215:403-410; 
and Altschul et al. (1997) Nucl. Acids Res. 25: 3389-3402). The tblastx sequence 
analysis programs were employed using the BLOSUM-62 scoring matrix (Henikoff, 
S. and Henikoff, J. G. (1992) Proc. Natl. Acad. Sci. USA 89: 10915-10919). 

Identified non-Arabidopsis sequences homologous to the Arabidopsis 
sequences are provided in Table 5. The percent sequence identity among these 
sequences can be as low as 47%, or even lower. The entire NCBI 
GenBank database was filtered for sequences from all plants except Arabidopsis 
thaliana by selecting all entries in the NCBI GenBank database associated with NCBI 
taxonomic ID 33090 (Viridiplantae; all plants) and excluding entries associated with 
taxonomic ID 3701 (Arabidopsis thaliana). These sequences were compared to 
sequences representing genes of SEQ ID NOs: 2 - 2N, where N = 2-561, using the 
Washington University TBLASTX algorithm (version 2.0a19MP) at the default 
settings using gapped alignments with the filter "off". For each gene of SEQ ID 
NOs: 2 - 2N, where N = 2-561, individual comparisons were ordered by probability 
score (P-value), where the score reflects the probability that a particular alignment 
occurred by chance. For example, a score of 3.6e-40 is 3.6 x 10^-40. In addition to 
P-values, comparisons were also scored by percentage identity. Percentage identity 
reflects the degree to which two segments of DNA or protein are identical over a 
particular length. Examples of sequences so identified are presented in Table 5. 
Homologous or orthologous sequences are readily identified and available in 
GenBank by Accession number (Table 5; Test sequence ID). The identified 
homologous polynucleotide and polypeptide sequences and homologues of the 
Arabidopsis polynucleotides and polypeptides may be orthologs of the Arabidopsis 
polynucleotides and polypeptides (TBD: to be determined). 
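
The filtering and ordering steps described in this Example can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions and is not the software used to generate Table 5: the Hit record, the helper names, the accession labels, the placeholder species IDs (900001, 900002) and the aligned segments are all hypothetical. Only the two taxonomic IDs (33090 and 3701), the P-value ordering and the notion of percentage identity are taken from the text.

```python
from dataclasses import dataclass

VIRIDIPLANTAE_TAXID = 33090          # taxonomic ID cited in the text
ARABIDOPSIS_THALIANA_TAXID = 3701    # taxonomic ID cited in the text


@dataclass
class Hit:
    accession: str          # hypothetical label standing in for a GenBank accession
    lineage_taxids: tuple   # taxonomic IDs associated with the database entry
    p_value: float          # probability that the alignment occurred by chance
    query_segment: str      # aligned residues from the Arabidopsis query
    subject_segment: str    # aligned residues from the database entry


def is_non_arabidopsis_plant(hit):
    """Keep plant entries while excluding Arabidopsis thaliana entries."""
    return (VIRIDIPLANTAE_TAXID in hit.lineage_taxids
            and ARABIDOPSIS_THALIANA_TAXID not in hit.lineage_taxids)


def percent_identity(segment_a, segment_b):
    """Percentage of aligned positions at which the two segments are identical."""
    if len(segment_a) != len(segment_b) or not segment_a:
        raise ValueError("segments must be aligned to the same non-zero length")
    matches = sum(1 for a, b in zip(segment_a, segment_b) if a == b and a != "-")
    return 100.0 * matches / len(segment_a)


def rank_hits(hits):
    """Filter out Arabidopsis entries and order the rest by ascending P-value."""
    kept = [hit for hit in hits if is_non_arabidopsis_plant(hit)]
    return sorted(kept, key=lambda hit: hit.p_value)


# Hypothetical hits for illustration only; none of these values come from Table 5.
example_hits = [
    Hit("HYPO_0001", (33090, 900001), float("3.6e-40"), "MKVLAAG", "MKVLSAG"),
    Hit("HYPO_0002", (33090, 3701), float("1.0e-80"), "MKVLAAG", "MKVLAAG"),
    Hit("HYPO_0003", (33090, 900002), float("2.0e-55"), "MKVLAAG", "MRVLSAG"),
]

for hit in rank_hits(example_hits):
    identity = percent_identity(hit.query_segment, hit.subject_segment)
    print(hit.accession, hit.p_value, round(identity, 1))
```

In this sketch the Arabidopsis entry is dropped even though it has the smallest P-value, and the remaining entries are listed from the least to the most likely to have arisen by chance.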

Example IX Introduction of polynucleotides into dicotyledonous plants 

SEQ ID NOs: 1-(2N - 1), wherein N = 2-561, paralogous, orthologous, and 
homologous sequences recombined into pMEN20 or pMEN65 expression vectors are 
transformed into a plant for the purpose of modifying plant traits. The cloning vector 
may be introduced into a variety of cereal plants by means well-known in the art such 
as, for example, direct DNA transfer or Agrobacterium tumefaciens-mediated 
transformation. It is now routine to produce transgenic plants using most dicot plants 
(see Weissbach and Weissbach (1989) supra; Gelvin et al. (1990) supra; Herrera-Estrella 
et al. (1983) supra; Bevan (1984) supra; and Klee (1985) supra). Methods 
for analysis of traits are routine in the art and examples are disclosed above. 



Example X Transformation of Cereal Plants with an Expression Vector 

Cereal plants such as corn, wheat, rice, sorghum or barley, may also be 
transformed with the present polynucleotide sequences in pMEN20 or pMEN65 
expression vectors for the purpose of modifying plant traits. For example, pMEN020 
may be modified to replace the NptII coding region with the BAR gene of 
Streptomyces hygroscopicus that confers resistance to phosphinothricin. The KpnI 
and BglII sites of the Bar gene are removed by site-directed mutagenesis with silent 
codon changes. 

The cloning vector may be introduced into a variety of cereal plants by means 
well-known in the art such as, for example, direct DNA transfer or Agrobacterium 
tumefaciens-mediated transformation. It is now routine to produce transgenic plants 
of most cereal crops (Vasil, I., Plant Molec. Biol. 25: 925-937 (1994)) such as corn, 
wheat, rice, sorghum (Cassas, A. et al., Proc. Natl. Acad. Sci. USA 90: 11212-11216 
(1993)) and barley (Wan, Y. and Lemeaux, P., Plant Physiol. 104:37-48 (1994)). DNA 
transfer methods such as microprojectile bombardment can be used for corn (Fromm et al., 
Bio/Technology 8: 833-839 (1990); Gordon-Kamm et al., Plant Cell 2: 603-618 
(1990); Ishida, Y., Nature Biotechnology 14:745-750 (1990)), wheat (Vasil et al., 
Bio/Technology 10:667-674 (1992); Vasil et al., Bio/Technology 11:1553-1558 
(1993); Weeks et al., Plant Physiol. 102:1077-1084 (1993)), rice (Christou, 
Bio/Technology 9:957-962 (1991); Hiei et al., Plant J. 6:271-282 (1994); Aldemita 
and Hodges, Planta 199:612-617 (1996); Hiei et al., Plant Mol. Biol. 35:205-18 (1997)). For 
most cereal plants, embryogenic cells derived from immature scutellum tissues are the 
preferred cellular targets for transformation (Hiei et al., Plant Mol. Biol. 35:205-18 
(1997); Vasil, Plant Molec. Biol. 25: 925-937 (1994)). 

Vectors according to the present invention may be transformed into corn 
embryogenic cells derived from immature scutellar tissue by using microprojectile 
bombardment, with the A188XB73 genotype as the preferred genotype (Fromm, et 
al., Bio/Technology 8: 833-839 (1990); Gordon-Kamm et al., Plant Cell 2: 603-618 
(1990)). After microprojectile bombardment the tissues are selected on 
phosphinothricin to identify the transgenic embryogenic cells (Gordon-Kamm et al., 
Plant Cell 2: 603-618 (1990)). Transgenic plants are regenerated by standard corn 
regeneration techniques (Fromm, et al., Bio/Technology 8: 833-839 (1990); Gordon- 
Kamm et al., Plant Cell 2: 603-618 (1990)). 

The plasmids prepared as described above can also be used to produce 
transgenic wheat and rice plants (Christou, Bio/Technology 9:957-962 (1991); Hiei et 
al., Plant J. 6:271-282 (1994); Aldemita and Hodges, Planta 199:612-617 (1996); 
Hiei et al., Plant Mol. Biol. 35:205-18 (1997)) that coordinately express genes of 
interest by following standard transformation protocols known to those skilled in the 
art for rice and wheat (Vasil et al., Bio/Technology 10:667-674 (1992); Vasil et al., 
Bio/Technology 11:1553-1558 (1993); Weeks et al., Plant Physiol. 102:1077-1084 
(1993)), where the bar gene is used as the selectable marker. 

All references, publications, patent documents, web pages, and other 
documents cited or mentioned herein are hereby incorporated by reference in their 
entirety for all purposes. Although the invention has been described with reference to 
specific embodiments and examples, it should be understood that one of ordinary skill 
can make various modifications without departing from the spirit of the invention. 
The scope of the invention is not limited to the specific embodiments and examples 
provided. 

We claim: 

1. A transgenic plant comprising a recombinant polynucleotide having a 
nucleotide sequence selected from the group consisting of: 

(a) a nucleotide sequence encoding a polypeptide comprising a sequence selected 
from those of SEQ ID NOs: 860, 802, 240, 274, 558, 24, 1120, 44, 460, 286, 120, 
130, 134, 698, 832, 580, 612, and 48, or a complementary nucleotide sequence 
thereof; 

(b) a nucleotide sequence of SEQ ID NOs: 859, 801, 239, 273, 557, 23, 1119, 43, 459, 
285, 119, 129, 133, 697, 831, 579, 611, 47, or a complementary nucleotide sequence 
thereof; and 

(c) a nucleotide sequence that hybridizes under stringent conditions to a nucleotide 
sequence of one or more polynucleotides of: (a) or (b). 

2. The transgenic plant of claim 1 wherein the transgenic plant possesses an 
altered trait as compared to another plant, or the transgenic plant exhibits an altered 
phenotype as compared to another plant, or the transgenic plant expresses an altered 
level of one or more genes associated with a plant trait as compared to another plant, 
wherein the other plant does not comprise the recombinant polynucleotide. 

3. The transgenic plant of claim 1 wherein the plant possesses an altered trait as 
compared to another plant wherein the trait is an alteration in one or more physical 
characteristics selected from the group consisting of: the number of trichomes, fruit 
and seed size and number, yield of stems, leaves, inflorescences, or roots, stability of 
seeds during storage, susceptibility of the seed to shattering, root hair length and 
quantity, internode distances, or the quality of seed coat. 

4. The transgenic plant of claim 1 wherein the plant possesses an altered trait as 
compared to another plant wherein the trait is an alteration in a plant growth 
characteristic selected from the group consisting of: growth rate, germination rate of 
seeds, vigor of plants and seedlings, leaf and flower senescence, male sterility, 
apomixis, flowering time, flower abscission, rate of nitrogen uptake, osmotic 
sensitivity to soluble sugar concentrations, biomass or transpiration characteristics, 
apical dominance, branching patterns, number of organs, organ identity, and organ 
shape or size. 



5. The transgenic plant of claim 1 wherein the plant possesses an altered trait as 
compared to another plant wherein the trait is an alteration in one or more 
characteristics selected from the group consisting of protein or oil production, seed 
protein or oil production, insoluble sugar level, soluble sugar level, and starch 
composition. 

6. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO: 860. 

7. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:802. 

8. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:240. 

9. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:274. 

10. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:558. 

11. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:24. 

12. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:1120. 

13. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:44. 

14. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:460. 

15. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:286. 

16. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO: 120. 

17. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO: 130. 

18. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO: 134. 

19. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:698. 

20. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:832. 

21. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:580. 

22. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:612. 

23. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:48. 

24. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence of SEQ ID NO:859. 

25. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence of SEQ ID NO: 801. 

26. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence of SEQ ID NO:239. 

27. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence of SEQ ID NO:273. 

28. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence of SEQ ID NO:557. 

29. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence of SEQ ID NO:23. 

30. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence of SEQ ID NO:1119. 

31. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence of SEQ ID NO:43. 

32. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence of SEQ ID NO:459. 

33. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence of SEQ ID NO:285. 

34. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence of SEQ ID NO:119. 

35. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence of SEQ ID NO: 129. 

36. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence of SEQ ID NO: 133. 

37. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence of SEQ ID NO:697. 

38. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence of SEQ ID NO:831. 

39. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence of SEQ ID NO:579. 

40. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence of SEQ ID NO:611. 

41. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence of SEQ ID NO:47. 

42. The transgenic plant of claim 1, further comprising a constitutive, inducible, 
or tissue-specific promoter operably linked to said nucleotide sequence. 

43. The transgenic plant of claim 1, wherein the plant is selected from the group 
consisting of: soybean, wheat, corn, potato, cotton, rice, oilseed rape, sunflower, 
alfalfa, sugarcane, turf, banana, blackberry, blueberry, strawberry, raspberry, 
cantaloupe, carrot, cauliflower, coffee, cucumber, eggplant, grapes, honeydew, 
lettuce, mango, melon, onion, papaya, peas, peppers, pineapple, pumpkin, spinach, 
squash, sweet corn, tobacco, tomato, watermelon, mint and other labiates, rosaceous 
fruits, and vegetable brassicas. 

44. The transgenic plant of claim 1 wherein the encoded polypeptide is expressed 
and regulates transcription of a gene. 

45. A method of using the transgenic plant of claim 1 to grow a progeny plant 
from a parent plant, the method comprising crossing the transgenic plant with another 
plant, selecting seed, and growing the progeny plant from the seed. 

46. An isolated or recombinant polynucleotide comprising a nucleotide sequence 
selected from the group consisting of: 

(a) a nucleotide sequence encoding a polypeptide comprising a sequence selected 
from SEQ ID NOs: 240, 274, 558, 286, 698, and 832, or a complementary nucleotide 
sequence thereof; 

(b) a nucleotide sequence of SEQ ID NOs:239, 273, 557, 285, 697, 831, or a 
complementary nucleotide sequence thereof; and 

(c) a nucleotide sequence that hybridizes under stringent conditions to a nucleotide 
sequence of one or more of: (a) or (b). 

47. The isolated or recombinant polynucleotide of claim 46, wherein the 
recombinant polynucleotide comprises a nucleotide sequence encoding a polypeptide 
of SEQ ID NO:240. 

48. The isolated or recombinant polynucleotide of claim 46, wherein the 
recombinant polynucleotide comprises a nucleotide sequence encoding a polypeptide 
of SEQ ID NO:274. 

49. The isolated or recombinant polynucleotide of claim 46, wherein the 
recombinant polynucleotide comprises a nucleotide sequence encoding a polypeptide 
of SEQ ID NO:558. 

50. The isolated or recombinant polynucleotide of claim 46, wherein the 
recombinant polynucleotide comprises a nucleotide sequence encoding a polypeptide 
of SEQ ID NO:286. 

51. The isolated or recombinant polynucleotide of claim 46, wherein the 
recombinant polynucleotide comprises a nucleotide sequence encoding a polypeptide 
of SEQ ID NO:698. 

52. The isolated or recombinant polynucleotide of claim 46, wherein the 
recombinant polynucleotide comprises a nucleotide sequence encoding a polypeptide 
of SEQ ID NO:832. 

53. The isolated or recombinant polynucleotide of claim 46, wherein the 
recombinant polynucleotide comprises a nucleotide sequence of SEQ ID NO:239. 

54. The isolated or recombinant polynucleotide of claim 46, wherein the 
recombinant polynucleotide comprises a nucleotide sequence of SEQ ID NO:273. 

55. The isolated or recombinant polynucleotide of claim 46, wherein the 
recombinant polynucleotide comprises a nucleotide sequence of SEQ ID NO:557. 

56. The isolated or recombinant polynucleotide of claim 46, wherein the 
recombinant polynucleotide comprises a nucleotide sequence of SEQ ID NO:285. 

57. The isolated or recombinant polynucleotide of claim 46, wherein the 
recombinant polynucleotide comprises a nucleotide sequence of SEQ ID NO:697. 

58. The isolated or recombinant polynucleotide of claim 46, wherein the 
recombinant polynucleotide comprises a nucleotide sequence of SEQ ID NO:831. 

59. The isolated or recombinant polynucleotide of claim 46, further comprising a 
constitutive, inducible, or tissue-specific promoter operably linked to the nucleotide 
sequence. 

60. The isolated or recombinant polynucleotide of claim 46 wherein the encoded 
polypeptide is expressed and regulates transcription of a gene. 

61. A vector comprising the isolated or recombinant polynucleotide of claim 46. 

62. A host cell comprising the vector of claim 61. 

63. A method of using the isolated or recombinant polynucleotide of claim 46 for 
producing a plant having a modified trait, the method comprising selecting a 
polynucleotide that encodes a polypeptide, inserting the polynucleotide into an 
expression vector, introducing the vector into a plant or a cell of a plant to 
overexpress the polypeptide, thereby producing a modified plant, and selecting a 
modified plant for a modified trait. 



64. The method of claim 63 wherein the plant possesses a modified trait as 
compared to another plant wherein the trait is an alteration in one or more physical 
characteristics selected from the group consisting of: the number of trichomes, fruit 
and seed size and number, yield of stems, leaves, inflorescences, or roots, stability of 
seeds during storage, susceptibility of the seed to shattering, root hair length and 
quantity, internode distances, or the quality of seed coat. 

65. The method of claim 63 wherein the plant possesses a modified trait as compared 
to another plant wherein the trait is an alteration in a plant growth characteristic 
selected from the group consisting of: growth rate, germination rate of seeds, vigor of 
plants and seedlings, leaf and flower senescence, male sterility, apomixis, flowering 
time, flower abscission, rate of nitrogen uptake, osmotic sensitivity to soluble sugar 
concentrations, biomass or transpiration characteristics, apical dominance, branching 
patterns, number of organs, organ identity, and organ shape or size. 

66. The method of claim 63 wherein the plant possesses a modified trait as 
compared to another plant wherein the trait is an alteration in one or more 
characteristics selected from the group consisting of protein or oil production, seed 
protein or oil production, insoluble sugar level, soluble sugar level, and starch 
composition. 



67. A modified plant produced by the method of claim 63. 



68. A method of using the plant of claim 67 to grow a progeny plant from a parent 
plant, the method comprising crossing the transgenic plant with another plant, 
selecting seed, and growing the progeny plant from the seed. 

69. The plant produced by the method of claim 68. 

SEQUENCE LISTING 

<110> Mendel Biotechnology, Inc. 
Ratcliffe, Oliver 
Riechmann, Jose Luis 
Adam, Luc J. 
Dubell, Arnold T. 
Heard, Jacqueline E. 
Pilgrim, Marsha L. 
Jiang, Cai-Zhong 
Reuber, T. Lynne 
Creelman, Robert A. 
Pineda, Omaira 
Yu, Guo-Liang 
Broun, Pierre E. 

<120> YIELD-RELATED POLYNUCLEOTIDES AND 
POLYPEPTIDES IN PLANTS 

<130> 514442002041 

<150> 60/310,847 
<151> 2001-08-09 

<150> 60/336,049 
<151> 2001-11-19 

<150> 60/338,692 
<151> 2001-12-11 

<150> 10/171,468 
<151> 2002-06-14 

>G1275 (58..579) 

CCAAGAAAAGGGAAGATCACGCATTCTTATAGGCGTAATTCGTAAATAGTGGTGAGTATG 
AATGATGCAGACACAAACTTGGGGAGTAGTTTCAGCGATGATACTCACTCTGTGTTCGAG 
TTTCCGGAGCTAGACTTGTCAGATGAATGGATGGATGATGATCTTGTGTCTGCGGTTTCC 
GGGATGAATCAGTCTTATGGTTATCAGACTAGTGATGTTGCTGGTGCTTTATTCTCAGGT 
TCTTCTAGCTGTTTCAGTCATCCTGAATCTCCAAGTACCAAAACTTATGTTGCTGCTACA 
GCCACTGCTTCTGCCGACAACCAAAACAAGAAAGAAAAGAAAAAAATTAAAGGGAGAGTT 
GCGTTCAAGACACGGTCCGAGGTGGAAGTGCTTGACGACGGGTTCAAGTGGAGAAAGTAT 
GGGAAGAAGATGGTGAAGAACAGCCCACATCCAAGAAACTACTACAAATGTTCAGTTGAT 
GGCTGTCCCGTGAAGAAAAGGGTTGAACGAGACAGAGATGATCCGAGCTTTGTGATAACA 
ACTTACGAGGGTTCCCACAATCACTCAAGCATGAACTAAGACTCGAACTAAGGCTCAAGG 
CGACCATGCTATATTCAGCACATCTTATTTTCTATGGTTACGAACGATACTTAAAACTGC 
TTCTAGTTCTTTATATCCATTGTAAACTGGTTGCAGGTTCACAAATTTTGAGAGGTTTAT 
GACATTCTAAATCTGTAGTACTTATATA 

>G1275 Amino Acid Sequence (domain in AA coordinates: 113-169) 
MNDADTNLGSSFSDDTHSVFEFPELDLSDEWMDDDLVSAVSGMNQSYGYQTSDVAGALFS 
GSSSCFSHPESPSTKTYVAATATASADNQNKKEKKKIKGRVAFKTRSEVEVLDDGFKWRK 
YGKKMVKNSPHPRNYYKCSVDGCPVKKRVERDRDDPSFVITTYEGSHNHSSMN* 
>G1411 (110..856) 

TAAAGAAAAACTGAACAACCCTAAAGTACTGTATAAATCCTATATCAAATTTTTTTTTTG 
GAAGAAAAGGCTATATTTAAAAGAAAATCAAGCAAAAGTAGATCCTCGGATGTATGGGAA 
GAGGCCTTTTGGAGGTGATGAATCTGAAGAAAGGGAAGAAGATGAGAACTTGTTCCCGGT 
CTTCTCGGCCCGATCTCAACACGACATGCGTGTTATGGTCTCGGCCTTGACTCAAGTAAT 
I™ ccc ^ gagagggtoc ^^ 

acaaatggaaccacaaagtataccgaactataatcaatactatcatgat^ga^gtS 

TGATATGCTAAGTTTTAAT^^ 

^ A ^f AATAGTACTACGACTGCTGCTAC ^ CTTC TTCGTCT^ 
GC^CAAGAAGAGCAAGATTATGCCAGAlTCTGGCGCTTTGGGGATTCTTcS^^ 

tcattcgggatattaattaggagatttgatcagttacttgtgatSaIctSS 
tcccgtcaaaattgagatgatcatatgcttcctgaatgtttttgS 

taesaalaydeaalkfkgskaki^fpervqi^snstyyssnqipSpS^™ 

>G1488 (1..996) 
MGaAASMOAAQa^TaAATTCTT^ 

accaccataaccgacagctctaacttctccgctgctgatcttcccagStccac^^ 

GTTCAAGACGGCACTAGCTTCTCCGGTGACCTTTGTATACcS^^^ 

G ™ AGAGTGGCTTTC ^ 

ctcgagctaatatccggttttaagagtcgaccggacccgaaatccgataS 

GAAAACCOSAATAGCAG^GTCCGATTTITACTACCGAcSS 

aga ^ g ^ cgctcacgg ^cgctgcgtgtaattgggcctcacgtgggot^ 

AC ^™ ACGA ^ GTCOT ^^ 

CACAACGTTGGCCCAGATTTCAGACAGCTTATTTGA 1AUTTGATCCAC 



^f SGDLClpsDD ^ EL ^^^ 

^SSPIFTTDVSVPAKARSKRSRA^^ 
P_PTSPPL L MAP^KKQAVD^^ 



>G1499 (159..833) 



CAGCCATC 
TGAGGATC 
TACGCTATGTCAAGTTCTTGAAACGGCAGATCCGGCTACTCAATAATAATACCGGATATA 

CTCCTCCGCCGCCGCAAGATCAAGCTTCTCAGGCGGTGACGACGTCATGGGTTTCACCGC 

CACCACCGCCAAGTTTCGGCCGTGGGGGAAGAGGAGTAGGAGAATTAATCTAGACAAGAT 

GACATTTCCATTAGTAGTAACTAAATTATGCTATAATGTGTGAGTAATGGTGCAATTATG 
GA 

>G1499 Amino Acid Sequence (domain in AA coordinates: 118-181) 
MNNYNMNPSLFQNYTWNNIIN 

HFPPLSSSLTTTTLLSGDQEDDEDEEEPLEELGAMKEMMYKIAAMQSVDIDPATVKKPKR 
RNVRISDDPQSVAARHRRERISERIRILQRLVPGGTKMDTASMLDEAIRYVKFLKRQIRL 
LNNNTGYTPPPPQDQASQAVTTSWSPPPPPSFGRGGRGVGELI* 
>G1543 (1..828) 

ATGATAAAACTACTATTTACGTACATATGCACATACACATATAAACTATATGCTCTATAT 

CATATGGATTACGCATGCGTGTGTATGTATAAATATAAAGGCATCGTCACGCTTCAAGTT 

TGTCTCTTTTATATTAAACTGAGAGTTTTCCTCTCAAACTTTACCTTTTGTTCTTCGATC 

CTAGCTCTTAAGAACCCTAATAATTCATTGATCAAAATAATGGCGATTTTGCCGGAAAAC 

TCTTCAAACTTGGATCTTACTATCTCCGTTCCAGGCTTCTCTTCATCCCCTCTCTCCGAT 

GAAGGAAGTGGCGGAGGAAGAGACCAGCTAAGGCTAGACATGAATCGGTTACCGTCGTCT 

GAAGACGGAGACGATGAAGAATTCAGTCACGATGATGGCTCTGCTCCTCCGCGAAAGAAA 

CTCCGTCTAACCAGAGAACAGTCACGTCTTCTTGAAGATAGTTTCAGACAGAATCATACC 

CTTAATCCCAAACAAAAGGAAGTACTTGCCAAGCATTTGATGCTACGGCCAAGACAAATT 

GAAGTTTGGTTTCAAAACCGTAGAGCAAGGAGCAAATTGAAGCAAACCGAGATGGAATGC 

GAGTATCTCAAAAGGTGGTTTGGTTCATTAACGGAAGAAAACCACAGGCTCCATAGAGAA 

GTAGAAGAGCTTAGAGCCATAAAGGTTGGCCCAACAACGGTGAACTCTGCCTCGAGCCTT 

ACTATGTGTCCTCGCTGCGAGCGAGTTACCCCTGCCGCGAGCCCTTCGAGGGCGGTGGTG 

CCGGTTCCGGCTAAGAAAACGTTTCCGCCGCAAGAGCGTGATCGTTGA 

>G1543 Amino Acid Sequence (domain in AA coordinates: 135-195) 

MIKLLFTYICTYTYKLYALYHMDYACVCMYKYKGIVTLQVCLFYIKLRVFLSNFTFSSSI 

LALKNPNNSLIKIMAILPENSSNLDLTISVPGFSSSPLSDEGSGGGRDQLRLDMNRLPSS 

EDGDDEEFSHDDGSAPPRKKLRLTREQSRLLEDSFRQNHTLNPKQKEVLAKHLMLRPRQI 

EVWFQNRRARSKLKQTEMECEYLKRWFGSLTEENHRLHREVEELRAIKVGPTTVNSASSL 
TMCPRCERVTPAASPSRAVVPVPAKKTFPPQERDR* 
>G1635 (1..1164) 

ATGGCGTCGTCTCCGTTGACTGCAAATGTTCAGGGTACCAACGCTTCTTTGAGGAATAGA 
GATGAAGAAACTGCAGACAAGCAGATACAATTCAATGACCAAAGTTTTGGGGGAAATGAC 
TATGCACCCAAGGTACGGAAGCCATACACGATAACAAAAGAGAGAGAGAGATGGACAGAT 
GAAGAGCACAAGAAGTTTGTTGAAGCCTTGAAATTATACGGGCGAGCTTGGAGACGAATA 
GAAGAACATGTGGGCTCAAAGACCGCAGTTCAGATTCGAAGCCATGCTCAGAAGTTTTTC 
TCTAAGGTTGCTCGAGAAGCAACTGGAGGTGATGGGAGCTCAGTAGAGCCGATTGTAATA 
CCTCCTCCTCGTCCCAAGAGAAAGCCAGCGCATCCGTACCCTCGTAAGTTTGGGAACGAG 
GCAGATCAAACAAGTAGATCGGTTTCTCCCTCAGAACGTGATACTCAATCTCCAACCTCT 
GTGTTGTCCACTGTTGGATCAGAAGCATTGTGTTCCCTTGATTCGAGTTCACCCAATCGA 
AGCTTGTCCCGAGTTTCTTCTGCATCACC^CCAGCTGCTCTTACAACCACTGCAAATGCA 
CCTGAAGAGCTTGAGACTCTGAAGCTGGAGTTGTTTCCTAGTGAGAGACTCTTAAACAGG 
GAGAGCTCGATCAAGGAACCAACGAAGCAAAGTCTTAAACTCTTTGGGAAGACAGTTTTG 
GTATCTGATT(^GGCATGTCCTCTTCTCTAACAACTTCAACATATTGTAAATCCCCAATT 
CAGCCATTACCACGGAAACTCTCATCATCCAAGACACTACCCATAATAAGAAACTCACAA 
GAAGAACTCTTGAGCTGCTGGATACAAGTCCCTCTTAAGCAAGAAGATGTGGAAAATAGA 
TGTTTGGATTCAGGAAAGGCTGTCCAAAACGAAGGATCATCGACTGGATCAAACACTGGT 
TCGGTGGATGATACGGGACACACGGAAAAGACCACAGAACCCGAAACAATGCTATGTCAA 
TGGGAGTTTAAACCAAGTGAGAGGTCTGCATTTTCTGAGCTCAGAAGAACAAACTCCGAG 
TCAAATTCAAGAGGATTTGGTCCATACAAGAAGAGAAAGATGGTAACAGAAGAAGAAGAG 
CATGAGATTCATCTCCACTTATAA 

>G1635 Amino Acid Sequence (domain in AA coordinates: 44-104) 
MASSPLTANVQGTNASLRNRDEETADKQIQFNDQSFGGNDYAPKVRKPYTITKERERWTD
EEHKKFVEALKLYGRAWRRIEEHVGSKTAVQIRSHAQKFFSKVAREATGGDGSSVEPIVI 
PPPRPKRKPAHPYPRKFGNEADQTSRSVSPSERDTQSPTSVLSTVGSEALCSLDSSSPNR 
SLSPVSSASPPAALTTTANAPEELETLKLELFPSERLLNRESSIKEPTKQSLKLFGKTVL 
VSDSGMSSSLTTSTYCKSPIQPLPRKLSSSKTLPIIRNSQEELLSCWIQVPLKQEDVENR

CLDSGKAVQNEGSSTGSNTGSVDDTGHTEKTTEPETMLCQWEFKPSERSAFSELRRTNSE

SNSRGFGPYKKRKMVTEEEEHEIHLHL*

>G1794 (160..1335)

TCTTTCTTTCTTCCTCTTTGTCTCTGTTTCTTGTTTCTCTCTCTCTCTCTCTACAGAGTT 

TTCTTTCCCTCGAAGAAAAAGAATATTTTTAAATTTAATTTTCTCTGCGTTTATAAGCTT 

TAAGTTTCAGAGGAGGAGGATTTAGAAGGAGGGTTTTGTATGTGTGTCTTAAAAGTGGCA 

AATCAGGAAGATAACGTTGGCAAAAAAGCCGAGTCTATTAGAGACGATGATCATCGGACG 

TTATCTGAAATCGATCAATGGCTTTACTTATTCGCAGCCGAAGACGACCACCACCGTCAT 

AGCTTCCCTACGCAGCAGCCGCCTCCATCGTCGTCGTCCTCATCTCTTATCTCAGGTTTC 

AGTAGAGAGATGGAGATGTCTGCTATTGTCTCTGCTTTGACTCACGTTGTTGCTGGAAAT 

GTTCCTCAGCATCAACAAGGAGGCGGTGAAGGTAGCGGAGAAGGGACTTCGAATTCGTCT 

TCTTCCTCGGGGCAGAAAAGGAGGAGAGAGGTGGAGGAAGGTGGCGCCAAAGCGGTTAAG 

GCAGCTAATACTTTGACGGTTGATCAATATTTCTCCGGTGGTAGCTCTACTTCTAAAGTG 

AGAGAAGCTTCGAGTAACATGTCAGGTCCGGGCCCAACATACGAGTATACAACTACGGCA 

ACTGCTAGTAGCGAAACGTCGTCGTTTAGTGGGGACCAACCTCGGCGAAGATACAGAGGA 

GTTAGACAAAGACCATGGGGAAAGTGGGCGGCTGAGATTCGAGATCCATTTAAAGCAGCT 

AGAGTTTGGCTCGGTACGTTCGACAATGCTGAATCAGCAGCAAGAGCTTACGACGAAGCT 

GCACTTCGGTTTAGAGGCAACAAAGCCAAACTCAACTTCCCTGAAAACGTCAAACTCGTT 

AGACCTGCTTCAACCGAAGCACAACCTGTGCACCAAACCGCTGCTCAAAGACCGACCCAG 

TCAAGGAACTCGGGTTCAACGACTACCCTTTTGCCCATAAGACCTGCTTCGAATCAAAGC 

GTTCATTCGCAGCCGTTGATGCAATCATACAACTTGAGTTACTCTGAAATGGCTCGTCAA 

CAACAACAGTTTCAGCAACATCATCAACAATCTTTGGATTTATACGATCAAATGTCGTTT 

CCGTTGCGTTTCGGTCACACTGGAGGTTCAATGATGCAATCTACGTCGTCATCATCATCT 

CATTCTCGTCCTCTGTTTTCCCCGGCTGCTGTTCAGCCGCCACCAGAATCAGCTAGCGAA 

ACCGGTTATCTCCAGGATATACAATGGCCATCAGACAAGACTAGTAATAACTACAATAAT 

AGTCCATCCTCCTGATGACTTGCTTCATTTTATTTGTTTCACTATAGAGTAATAGAAAAC 

AGGAAAATGATTATATGTTATAGAGTTATTTTTCCAAATATTATAGGGTTTAGGTTGTTT 

GTATTGTTCTGCTTTCATCCTCTCATGCTTTTTTTCTTAATTTATTATATTTTTGCATTA 

TAATTTCGTTTCATTGTAACAAACATTAAAAAGACCACATGGAGAAAGGAAAAAAAAGAG 
AG 

>G1794 Amino Acid Sequence (domain in AA coordinates: TBD) 

MCVLKVANQEDNVGKKAESIRDDDHRTLSEIDQWLYLFAAEDDHHRHSFPTQQPPPSSSS

SSLISGFSREMEMSAIVSALTHVVAGNVPQHQQGGGEGSGEGTSNSSSSSGQKRRREVEE

GGAKAVKAANTLTVDQYFSGGSSTSKVREASSNMSGPGPTYEYTTTATASSETSSFSGDQ

PRRRYRGVRQRPWGKWAAEIRDPFKAARVWLGTFDNAESAARAYDEAALRFRGNKAKLNF

PENVKLVRPASTEAQPVHQTAAQRPTQSRNSGSTTTLLPIRPASNQSVHSQPLMQSYNLS

YSEMARQQQQFQQHHQQSLDLYDQMSFPLRFGHTGGSMMQSTSSSSSHSRPLFSPAAVQP

PPESASETGYLQDIQWPSDKTSNNYNNSPSS*

>G1839 (38..592)

ATCACAGTTATGTTTCCATTCATTGGCTATAAAA 

ACACCATTTGCAGGAAAAAATGAATAGTTGTCAGTCTAATCCCACCAAAATGGATAATTC 
AGAAAATGTTCTATTTAATGATCAAAACGAAAATTT 

TTCTTCGTACTTGACAAGAGATCAAGAGCACGAGATCATGGTCTCTGCTCTGCGACAAGT 

GATATCTAACTCCGGAGCTGACGACGCGTCATCATCAAACTTGATCATCACAAGCGTTCC 

GCCTCCAGACGCTGGCCCTTGTCCTCTCTGTGGCGTCGCCGGTTGCTACGGCTGCACATT 

ACAACGGCCGCACCGAGAGGTAAAGAAGGAGAAGAAATACAAAGGAGTAAGGAAAAAACC 

ATCGGGTAAATGGGGGGCGGAGATATGGGATCCGAGATCGAAATCAAGGAGGTGGCTTGG 

AACGTTTCTTACGGCGGAGATGGCGGCACAATCTTACAATGATGCGGCGGCTGAGTATCG 

AGCAAGACGTGGTAAAACAAACGGAGAAGGAATTAAACGGCGGTGGAGATGACTGAGAAG 

GACATGGTCGGTGATCATACACGGCGAGGTGGAAATGTTATATTTACTATTGAAAACTAA 

ATTATTTATTATAGAGGGAGATATTACTCTTTACGCTT^ 

GTTTTAAAGTATTTTATTGTTATAAAAAAAAAAAAAAAAAAAAAA 

>G1839 Amino Acid Sequence (domain in AA coordinates: TBD) 

MLTPFCSSHHLQEKMNSCQSNPTKMDNSENVLFNDQNENFTLVAPHPSSSYLTRDQEHEI 

MVSALRQVISNSGADDASSSNLIITSVPPPDAGPCPLCGVAGCYGCTLQRPHREVKKEKK

YKGVRKKPSGKWAAEIWDPRSKSRRWLGTFLTAEMAAQSYNDAAAEYRARRGKTNGEGIK
RRWR*

>G2108 (35..694)

GAGAGAGAAACATTGATCTCTGAATATTGTGAACATGTTGAAATCAAGTAACAAGAGAAA 
AAGCAAAGAAGAGAAGAAGTTACAAGAAGGGAAGTACCTTGGAGTGAGGAGACGTCCATG 
GGGAAGATATGCAGCTGAAATCAGAAACCCTTTTACTAAAGAAAGACATTGGCTTGGAAC 
GTTTGATACAGCCGAAGAAGCTGCTTTTGCATATGACGTTGCTGCTCGATCCATCAGCGG
CTCTCTAGCTACAACAAACTTCTTCTACACTGAAAACACCTCTTTAGAAAGACATCCACA 
ACAGTCTTTGGAGCCTCATATGACTTGGGGATCTTCTAGTCTCTGTCTTCTTCAAGATCA 
GCCTTTTGAAAACAACCATTTTGTTGCTGATCCTATCTCTTCTTCTTTTTCTCAAAAACA 
AGAGTCTTCTACCAATCTCACTAACACTTTCTCACATTGTTATAATGATGGTGATCATGT 
TGGCCAAAGCAAAGAGATTTCTTTACCTAATGATATGTCAAACAGTTTATTCGGTCATCA 
GGACAAAGTCGGTGAACATGACAATGCAGACCATATGAAGTTTGGCTCAGTTCTCAGCGA 
CGAACCTCTCTGCTTTGAGTATGACTACATTGGGAATTATCTTCAGAGTTTTCTCAAAGA 
TGTCAACGACGATGCTCCACAGTTTCTTATGTGAGCTTGTATTACCGATCCTTCAATTTA 
TG 

>G2108 Amino Acid Sequence (domain in AA coordinates: 18-85) 

MLKSSNKRKSKEEKKLQEGKYLGVRRRPWGRYAAEIRNPFTKERHWLGTFDTAEEAAFAY 

DVAARSISGSLATTNFFYTENTSLERHPQQSLEPHMTWGSSSLCLLQDQPFENNHFVADP 

ISSSFSQKQESSTNLTNTFSHCYNDGDHVGQSKEISLPNDMSNSLFGHQDKVGEHDNADH

MKFGSVLSDEPLCFEYDYIGNYLQSFLKDVNDDAPQFLM* 

>G2291 (27..797)

GCTTTCTCACCTTTATAAAATAGAAAATGGAAAACAGCTACACCGTTGATGGTCACCGTC 

TTCAATATTCCGTTCCGTTAAGCTCCATGCATGAAACCAGTCAAAACTCCGAAACTTACG 

GATTATCCAAAGAGTCGCCGTTGGTCTGCATGCCCTTGTTCGAAACCAACACTACTTCAT 

TCGATATCTCTTCTCTTTTCTCGTTTAACCCAAAACCAGAACCCGAAAACACGCATCGTG 

TCATGGACGATTCCATCGCCGCCGTCGTGGGCGAAAACGTTCTTTTCGGTGATAAAAACA 

AAGTCTCTGATCACTTGACCAAAGAAGGTGGTGTGAAGCGGGGGCGGAAGATGCCGCAGA 

AGACCGGAGGATTCATGGGAGTGAGAAAACGGCCGTGGGGGAGATGGTCGGCGGAGATAA 

GAGACAGGATAGGGCGGTGCAGACACTGGTTAGGAACGTTCGACACGGCGGAAGAGGCAG 

CGCGTGCGTATGACGCGGCGGCGAGGAGGCTTAGAGGGACCAAAGCCAAGACCAATTTCG 

TGATTCCTCCGCTTTTTCCCAAGGAAATAGCTCAGGCTCAGGAGGATAATAGGATGAGGC 

AGAAGCAGAAGAAGAAGAAGAAGAAAAAAGTGAGTGTGAGGAAGTGTGTTAAAGTCACAT 

CGGTTGCACAGTTGTTCGATGATGCCAATTTTATAAATTCTTCTAGTATTAAAGGAAATG 

TGATTAGTTCTATTGATAATCTTGAAAAAATGGGTCTAGAGCTTGATTTGAGTTTAGGGT 

TGTTGTCTAGGAAGTGATAAAGCACTCGTAGTTAAGTAGTTGTAGTT 

>G2291 Amino Acid Sequence (domain in AA coordinates: TBD) 

MENSYTVDGHRLQYSVPLSSMHETSQNSETYGLSKESPLVCMPLFETNTTSFDISSLFSF 

NPKPEPENTHRVMDDSIAAVVGENVLFGDKNKVSDHLTKEGGVKRGRKMPQKTGGFMGVR

KRPWGRWSAEIRDRIGRCRHWLGTFDTAEEAARAYDAAARRLRGTKAKTNFVIPPLFPKE
IAQAQEDNRMRQKQKKKKKKKVSVRKCVKVTSVAQLFDDANFINSSSIKGNVISSIDNLE
KMGLELDLSLGLLSRK*
>G2452 (1..804) 

ATGTCATCGTCGACGATGTACAGAGGAGTTAATATGTTTTCACCGGCAAACACAAACTGG 
ATTTTTCAAGAAGTCAGAGAAGCCACGTGGACGGCGGAGGAGAACAAACGGTTCGAGAAA 
GCTCTCGCTTATCTGGACGACAAAGACAATCTTGAGAGCTGGTCCAAGATCGCAGATTTG 
ATTCCCGGCAAAACAGTAGCTGACGTCATTAAACGATACAAGGAGCTAGAGGATGATGTC 
AGCGACATCGAAGCCGGACTTATCCCCATTCCGGGATACGGCGGCGACGCCTCCTCCGCT 
GCAAACTIGTGACTA^TTCTTTGGTCTAGAAAACTCCAGCTACGGTTATGATTACGTCGTT 
GGAGGAAAGAGGAGTTCGCCGGCGATGACK^TTGTTTTAGGTCTCCGATGCCGGAAAAG 
GAGAGGAAGAAAGGAGTTCCGTGGACCGAGGACGAACACCTACGATTTCTGATGGGTTTG 
AAGAAATATGGAAAAGGAGATTGGAGAAACATAGCAAAAAGCTTTGTGACGACTCGAACG 
CCGACGCAAGTCGCTTCACACGCTCAGAAATATTTTCTTCGACAACTCACAGATGGTAAA 
GACAAAAGACGATCAAGTATTCACGATATCACCACTGTTAACATCCCTGACGCAGACGCA 
TCCGCAACCGCCACGACCGCTGACGTAGCACTCTCTCCTACTCCAGCCAATTCn^TTGAC 
GTTTTCCTTCAGCCTy^TCCTCATTACAGTTTCGCGTCTGCGTCTGCGTCTAGCTATTAT 
AATGCGTTTCCGCAGTGGAGTTAA 

>G2452 Amino Acid Sequence (conserved domain in AA coordinates: 27-213)






MSSSTMYRGVNMFSPANTNWIFQEVREATWTAEENKRFEKALAYLDDKDNLESWSKIADL 
IPGKTVADVIKRYKELEDDVSDIEAGLIPIPGYGGDASSAANSDYFFGLENSSYGYDYVV
GGKRSSPAMTDCFRSPMPEKERKKGVPWTEDEHLRFLMGLKKYGKGDWRNIAKSFVTTRT
PTQVASHAQKYFLRQLTDGKDKRRSSIHDITTVNIPDADASATATTADVALSPTPANSFD
VFLQPNPHYSFASASASSYYNAFPQWS*
>G2509 (143..934)

ATATATTCCCTCTTTCATTCTCCTTCTTCGTCTTTTCTTTGTTTCTCATATTCAAGACAT 
CCTCAATTCCAAATCTTAAACCCTAAATTTACAGACACAATCGAGATCACCTGAAAAAAG 
AGGTTTAAAGATTTTAGCAAAGATGGCGAATTCAGGAAATTATGGAAAGAGGCCCTTTCG 
AGGCGATGAATCGGATGAAAAGAAAGAAGCCGATGATGATGAGAACATATTCCCTTTCTT 
CTCTGCCCGATCCCAATATGACATGCGTGCCATGGTCTCAGCCTTGACTCAAGTCATTGG 
AAACCAAAGCAGCTCTCATGATAATAACCAACATCAACCTGTTGTGTATAATCAACAAGA 
TCCTAACCCACCGGCTCCTCCAACTCAAGATCAAGGGCTATTGAGGAAGAGGCACTATAG 
AGGGGTAAGACAACGACCATGGGGAAAGTGGGCAGCTGAAATTCGGGATCCGCAAAAGGC 
AGCACGGGTGTGGCTCGGGACATTTGAGACTGCTGAAGCTGCGGCTTTAGCTTATGATAA 
CGCAGCTCTTAAGTTCAAAGGAAGCAAAGCCAAACTCAATTTCCCTGAGAGAGCTCAACT 
AGCAAGTAACACTAGTACAACTACCGGTCCACCAAACTATTATTCTTCTAATAATCAAAT 
TTACTACTCAAATCCGCAGACTAATCCGCAAACCATACCTTATTTTAACCAATACTACTA 
TAACCAATATCTTCATCAAGGGGGGAATAGTAACGATGCATTAAGTTATAGCTTGGCCGG 
TGGAGAAACCGGAGGCTCAATGTATAATCATCAGACGTTATCTACTACAAATTCTTCATC 
TTCTGGTGGATCTTCAAGGCAACAAGATGATGAACAAGATTACGCCAGATATTTGCGTTT 
TGGGGATTCTTCACCTCCTAATTCTGGTTTTTGAGATCTTCAATAAACTGATAATAAAGG 
ATTTGGGTCACTTGTTATGAGGGGATCATATGTTTTCTAA 

>G2509 Amino Acid Sequence (domain in AA coordinates: 89-156)
MANSGNYGKRPFRGDESDEKKEADDDENIFPFFSARSQYDMRAMVSALTQVIGNQSSSHD
NNQHQPVVYNQQDPNPPAPPTQDQGLLRKRHYRGVRQRPWGKWAAEIRDPQKAARVWLGT
FETAEAAALAYDNAALKFKGSKAKLNFPERAQLASNTSTTTGPPNYYSSNNQIYYSNPQT

NPQTIPYFNQYYYNQYLHQGGNSNDALSYSLAGGETGGSMYNHQTLSTTNSSSSGGSSRQ

QDDEQDYARYLRFGDSSPPNSGF*

>G390 (1..2526) 

ATGATGGCTCATCACTCCATGGACGATAGAGACTCTCCTGATAAAGGATTTGATTCCGGC 

AAGTACGTTAGATACACGCCGGAACAAGTTGAAGCTCTTGAGAGAGTTTATGCTGAGTGT 

CCTAAACCTAGCTCTCTGAGAAGACAACAGCTTATTCGTGAATGTCCCATTCTCTGTAAC 

ATCGAGCCTCGACAGATCAAAGTTTGGTTCCAGAATCGCAGATGTCGAGAGAAGCAGAGG 

AAAGAGTCAGCTCGTCTTCAGACAGTGAACAGGAAGCTGAGTGCTATGAACAAGCTTTTG 

ATGGAAGAGAATGATCGTTTGCAGAAGCAAGTCTCCAACTTGGTTTATGAGAATGGATTC 

ATGAAACATCGAATCCACACTGCTTCTGGGACGACCACAGACAACAGCTGTGAGTCTGTG

GTCGTGAGTGGTCAGCAACGTCAGCAGCAAAACCC7UVCACATCAGCATCCTCAGCGTGAT 

GTTAACAACCCAGCTAATCTTCTCTCGATTGCGGAGGAGACCTTGGCGGAGTTCCTTTGC 

AAGGCTACAGGAACTGCTGTCGACTGGGTCCAGATGATTGGGATGAAGCCTGGTCCGGAT 

TCTATTGGTATCGTAGCTGTTTCACGCAACTGCAGTGGAATAGCAGCACGTGCCTGTGGC 

CTCGTGAGTTTAGAACCCATGAAGGTCGCTGAAATCCTCAAAGATCGTCCATCTTGGTTC 

CGTGACTGTCGATGTGTCGAGACTCTGAATGTTATACCCACTGGAAATGGTGGTACTATC 

GAGCTTGTCAACACTCAGATTTATGCTCCTACAACATTAGCAGCAGCTCGTGACTTTTGG

ACGCTGAGATATAGTACAAGTCTAGAAGATGGAAGCTATGTGGTCTGTGAGAGATCACTC 

ACTTCTGCAACTGGTGGCCCCAATGGTCCACTTTCTTCAAGCTTCGTGAGAGCCAAAATG 

CTGTCAAGCGGGTTTCTTATCCGTCCTTGTGATGGTGGTGGTTCCATTATTCACATC 

GATCATGTGGACTTGGATGTCTCAAGTGTTCCTGAAGTCCT 

TCCAAAATCCTTGCTCAAAAAATGACTGTCGCTGCTCTGAGACATGTGCGCCAAATTGCT 

CAAGAGACTAGTGGAGAAGTCCAGTATAGTGGTGGACGCCAGCCTGCAGTTTTAAGGACT 

TTCAGCCAGAGACTCTGCCGGGGTTTCAATGATGCTGTAAATGGTTTTGTCGATGATGGA 

TGGTCTCCAATGAGTAGTGATGGAGGAGAGGATATTACGATCATGATTAACTCTTCCTCT 

GCTAAATTTGCTGGCTCCCAATACGGTAGCTCAT 

CTCTGTGCCAAAGCTTCTATGCTGTTGCAGAATC 

CTGAGAGAACACCGAGCTGAATGGGCAGACTATGGTGTCGATGCCTATTCTGCTGCATCT
CTCAGAGC^CTCCATATGCTGTTCCATGCXSTCAGAACCGGTGGGTTCCCGAGTAACC^A 
GTCATTCTTCCTCTCGCACAGACACTCGAACATGAAGAGTTTCTCGAAGTGGTTAGACTT 
GGAGGTCATGCTTACTCACCTGAAGACATGGGCTTATCCCGGGATATGTATTTACTGCAG
CTTTGTAGCGGCGTTGATGTUVAATGTGGTTGGAGGTTGTGCTCAGCTTGTCTTTGCCCCA 
ATCGATGAATCATTTGCTGATGATGCACCTTTGCTTCCTTCTGGTTTCCGTGTCATACCA 
CTCGACCAAAAAACAAATCCGAATGATCATCAATCTGCAAGTCGAACACGGGATCTAGCA 
TCGTCCCTAGATGGTTCCACCAAAACCGATTCGGAAACAAACTCTAGATTGGTCTTAACA 
ATAGCCTTCCAGTTCACGTTTGATAACCATTCCAGAGACAATGTTGCTACAATGGCGAGA 
CAGTATGTGAGGAACGTTGTTGGTTCGATTCAGAGAGTGGCTCTAGCCATTACGCCTCGT 
CCTGGCTCAATGCAACTTCCCACTTCCCCTGAAGCTCTCACTCTTGTCCGTTGGATCACC 
CGTAGTTACAGTATTCATACAGGTGCAGATCTGTTTGGAGCTGATTCTCAGTCCTGTGGA 
GGAGACACATTGCTTAAGCAACTCTGGGACCATAGTGATGCCATATTGTGCTGCTCCCTG 
AAAACTAATGCCTCACCGGTATTCACATTTGCAAACCAAGCTGGTTTAGACATGCTTGAA 
ACTACACTTGTGGCACTTCAGGATATAATGCTCGACAAAACACTTGATGACTCTGGTCGT 
AGAGCTCTTTGCTCCGAGTTCGCCAAGATCATGCAGCAGGGATATGCGAATCTTCCGGCA 
GGAATATGTGTGTCGAGCATGGGCAGACCGGTTTCGTATGAGCAAGCGACGGTGTGGAAA 
GTTGTTGATGACAACGAATCAAACCACTGCTTGGCTTTTACCCTCGTTAGTTGGTCGTTT 
GTTTGA 

>G390 Amino Acid Sequence (domain in AA coordinates: 18-81) 

MMAHHSMDDRDSPDKGFDSGKYVRYTPEQVEALERVYAECPKPSSLRRQQLIRECPILCN 

IEPRQIKVWFQNRRCREKQRKESARLQTVNRKLSAMNKLLMEENDRLQKQVSNLVYENGF 

MKHRIHTASGTTTDNSCESVVVSGQQRQQQNPTHQHPQRDVNNPANLLSIAEETLAEFLC

KATGTAVDWVQMIGMKPGPDSIGIVAVSRNCSGIAARACGLVSLEPMKVAEILKDRPSWF

RDCRCVETLNVIPTGNGGTIELVNTQIYAPTTLAAARDFWTLRYSTSLEDGSYVVCERSL

TSATGGPNGPLSSSFVRAKMLSSGFLIRPCDGGGSIIHIVDHVDLDVSSVPEVLRPLYES

SKILAQKMTVAALRHVRQIAQETSGEVQYSGGRQPAVLRTFSQRLCRGFNDAVNGFVDDG

WSPMSSDGGEDITIMINSSSAKFAGSQYGSSFLPSFGSGVLCAKASMLLQNVPPLVLIRF

LREHRAEWADYGVDAYSAASLRATPYAVPCVRTGGFPSNQVILPLAQTLEHEEFLEVVRL

GGHAYSPEDMGLSRDMYLLQLCSGVDENVVGGCAQLVFAPIDESFADDAPLLPSGFRVIP

LDQKTNPNDHQSASRTRDLASSLDGSTKTDSETNSRLVLTIAFQFTFDNHSRDNVATMAR

QYVRNVVGSIQRVALAITPRPGSMQLPTSPEALTLVRWITRSYSIHTGADLFGADSQSCG

GDTLLKQLWDHSDAILCCSLKTNASPVFTFANQAGLDMLETTLVALQDIMLDKTLDDSGR

RALCSEFAKIMQQGYANLPAGICVSSMGRPVSYEQATVWKVVDDNESNHCLAFTLVSWSF

V*

>G391 (1..2559) 

ATGATGATGGTCCATTCGATGAGCAGAGATATGATGAACAGAGAGTCGCCGGATAAAGGG 
TTAGATTCCGGCAAGTATGTGAGGTACACGCCGGAGCAAGTGGAAGCTCTCGAGAGAGTT 
TACACTGAGTGTCCTAAGCCAAGTTCTCTAAGAAGACAACAACTCATACGTGAATGTCCG 
ATTCTCTCTAACATCGAGCCTAAGCAGATCAAAGTTTGGTTTCAGAACCGCAGATGTCGT 
GAGAAGCAGAGGAAAGAAGCTGCTCGTCTTCAAACAGTGAACAGAAAACTCAATGCCATG 
AACAAACTCTTGATGGAAGAGAATGATCGTTTGCAGAAGCAAGTTTCTAACTTGGTCTAT 
GAGAATGGCCACATGAAACATC^CTTCAC^CTC 

TGTGAGTCTGTGGTCGTGAGTGGTCAGCAACATCAACAGCAAAACCCAAATCCT 

CAGCAACGTGATGCTAACAACCCAGCAGGACTCCTTTCTATAGCAGAGGAGGCCCTAGCA 

GAGTTCCTTTCCAAGGCTACAGGAACTGCTGTTGACTGGGTTCAGATGATTGGGATGAAG 

CCTGGTCCGGATTCTATTGGCATAGTCGCTATTTCGCGCAACTGCAGCGGAATTGCAGCA 

CGTGCCTGCGGCCTCGTGAGTTTAGAACCCATGAAGGTTGCTGAAATTCTCAAAGATCGT 

CCATCTTGGCTCCGAGATTGTCGAAGTGTGGATACTCTGAGTGTGATACCTGCTGGAAAC 

GGTGGGACGATCGAGCTTATTTACACGCAGATGTATGCTCCTACGACTTTAGCAGCAGCT 

CGTGACTTTTGGACGCTGAGATATAGCACATGTTTGGAAGATGGAAGCTATGTGGTTTGT 

GAAAGGTCGCTTACTTCTGCAACTGGTGGCCCCACTGGGCCACCTTCTTCAAACTTTGTG 

AGAGCTG7^TGAAAGC^GCGGGTTTCTCATCCGTCCTTGCGATGGTGGTGGTTCCATT 

CTCCACATTGTTGATCATGTTGATCTGGATGCCTGGAGTGTCCCTGAAGTCATGAGGCCT 

CTCTATGAATCATCGAAGATTCTTGCTCAGAAAATGACTGTTGCTGCTTTGAGACATGTA 

AGACAAATTGC^CAAGAAACAAGTGGAGAAGTTCAGTATGGTGGAGGGCGCCAACCTGCG 

GTTTTAAGAACCTTCAGTCAAAGACTC^^ 

GTGGATGATGGATGGTCACCAATGGGTAGCGATGGTGCAGAGGATGTTACTGTAATGATA 
AACTTGTCCCCTGGGAAGTTTGGTGGGTCTCAGTACGGTAATTC^TTCCTTCCAAGCTTT 
GGTAGTGGCGTGCTTTGTGCCAAGGCATCTATGTTGCTTCAGAACGTTCCACCCGCTGTG 
CTGGTTCGATTCCTTAGAGAACACCGATCTGAATGGGCTGATTATGGCGTGGATGCTTAT
GCTGCTGCATCGCTCAGAGCAAGTCCTTTTGCTGTTCCTTGTGCTAGAGCTGGGGGGTTC 
CCAAGTAACCAAGTCATTCTTCCTCTTGCGCAGACAGTTGAACATGAAGAGTCACTTGAG 
GTGGTTAGACTTGAAGGTCACGCTTACTCACCCGAAGACATGGGTTTAGCTCGGGATATG 
TATTTGCTACAGCTTTGTAGCGGTGTTGATGAAAATGTGGTTGGAGGTTGTGCACAGCTT 
GTATTTGCCCCTATCGATGAATCATTTGCTGATGATGCACCTTTGCTTCCTTCCGGTTTC 
CGCATCATACCTCTTGAACAGAAATCTACTCCGAACGGTGCATCTGCAAACCGTACCCTG 
GATTTAGCCTCAGCTTTAGAAGGATCCACACGTCAAGCTGGTGAAGCCGACCCAAATGGC 
TGTAACTTTAGGTCGGTACTAACCATAGCATTCCAGTTCACATTTGATAACCATTCAAGA 
GACAGTGTTGCTTCAATGGCACGTCAGTACGTGCGAAGCATAGTAGGATCGATTCAGAGG 
GTTGCTCTAGCCATTGCTCCTCGTCCTGGCTCCAATATCAGTCCAATATCTGTTCCCACT 
TCCCCTGAAGCTCTCACTCTGGTCCGTTGGATCTCCCGGAGTTACAGCCTTCACACTGGT 
GCAGATCTCTTTGGATCTGATTCTCAAACCAGTGGTGACACGTTGCTGCATCAACTCTGG 
AATCACTCTGATGCAATCTTGTGCTGCTCCCTCAAAACAAACGCTTCACCGGTTTTCACA 
TTCGCAAACCAAACCGGTTTAGACATGCTGGAAACGACTCTTGTAGCCCTTCAAGACATA 
ATGCTAGACAAGACCCTTGACGAACCTGGTCGTAAAGCTCTTTGCTCTGAGTTCCCCAAG 
ATCATGCAACAGGGCTATGCTCATCTGCCGGCAGGAGTATGTGCGTCAAGCATGGGAAGG 
ATGGTATCTTACGAGCAGGCAACGGTGTGGAAAGTTCTTGAAGACGATGAATCAAACCAC 
TGCTTAGCTTTCATGTTCGTGAATTGGTCGTTCGTTTGA 

>G391 Amino Acid Sequence (domain in AA coordinates: 25-85) 
MMMVHSMSRDMMNRESPDKGLDSGKYVRYTPEQVEALERVYTECPKPSSLRRQQLIRECP

ILSNIEPKQIKVWFQNRRCREKQRKEAARLQTVNRKLNAMNKLLMEENDRLQKQVSNLVY

ENGHMKHQLHTASGTTTDNSCESVVVSGQQHQQQNPNPQHQQRDANNPAGLLSIAEEALA

EFLSKATGTAVDWVQMIGMKPGPDSIGIVAISRNCSGIAARACGLVSLEPMKVAEILKDR

PSWLRDCRSVDTLSVIPAGNGGTIELIYTQMYAPTTLAAARDFWTLRYSTCLEDGSYVVC

ERSLTSATGGPTGPPSSNFVRAEMKPSGFLIRPCDGGGSILHIVDHVDLDAWSVPEVMRP

LYESSKILAQKMTVAALRHVRQIAQETSGEVQYGGGRQPAVLRTFSQRLCRGFNDAVNGF

VDDGWSPMGSDGAEDVTVMINLSPGKFGGSQYGNSFLPSFGSGVLCAKASMLLQNVPPAV

LVRFLREHRSEWADYGVDAYAAASLRASPFAVPCARAGGFPSNQVILPLAQTVEHEESLE

VVRLEGHAYSPEDMGLARDMYLLQLCSGVDENVVGGCAQLVFAPIDESFADDAPLLPSGF

RIIPLEQKSTPNGASANRTLDLASALEGSTRQAGEADPNGCNFRSVLTIAFQFTFDNHSR

DSVASMARQYVRSIVGSIQRVALAIAPRPGSNISPISVPTSPEALTLVRWISRSYSLHTG

ADLFGSDSQTSGDTLLHQLWNHSDAILCCSLKTNASPVFTFANQTGLDMLETTLVALQDI

MLDKTLDEPGRKALCSEFPKIMQQGYAHLPAGVCASSMGRMVSYEQATVWKVLEDDESNH

CLAFMFVNWSFV*

>G438 (188..2716)

CGGGGTACCCAAGCCACGACCGTAGAATCTTCTTTTGTCTGAAAAGAATTACAATTTACG 

TTTCTCTTACGATACGACGGACTTTCCGAAGAAATTAATTTAAAGAGAAAAGAAGAAGAA 

GCCAAAGAAGAAGAAGAAGCTAGAAGAAACAGTAAAGTTTGAGACTTTTTTTGAGGGTCG 

AGCTAAAATGGAGATGGCGGTGGCTAACCACCGTGAGAGAAGCAGTGACAGTATGAATAG 

ACATTTAGATAGTAGCGGTAAGTACGTTAGGTACACAGCTGAGCAAGTCGAGGCTCTTGA

GCGTGTCTACGCTGAGTGTCCTAAGCCTAGCTCTCTCCGTCGACAACAATTGATCCGTGA 

ATGTTCCATTTTGGCCAATATTGAGCCTAAGCAGATCAAAGTCTGGTTTCAGAACCGCAG 

GTGTCGAGATAAGCAGAGGAAAGAGGCGTCGAGGCTCCAGAGCGTAAACCGGAAGCTCTC 

TGCGATGAATAAACTGTTGATGGAGGAGAATGATAGGTTGCAGAAGCAGGTTTCTCAGCT 

TGTCTGCGAAAATGGATATATGAAACAGCAGCTAACTACTGTTGTTAACGATCCAAGCTG 

TGAATCTGTGGTCACAACTCCTCAGCATTCGCTTAGAGATGCGAATAGTCCTGCTGGATT 

GCTCTCAATCGCAGAGGAGACTTTGGCAGAGTTCCTATCCAAGGCTACAGGAACTGCT 

TGATTGGGTTCAGATGCCTGGGATGAAGCCTGGTCCGGATTCGGTTGGCATCTTTGCCAT 

TTCGCAAAGATGCAATGGAGTGGCAGCTCGAGCCTGTGGTCTTGTTAGCTTAGAACCTAT 

GAAGATTGCAGAGATCCTCAAAGATCGGCCATCTTGGTTCCGTGACTGTAGGAGCCTTGA 
AGTTTTCACTATGTTCCC^^ 

GTATGCACCAACGACTCTGGCTCCTGCCCGCX^ 

CCTCGACAATGGGAGTTTTGTGGTTTGTGAGAGGTCGCTATCTGGCTCTGGAGCTGGGCC 
TAATGCTGCTTCAGCTTCTCAGTTTGTGAGAGCAGAAATGCTTTCTAGTGGGTATTTAAT
AAGGCCTTGTGATGGTGGTGGTTCTATTATTCACATTC 
TTGGAGTGTTCCGGATGTGCTTCGACCCCTITATGAGT 
AATGACCATTTCCGCGTTGCGGTATATCAGGCAATTAGCCCAAGAGTCTAATGGTGAAGT
AGTGTATGGATTAGGAAGGCAGCCTGCTGTTCTTAGAACCTTTAGCCAAAGATTAAGCAG 
GGGCTTCAATGATGCGGTTAATGGGTTTGGTGACGACGGGTGGTCTACGATGCATTGTGA 
TGGAGCGGAAGATATTATCGTTGCTATTAACTCTACAAAGCATTTGAATAATATTTCTAA 
TTCTCTTTCGTTCCTTGGAGGCGTGCTCTGTGCCAAGGCTTCAATGCTTCTCCAAAATGT 
TCCTCCTGCGGTTTTGATCCGGTTCCTTAGAGAGCATCGATCTGAGTGGGCTGATTTCAA 
TGTTGATGCATATTCCGCTGCTACACTTAAAGCTGGTAGCTTTGCTTATCCGGGAATGAG 
ACCAACAAGATTCACTGGGAGTCAGATCATAATGCCACTAGGACATACAATTGAACACGA 
AGAAATGCTAGAAGTTGTTAGACTGGAAGGTCATTCTCTTGCTCAAGAAGATGCATTTAT 
GTCACGGGATGTCCATCTCCTTCAGATTTGTACCGGGATTGACGAGAATGCCGTTGGAGC 
TTGTTCTGAACTGATATTTGCTCCGATTAATGAGATGTTCCCGGATGATGCTCCACTTGT 
TCCCTCTGGATTCCGAGTCATACCCGTTGATGCTAAAACGGGAGATGTACAAGATCTGTT 
AACCGCTAATCACCGTACACTAGACTTAACTTCTAGCCTTGAAGTCGGTCCATCACCTGA 
GAATGCTTCTGGAAACTCTTTTTCTAGCTCAAGCTCGAGATGTATTCTCACTATCGCGTT 
TCAATTCCCTTTTGAAAACAACTTGCAAGAAAATGTTGCTGGTATGGCTTGTCAGTATGT 
GAGGAGCGTGATCTCATCAGTTCAACGTGTTGCAATGGCGATCTCACCGTCTGGGATAAG 
CCCGAGTCTGGGCTCCAAATTGTCCCCAGGATCTCCTGAAGCTGTTACTCTTGCTCAGTG 
GATCTCTCAAAGTTACAGTCATCACTTAGGCTCGGAGTTGCTGACGATTGATTCACTTGG 
AAGCGACGACTCGGTACTAAAACTTCTATGGGATCACCAAGATGCCATCCTGTGTTGCTC 
ATTAAAGCCACAGCCAGTGTTCATGTTTGCGAACCAAGCTGGTCTAGACATGCTAGAGAC 
AACACTTGTAGCCTTACAAGATATAACACTCGAAAAGATATTCGATGAATCGGGTCGTAA 
GGCTATCTGTTCGGACTTCGCCAAGCTAATGCAACAGGGATTTGCTTGCTTGCCTTCAGG 
AATCTGTGTGTCAACGATGGGAAGACATGTGAGTTATGAACAAGCTGTTGCTTGGAAAGT 
GTTTGCTGCATCTGAAGAAAACAACAACAATCTGCATTGTCTTGCCTTCTCCTTTGTAAA 
CTGGTCTTTTGTGTGATTCGATTGACAGAAAAAGACTAATTTAAATTTACGTTAGAGAAC 
TCAAATTTTTGGTTGTTGTTTAGGTGTCTCTGTTTTGTTTTTTAAAATTATTTTGATCAA 
A 

>G438 Amino Acid Sequence (domain in AA coordinates: 22-85)

MEMAVANHRERSSDSMNRHLDSSGKYVRYTAEQVEALERVYAECPKPSSLRRQQLIRECS

ILANIEPKQIKVWFQNRRCRDKQRKEASRLQSVNRKLSAMNKLLMEENDRLQKQVSQLVC

ENGYMKQQLTTVVNDPSCESVVTTPQHSLRDANSPAGLLSIAEETLAEFLSKATGTAVDW
VQMPGMKPGPDSVGIFAISQRCNGVAARACGLVSLEPMKIAEILKDRPSWFRDCRSLEVF
TMFPAGNGGTIELVYMQTYAPTTLAPARDFWTLRYTTSLDNGSFVVCERSLSGSGAGPNA
ASASQFVRAEMLSSGYLIRPCDGGGSIIHIVDHIJ^EAWSVPDVLRPLYESSKVVAQKMT 
ISALRYIRQLAQESNGEVVYGLGRQPAVLRTFSQRLSRGFNDAVNGFGDDGWSTMHCDGA 
EDIIVAINSTKHLNNISNSLSFLGGVLCAKASMLLQNVPPAVLIRFLREHRSEWADFNVD

AYSAATLKAGSFAYPGMRPTRFTGSQIIMPLGHTIEHEEMLEVVRLEGHSLAQEDAFMSR

DVHLLQICTGIDENAVGACSELIFAPINEMFPDDAPLVPSGFRVIPVDAKTGDVQDLLTA

NHRTLDLTSSLEVGPSPENASGNSFSSSSSRCILTIAFQFPFENNLQENVAGMACQYVRS

VISSVQRVAMAISPSGISPSLGSKLSPGSPEAVTLAQWISQSYSHHLGSELLTIDSLGSD

DSVLKLLWDHQDAILCCSLKPQPVFMFANQAGLDMLETTLVALQDITLEKIFDESGRKAI

CSDFAKLMQQGFACLPSGICVSTMGRHVSYEQAVAWKVFAASEENNNNLHCLAFSFVNWS

FV*

>G47 (38..472)

CTTCTTCTTCACATCGATCATCATACAACAACAAAAAATGGATTACAGAGAATCCACCGG 
TGAAAGTCAGTCAAAGTACAAAGGAATCCGTCGTCGGAAATGGGGCAAATGGGTATCAGA
GATTAGAGTTCCGGGAACTCGTGACCGTCTCTGGTTAGGTTCATTCTCAACAGCAGAAGG 
TGCCGCCGTAGCACACGACGTTGCTTTCTTCTGTTTACACCAACCTGATTCTTTAGAATC 
TCTCAATTTCCCTCATTTGCTTAATCCTTCACT 

CCAGCAAGCTGCTTCTAACGCCGGCATGGCCATTGACGCCGGAATCGTCCACAGTACCAG 

CGTGAACTCTGGATGCGGAGATACGACGACGTATTACGAGAATGGAGCTGATCAAGTGGA 

GCCGTTGAATATTTCAGTGTATGATTATCTGGGCGGCCACGATCACGTTTGATTTATCTC 

GACGGTCATGATC^CGTTTGATCTTCTTTTGAGTAAGATTTTGTACC^TAATCAAAAC^G 

GTGTGGTGCTAAAATCTTACTCAAAACAAGATTAGGTACCACAGAGAAACAATCAAATGG 

TTGTGAATATACATTATAAGGTTTTGATTAATGTTTGTTTCACTGATTTAGTG 

GTCCATTGTATACAAATCTATTCAAGAAACCTAGCGCGAGATCATGTTTCGTGATTGAAG 

ATTGAGATTTTTAAGTATTCGTAATATTTTTGTAAAATACAAATAAAAAAAAAAAAAAAA 
AAAAA

>G47 Amino Acid Sequence (domain in AA coordinates: 11-80) 
MDYRESTGESQSKYKGIRRRKWGKWVSEIRVPGTRDRLWLGSFSTAEGAAVAHDVAFFCL 

HQPDSLESLNFPHLLNPSLVSRTSPRSIQQAASNAGMAIDAGIVHSTSVNSGCGDTTTYY 
ENGADQVEPLNISVYDYLGGHDHV*
>G559 (89..1285)

aaagttgctagctttaatttgccaacttactattcttatgtgtaataatcgtttgcaggg 

tcgttgatttggtgataagtcagtagaaATGgataaggagaaatctccagcacctccttg 

tggaggtcttcctcctccatctccatcaggtcgatgctctgcattctcagaagctggtcc 

cattggtcatggttcagatgctaatcgaatgagtcatgatattagccgtatgcttgataa 

cccacctaagaagattggacatcggcgagctcattctgaaatacttactctccctgatga 

tttgagctttgatagtgatcttggtgtggttggtaatgctgctgatggagcttctttctc 

tgatgagactgaagaagatttgctctctatgtatcttgatatggataagtttaattcttc 

tgctacatcttctgcccaagttggtgagccatcaggaactgcttggaaaaatgagacaat 

gatgcagacaggcacaggctcaacttccaatcctcagaatacggttaatagtcttggcga 

aaggccaagaatcaggcatcaacatagccaatctatggatggttcaatgaatatcaatga 

gatgcttatgtcgggaaatgaagatgattctgctattgatgctaagaagtctatgtctgc 

tactaaacttgctgagcttgctctcattgatcctaaacgtgctaagaggatatgggcaaa 

caggcagtccgcagcacgatcaaaagaaaggaagacgagatacatatttgagcttgagag 

aaaagtacagactttgcaaacagaggctacaactctctcagcccagttgaccctcttaca 

gagagacacaaatggcttgactgttgaaaacaatgagctgaagctgcggttacaaacaat 

ggagcagcaggttcacttgcaggatgaactaaacgaagcactaaaggaggaaatccagca 

tctgaaggtgttgactggccaagttgctccatcagcgttgaactatgggtcgtttggatc 

aaaccagcagcaattctattccaacaatcagtcaatgcaaacaatcttagctgcaaaaca 

gttccagcaacttcagattcattcacagaagcagcaacaacaacaacaacaacaacaaca 

gcaacaccaacagcagcagcagcaacagcaacagtatcagtttcaacagcaacagatgca 

acagcttatgcagcagcggcttcaacagcaagaacaacaaaatggagtaagactcaagcc 

ttcacaagcccagaaagagaacTGAggaatatgaatatgtcccacgtaagtgagaggttc 

tccttctgaacaattcctttctcattcataaattgttgttcatccatcacttgcagtctc 
ttggattttagggttttagctaacaca 

>G559 Amino Acid Sequence (domain in AA coordinates: 203-264) 

MDKEKSPAPPCGGLPPPSPSGRCSAFSEAGPIGHGSDANRMSHDISRMLDNPPKKIGHRR 

AHSEILTLPDDLSFDSDLGVVGNAADGASFSDETEEDLLSMYLDMDKFNSSATSSAQVGE

PSGTAWKNETMMQTGTGSTSNPQNTVNSLGERPRIRHQHSQSMDGSMNINEMLMSGNEDD

SAIDAKKSMSATKLAELALIDPKRAKRIWANRQSAARSKERKTRYIFELERKVQTLQTEA

TTLSAQLTLLQRDTNGLTVENNELKLRLQTMEQQVHLQDELNEALKEEIQHLKVLTGQVA

PSALNYGSFGSNQQQFYSNNQSMQTILAAKQFQQLQIHSQKQQQQQQQQQQQHQQQQQQQ
QQYQFQQQQMQQLMQQRLQQQEQQNGVRLKPSQAQKEN*

>G568 (141..995)

GACCGGCTAAAGTCAAGAACCTCTCTCTGAGCTCTCACCACTTTCTCTCTCTACTCCCTC 

TCTGCGTGTAGGATACTACTAGACAATTGACAACCAAAGACTAAAGCTGTGTTGTTGGTT 

CACTTCTGTTCTCTTTTCCAATGTTGTCATCAGCTAAGCATCAGAGAAACCATAGA 

CTGCTACAAACAAGAACCAGACTCTCACCAAAGTTTCTTCCATTTCATCCTCATCACCAT 

CGTCTTCTTCTTCATCATCATCAACCTCATCATCATCTCCTTTACCTTCTCAAGACTCTC 

AAGCCt^GAAGAGATCTCTTGTCACCATGGAAGAAGTTTGGAATGACATCAACCTTGCTT 

CCATCCACCACCTAAACCGACACAGCCCTCATCCACAACACAACCACGAGCCAAGGTTCA 

GGGGCCAAAACCACCACAACCAAAACCCTAACTCAATCTTCCAAGATTTTCTCAAAGGAT 

CTTTGAACCAGGAACCAGCACCCACAAGCCAGACCACGGGTTCTGCGCCTAATGGCGATT 

CC^CCACGGTCACTGTTCTTTACAGCTCTCCTTTTCCACCTCCTGCAACTGTTCTGAGCT 

TGAATOCCGGCGCTGGCTTCGAGTTTCTCGATAACCAA 

CTAATCTTCATACCCACCATCACCTCTCAAACGCT(^TGCCTTCAACACCTCTTTCGAGG 
CTCTGGTTCCATCCAGTTCTTTTGGTAAGAAAAGAGGCCAAGATTCCAATGAAGGTTCAG 
GGAATAGAAGACATAAGCGTATGATCAAGAACAGAGAATCTGCAGCTCGTTCCCGCGCTA
GGAAACAGGCTTATACAAACGAGTTAGAACTTGAAGTTGCTCACTTGCAGGCAGAAAATG
CAAGACTGAAGAGACAACAAGATCAAAAAATGGCTGCAGCAATT(^ 

ACACACTTCAACGGTCTTCCACAGCTCCATTTTGAGAAATCTACAAGTCCTTGT^ 
TTTGGGGATTGAGATTGTCTC^TGAAGAAGTGAAAAAATGGC^W^GTTTGTACCCT^ 
TTATTAGCTATAAGTATAACTAAGCCTAAAATTGTAGAACTAAGATATTGTAGGGGAAAA
AAGAAGATGT7VAAACAAAAGACCCGGAAAGAGAAAAGGATCTTTCAATTTCCTAAGGCAC 
AGGAACACCTGTCCTGGGTCCTCTCTTAATGTTCTGTCGTTTTCCTATGCAAACCCTTTT 
TTCACTTCTGTACTAACTTATACTTGTATTCTTG 

>G568 Amino Acid Sequence (domain in AA coordinates: 215-265) 

MLSSAKHQRNHRLSATNKNQTLTKVSSISSSSPSSSSSSSSTSSSSPLPSQDSQAQKRSL 

VTMEEVWNDINLASIHHLNRHSPHPQHNHEPRFRGQNHHNQNPNSIFQDFLKGSLNQEPA 

PTSQTTGSAPNGDSTTVTVLYSSPFPPPATVLSLNSGAGFEFLDNQDPLVTSNSNLHTHH 

HLSNAHAFNTSFEALVPSSSFGKKRGQDSNEGSGNRRHKRMIKNRESAARSRARKQAYTN

ELELEVAHLQAENARLKRQQDQKMAAAIQQPKKNTLQRSSTAPF* 

>G580 (43..747)

CCAAAAAACAAAGCATTCTATGCTATTCTGTTCTGTTCTCCAATGTTGTCATCAGCAAAG 
CATAATAAGATCAACAACCATAGTGCCTTTTCAATTTCCTCTTCATCATCATCATTATCA 
ACATCATCCTCCCTAGGCCATAACAAATCTCAAGTCACCATGGAAGAAGTATGGAAAGAA 
ATCAACCTTGGTTCACTTCACTACCATCGGCAACTAAACATTGGTCATGAACCAATGTTA 
AAGAACCAAAACCCTAATAACTCCATCTTTCAAGATTTCCTCAACATGCCTCTGAATCAA 
CCACCACCACCACCACCACCACCTTCCTCTTCCACCATTGTCACTGCTCTCTATGGCTCT 
CTGCCTCTTCCGCCTCCTGCCACTGTCCTCAGCTTAAACTCCGGTGTTGGATTCGAGTTT 
CTTGATACCACAGAAAATCTTCTTGCTTCTAACCCTCGCTCCTTTGAGGAATCTGCAAAG 
TTTGGTTGTCTTGGTAAGAAAAGAGGCCAAGATTCTGATGATACTAGAGGAGACAGAAGG 
TATAAGCGTATGATCAAGAACAGAGAATCTGCTGCTCGTTCAAGGGCTAGGAAGCAGGCA 
TATACAAACGAACTTGAGCTTGAAATTGCTCACTTGCAGACAGAGAATGCAAGACTCAAG 
ATACAACAAGAGCAGCTGAAAATAGCCGAAGCAACTCAAAACCAAGTAAAGAAAACACTA 
CAACGGTCTTCCACAGCTCCATTTTGAGAAAAATCTACTATTTCTTTTTGGGGGAGTTTC 
AAGTGTTTCTTATGAAGATGAGAAAAACAGAAAAAGTTTGTACATTTTAGCTAAGTTAAA 
TTTGTGGTGGTAAGTAATGTAAAAGAAAAGTGTGTGTAGAAGAAAAGTGTCTAGAAAAAG 
AAAGCAACTAACTTTCTTCTTCTTCTCTGGTTTCCTATCAACTCTTTTGACTTTTGTACT 
TTTTTTCTTCTCTACTTAACCTCTATTATTGTAATGCCAAGTCAAGTCCTTATCTAGCTA 
GTACATGAGTTTCTGTTTTCACTGGTTAAGCCAT 

>G580 Amino Acid Sequence (domain in AA coordinates: 162-218)
MLSSAKHNKINNHSAFSISSSSSSLSTSSSLGHNKSQVTMEEVWKEINLGSLHYHRQLNI 
GHEPMLKNQNPNNSIFQDFLNMPLNQPPPPPPPPSSSTIVTALYGSLPLPPPATVLSLNS
GVGFEFLDTTENLLASNPRSFEESAKFGCLGKKRGQDSDDTRGDRRYKRMIKNRESAARS
RARKQAYTNELELEIAHLQTENARLKIQQEQLKIAEATQNQVKKTLQRSSTAPF*
>G615 (197..1252)

TTTTTTCTTTTCTTTCTTTTTTTGCTGGTGTGAGAAATTGTACGCTTACTATCTCTCTCT 
CTCTCTGCCAGATTCTCTCTTTTTGATGATGTGAAAGTTGTGCTTTTGTTTCTTAAGAAA 
AAGGCATATTTTTAATACTTGATTCTTGGTTCTTGATTCTTGATTCTTGGTTTTTTTTAG 
CTTCTTAAGTTCGGTGATGTCGTCTTCCACCAATGACTACAACGATGGTAATAACAATGG
AGTGTACCCTCTCTCTCTTTACCTTTCTTCACTCTCTGGCCATCAAGACATCATTCATAA 
TCCCTACAACCATCAGTTAAAAGCATCTCCG 

TCTGATCGATTACATGGCGTTTAAGTCAAATAATGTTGTGAATCAACAAGGCTTTGAGTT 
TCCTGAGGTGTCAAAGGAAATCAAGAAGGTGGTGAAGAAGGACCGACATAGCAAGATTCA 
AACGGCACAAGGGATTAGAGACAGGAGGGTTAGGCTTTTTATTGGGATTGCTCGCCAATT 
CTTTGATCTTCAGGATATGTTGGGGTTTGATAAAGCTAGTAAAACGTTAGACTGGCTGCT 
CAAGAAGTCAAGAAAAGCCATCAAAGAGGTCGTACAAGCAAAAAACCTCAACAATGATGA 
TGAAGATTTTGGAAACATTGGAGGCGATGTAGAACAAGAAGAGGAGAAGGAGGAGGATGA 
CAATGGCGATAAGAGCTTCGTGTATGGTTTGAGCCCCGGGTACGGTGAAGAAGAAGTGGT 
ATGTGAGGCCACGAAGGCAGGGATAAGAAAGAAGAAGAGTGAGTTGAGAAACATCTCATC 
AAAGGGGCTAGGAGCGAAAGCTAGAGGAAAAGCAAAGGAGCGAACAAAAGAGATGATGGC 
CTATGATAATCCAGAGACTGCCTCTGATATTACACAATCTGAAATCATGGACCCATTCAA 
GAGGTCTATAGTCTTCAATGAAGGAGAAGATATGACACACCT^ 

CGAGGAGTTTGATAATC^GAATCTATCTTAACCAATATGACTCTACCAACGAAGATGGG 

TCAAAGTTACAATCAAAATAATGGGATACTTATGTTGGTAGATCAGAGTTCTAGCAGCAA 

CTATAATACATTTCTGCCTCAAAATTTGGATTATAGTTATGATCAAAACCCTTTTCAT 

CCAAACCTTATATGTAGTCACCGACAAAAATTTCCCCAAAGGTTTCCTATAAATCTCGAC 

AGTTTTGAAGGACTATGCATGATCAAGTT^^ 
CTCTGAATGTATACAAAATCTATAGTTATGTATATCTGTTCCTTTTTAACGTATCTTTAT

TGATCTTCTGTGCCTTGATCAAAATTGTCATTTTAAGATTCAGTTTGTGTAATATTTTAG 

CTACAACTTTTAAGTGGTATTATTGTAACCTTTTGAACTATATATTTTGAAGATGAATAA 
GAACATGTTTATATAAAAA 

>G615 Amino Acid Sequence (domain in AA coordinates: 88-147)
MSSSTNDYNDGNNNGVYPLSLYLSSLSGHQDIIHNPYNHQLKASPGHMVSAVPESLIDYM
AFKSNNVVNQQGFEFPEVSKEIKKVVKKDRHSKIQTAQGIRDRRVRLFIGIARQFFDLQD

MLGFDKASKTLDWLLKKSRKAIKEVVQAKNLNNDDEDFGNIGGDVEQEEEKEEDDNGDKS
FVYGLSPGYGEEEVVCEATKAGIRKKKSELRNISSKGLGAKARGKAKERTKEMMAYDNPE
TASDITQSEIMDPFKRSIVFNEGEDMTHLFYKEPIEEFDNQESILTNMTLPTKMGQSYNQ
NNGILMLVDQSSSSNYNTFLPQNLDYSYDQNPFHDQTLYVVTDKNFPKGFL*
>G732 (73..588)

AAAAAAACCAAACATAAAACATAAAACTCTGTCCTTTTTTTGTCTTCTTGTAACTTTTCT 

TGTTAAAAATCAATGGCGTCATCTAGCAGCACATACCGGAGCTCAAGCTCTTCCGACGGT

GGTAATAATAACCCGTCGGACTCCGTCGTCACCGTCGACGAACGAAAACGTAAAAGAATG 

TTATCGAACAGAGAATCTGCACGTAGGTCAAGGATGCGTAAACAGAAACACGTTGATGAT 

CTAACGGCTCAGATCAATCAGCTATCAAACGACAACCGTCAGATCTTGAACAGCCTCACC 

GTAACATCTCAGCTTTACATGAAGATCCAAGCCGAGAACTCTGTTCTCACCGCTCAGATG 

GAGGAGCTTAGCACCAGACTCCAATCTCTCAACGAGATCGTTGATCTTGTTCAATCCAAC 

GGTGCAGGATTTGGTGTTGACCAGATCGACGGCTGTGGTTTTGATGATCGTACGGTTGGG 

ATCGACGGATATTACGATGATATGAATATGATGAGTAATGTTAATCATTGGGGTGGTTCG 

GTTTACACTAACCAACCCATTATGGCTAATGATATCAATATGTATTGATTAATAAAATTA 

ATTAAAATAATTAGATGCCCCTTTTTTGTCTTTTTATTTTAAAATTTAGCCCATTTTGGT 

GTTTTTGGGTTGGTGTGATGATGTAATTATAGTACATGCATCTTTGATTGGTTGGAAGGA 

TAAATATAAACTTTATATATATATTGGGGCATATATATATGAGTTGTACTTTGCATGTAT 

TGGTGTGTGTTTTGTTATAATTATATGATTATATATGTTTATGTTAAAAAAAAA 

>G732 Amino Acid Sequence (domain in AA coordinates: 31-91) 

MASSSSTYRSSSSSDGGNNNPSDSVVTVDERKRKRMLSNRESARRSRMRKQKHVDDLTAQ

INQLSNDNRQILNSLTVTSQLYMKIQAENSVLTAQMEELSTRLQSLNEIVDLVQSNGAGF

GVDQIDGCGFDDRTVGIDGYYDDMNMMSNVNHWGGSVYTNQPIMANDINMY*

>G988 (1..1338) 

ATGCTTACTTCCTTCAAATCCTCTAGCTCCTCCTCCGAAGATGCCACCGCTACCACCACC 

GAGAATCCTCCTCCTTTGTGCATCGCCTCCTCCTCGGCCGCAACCTCCGCCTCACATCAC 

CTCCGTCGTCTTCTTTTCACCGCTGCGAATTTCGTCTCCCAGTCAAACTTCACCGCCGCT

CAAAACTTACTCTCAATCCTCTCCCTTAACTCTTCTCCTCACGGCGACTCCACCGAGCGA 

CTTGTACACCTCTTCACTAAAGCCTTGTCCGTACGAATCAACCGTCAGCAACAAGATCAG 

ACGGCTGAAACGGTTGCCACGTGGACGACGAACGAAATGACGATGAGTAACTCCACGGTG 

TTCACGAGCAGTGTATGCAAAGAACAGTTCTTGITrCGAACCAAGAACAACAATTCTGAC 

TTCGAGTCTTGTTACTATCTTTGGCTAAACCAACTAACGCCGTTTATTCGGTTCGGTCAT 

TTAACGGCGAACCAAGCTATCCTCGACGCGACGGAGACAAACGATAACGGAGCTCTACAT 

ATACTTGATTTAGATATATCACAAGGACTTCAATGGCCTCCATTGATGCAAGCCCTAG(^ 

GAGAGGTCATCAAACCCTAGCAGTCCACCTCCATCTCTCCGCATAACCGGATGCGGTCGA 

GATGTAACCGGATTAAACCGAACTGGAGACCGGTTAACCCGGTTCGCTGACTCTTTAGGT 

CTCCAATTCCAGTTTCACACGCTAGTGATCGTAGAAGAAGATCTCGCCGGACTTTTGCTA 

CAGATCCGATTGTTAGCTCTCTCAGCCGTACAAGGAGAGACCATTGCCGTCT^TTGTGTT 

CACTTCCTCCACAAAATATTTAACGACGATGGAGATATGATCGGTCACTTCTTGTCAGCG 

ATCAAGAGCTTAAACTCTAGAATCGTTACAATGGCAGAGAGAGAAGCTAATCATGGAGAT 

CACTCGTTCTTGAATAGATTCTCTGAGGCAGTGGATCATTACATGGCGATCTTTGATTCG 

TTGGAAGCGACGTTGCCGCCAAATAGCCGAGAGAGACTAACCCTAGAGCAACGGTGGTTC 

GGTAAGGAGATTTTGGATGTTGTGGCGGCGGAAGAGACGGAGAGAAAGCAAAGACATCGG 

AGGTTTGAGATTTGGGAAGAGATGATGAAGAGGTTTGGTTTCGTTAACGTTCCTATTGGA 

AGCTTTGCTTTGTCTCAAGCTAAGCTTCTTCTTAGACTTCATTATCCTTCAGAAGGT^ 

AATCTTCAGTTCCTTAACAATTCTTTGTTTCTTGGCTGGCAAAATCGTCCCCTCTT 
GTTTCGTCGTGGAAATGA 

>G988 Amino Acid Sequence (domain in AA coordinates: 178-195)

MLTSFKSSSSSSEDATATTTENPPPLCIASSSAATSASHHLRRLLFTAANFVSQSNFTAA 
QNLLSILSLNSSPHGDSTERLVHLFTKALSVRINRQQQDQTAETVATWTTNEMTMSNSTV
FTSSVCKEQFLFRTKNNNSDFESCYYLWLNQLTPFIRFGHLTANQAILDATETNDNGALH

ILDLDISQGLQWPPLMQALAERSSNPSSPPPSLRITGCGRDVTGLNRTGDRLTRFADSLG 

LQFQFHTLVIVEEDLAGLLLQIRLLALSAVQGETIAVNCVHFLHKIFNDDGDMIGHFLSA 

IKSLNSRIVTMAEREANHGDHSFLNRFSEAVDHYMAIFDSLEATLPPNSRERLTLEQRWF 

GKEILDVVAAEETERKQRHRRFEIWEEMMKRFGFVNVPIGSFALSQAKLLLRLHYPSEGY

NLQFLNNSLFLGWQNRPLFSVSSWK* 

>G1519 (1..1146)

ATGAGGCTTAATGGGGATTCGGGTCCGGGTCAGGATGAACCCGGTTCGAGCGGGTTTCAC 
GGCGGAATCAGACGATTCCCGTTAGCAGCTCAGCCGGAGATTATGAGAGCTGCTGAGAAA 
GACGATCAATACGCTTCTTTCATCCACGAAGCTTGCCGCGATGCCTTCCGACACCTTTTC 
GGTACAAGAATCGCTGTTGCTTACCAGAAGGAGATGAAGCTACTTGGACAGATGCTTTAC 
TATGTTCTTACGACAGGTTCAGGGCAACAAACTTTAGGAGAGGAATATTGTGACATTATA 
CAGGTTGCAGGGCCTTATGGACTCTCTCCTACACCAGCTAGACGTGCTTTGTTCATATTG 
TACCAGACCGCAGTTCCATATATCGCAGAGAGAATTAGCACTCGAGCTGCTACGCAAGCA 
GTCACCTTTGATGAGTCTGATGAGTTTTTTGGTGATAGTCATATCCACTCACCAAGAATG 
ATAGATCTTCCATCTTCATCTCAAGTTGAAACTTCAACTTCTGTAGTATCTAGGTTAAAC 
GATAGACTTATGAGATCGTGGCACCGAGCTATTCAGCGATGGCCTGTGGTTCTTCCTGTT 
GCCCGCGAAGTCTTACAACTGGTTTTGCGTGCCAATCTGATGCTCTTCTACTTTGAAGGT 
TTTTATTATCATATATCGAAACGTGCATCCGGGGTTCGTTATGTTTTCATAGGAAAGCAA 
CTGAATCAGAGACCTAGATACCAAATTCTTGGGGTTTTCCTTCTAATCCAATTGTGCATC 
CTTGCTGCTGAGGGCTTGCGTCGGAGTAATTTGTCATCTATCACTAGCTCCATTCAGCAG 
GCTTCTATAGGATCTTATCAAACTTCAGGAGGGAGAGGTTTACCTGTTTTAAATGAAGAG 
GGGAATTTGATAACTTCGGAAGCTGAAAAGGGAAACTGGTCTACCTCCGATTCAACTTCA 
ACGGAGGCAGTAGGGAAATGCACTCTCTGCTTAAGCACCCGTCAGCACCCAACGGCCACT 
CCTTGTGGTCATGTGTTTTGTTGGAGCTGCATTATGGAATGGTGCAACGAGAAGCAAGAA 
TGCCCTCTTTGTCGAACGCCCAATACCCATTCAAGTTTGGTTTGTTTGTATCATTCTGAT 
TTTTAG 

>G1519 Amino Acid Sequence (domain in AA coordinates: 327-364) 
MRLNGDSGPGQDEPGSSGFHGGIRRFPLAAQPEIMRAAEKDDQYASFIHEACRDAFRHLF 
GTRIALAYQKEMKLLGQMLYYVLTTGSGQQTLGEEYCDIIQVAGPYGLSPTPARRALFIL 
YQTAVPYIAERISTRAATQAVTFDESDEFFGDSHIHSPRMIDLPSSSQVETSTSVVSRLN 
DRLMRSWHRAIQRWPVVLPVAREVLQLVLRANLMLFYFEGFYYHISKRASGVRYVFIGKQ 

LNQRPRYQILGVFLLIQLCILAAEGLRRSNLSSITSSIQQASIGSYQTSGGRGLPVLNEE 
GNLITSEAEKGNWSTSDSTSTEAVGKCTLCLSTRQHPTATPCGHVFCWSCIMEWCNEKQE 
CPLCRTPNTHSSLVCLYHSDF* 
>G374 (1..1359) 

ATGGACAACAAAAATGATCAGGATATTGATGTTAGATCAGTGGTTGAAGCTGTTTCCGCC 
GATCTTTCCTTTGGTGCTCCCCTCTATGTGGTTGAGAGCATGTGCATGCGCTGCCAAGAA 
AATGGAACAACCAGATTTCTATTGACCTTAATTCCTCACTTCAGAAAGGTCTTAATATCT 
GCATTTGAATGTCCGCATTGCGGGGAAAGGAATAATGAAGTTCAGTTCGCAGGCGAGATT 
CAACCCCGTGGATGCTGTTACAATCTAGAGGTTCTAGCTGGTGATGTGAAGATATTTGAC 
CGGCAAGTTGTGAAATCTGAATCAGCCACTATTAAGATTCCTGAACTGGATTTTGAGATT 
CCACCAGAGGCCCAACGTGGAAGTTTGTCTACTGTGGAAGGGATATTAGCACGGGCTGCT 
GATGAACTGAGTGCCCTTCAAGAAGAACGCAAGAAAGTTGATCCTAAAACTGCTGAAGCA 
ATAGACCAATTCTTGTCCAAACTGAGAGCTTGTGCTAAAGCAGAGACATCCTTCACCTTC 
ATTTTGGATGATCCTGCTGGAAACAGTTTCATTGAGAACCCACATGCTCCATCACCAGAT 
CCCTCTCTAACCATCAAATTCTATGAGCGAACACCAGAGCAACAAGCAACACTTGGATAT 
GTTGCTAACCCATCTCAGGCTGGACAATCAGAAGGAAGCCTTGGCGCACCTGTGATGACT 
TTCCCTTCAACTTGCGGAGCATGTACGGAGCCGTGTGAGACACGGATGTTCAAAATAGAA 
ATCCCGTACTTTCAGGAAGTTATTGTCATGGCATCTACATGTGACAGTTGTGGCTATCGT 
AATTCTGAGTTGAAGCCTGGTGGTGCAATTCCTGAAAAGGG7UUVGAAGATTACTCTCTCT 
GTGAGGAACATTACAGACCTTAGCCGAGATGTTATCAAGTCGGACACTGCAGGAGTGATA 
ATCCCAGAACTTGATCTGGAGCTAGCTGGTGGTACACTTGGTGGAATGGTAACAACAGTT 
GAAGGGTTGGTTACACAGATCAGAGAAAGCCTAGCGAGAGTTCACGGATTCACTTTTGGT 
GATAGTATGGAAGAGAGTAAGTTGAACAAATGGAGAGAATTTGGAGCCAGGCTCACTAAG 
CTCCTAAGCTTTGAACAGCCGTGGACATTGATTCTTGATGATGAATTAGCAAATTCCTTT 
ATTGCACCAGTAACAGATGATATCAAAGATGACCATCAGCTCACATTTGAAGAGTACGAG 
AGGTCATGGGATCAAAACGAGGAGTTGGGTCTCAACGACATAGATACTTCTTCAGCTGAT 
GCTGCTTATGAATCCACAGAGACGACTAAATTACCTTAA 

>G374 Amino Acid Sequence (domain in AA coordinates: 35-67, 245-277) 

MDNKNDQDIDVRSVVEAVSADLSFGAPLYVVESMCMRCQENGTTRFLLTLIPHFRKVLIS 

AFECPHCGERNNEVQFAGEIQPRGCCYNLEVLAGDVKIFDRQVVKSESATIKIPELDFEI 

PPEAQRGSLSTVEGILARAADELSALQEERKKVDPKTAEAIDQFLSKLRACAKAETSFTF 

ILDDPAGNSFIENPHAPSPDPSLTIKFYERTPEQQATLGYVANPSQAGQSEGSLGAPVMT 

FPSTCGACTEPCETRMFKIEIPYFQEVIVMASTCDSCGYRNSELKPGGAIPEKGKKITLS 

VRNITDLSRDVIKSDTAGVIIPELDLELAGGTLGGMVTTVEGLVTQIRESLARVHGFTFG 

DSMEESKLNKWREFGARLTKLLSFEQPWTLILDDELANSFIAPVTDDIKDDHQLTFEEYE 

RSWDQNEELGLNDIDTSSADAAYESTETTKLP* 

>G877 (397..2460) 

CAAAGATTAGACTAATCCGACTGTGTTTTTAATCAATCATCATTTTATTTAGGGGAGAGA 
AGTTGTAAAGTTTTGATTTTTTTTTCTGGGTTTTTTCTGTGAGACCCAGAAGAAGAACAG 
AGAGAGGAAGAAGGAGAAGAAAAAAATATCTCTTTCTCTCCGGCTTTCAACAAAATCTCT 
CTTTTTTCCTTCATCAGTGTTAAATTCGGATCCGGGTCGGGTGGGTTTTCGGTTTTTGGT 
GTTCGGATCAGAGCACAGTTGGATGTTAGCGACGGAACTGAGGATTTCAGTTTGCGGCTG 
CGGCGGCTGTGACGGTGTTTGTGTGTCGTCTTCTTTTATCAATCAGGAGTTTCATCACAG 
TTTGATCAGAGATTCAGCCAAATTCTTGGATACTAAATGGCTGGTTTTGATGAAAATGTT 
GCTGTGATGGGAGAATGGGTGCCTCGTAGTCCTAGTCCCGGGACACTTTTCTCCTCTGCT 
ATTGGAGAAGAGAAGAGCTCGAAACGTGTTCTTGAAAGAGAGTTATCTTTGAATCATGGT 
CAAGTTATTGGTTTAGAAGAAGACACTAGTAGTAATCATAACAAGGATTCTTCACAAAGC 
AATGTTTTTCGAGGTGGTCTCAGTGAAAGAATTGCTGCAAGAGCTGGATTTAATGCTCCA 
AGGTTGAACACTGAGAATATCCGCACCAACACCGACTTTTCCATTGACTCTAACCTTCGA 
TCTCCTTGCTTAACCATCTCTTCTCCTGGCCTTAGCCCTGCAACACTCTTGGAATCTCCT 
GTTTTCCTTTCTAACCCATTGGCTCAACCTTCTCCAACTACCGGGAAATTTCCATTTCTT 
CCTGGTGTTAATGGTAATGCATTGTCTTCTGAGAAAGCGAAAGACGAGTTCTTTGATGAT 
ATTGGAGCATCATTCAGCTTCCATCCTGTTTCAAGATCATCTTCCTCTTTCTTCCAAGGC 
ACAAC^GAGATGATGTCAGTTGATTATGGTAACTA(^C^TAGATCTTCTTCTCATC^ 
TCCGCAGAAGAAGTAAAACCTGGCTCTGAAAACATAGAAAGCTCCAATCTTTATGGGATT 
GAAACTGACAATCAAAACGGGCAGAACAAGACATCTGATGTCACTACAAACACCAGTCTT 
GAAACCGTGGATCATCAAGAGGAAGAAGAAGAGCAAAGACGCGGTGATTCGATGGCTGGT 
GGTGCGCCTGCAGAGGATGGATATAACTGGAGGAAATACGGACAAAAGTTGGTCAAAGGA 
AGTGAGTATCCGCGAAGCTATTACAAGTGCACAAACCCGAATTGTCAGGTGAAGAAGAAA 
GTTGAGAGATCAAGGGAAGGTCACATCACAGAGATTATATACAAAGGAGCTCATAATCAT 
CTTAAACCTCCACCTAATCGCCGCTCAGGGATGCAAGTAGATGGAACTGAACAAGTTGAA 
CAACAACAACAACAGAGAGATTCTGCTGCAACGTGGGTTA 

CAAGGTGGAAGCAATGAGAACAATGTCGAAGAGGGATCTACGAGATTCGAGTATGGAAAC 

CAATCTGGATCAATTCAAGCTCAAACCGGAGGTCAATACGAGTCAGGTGATCCTGTGGTT 

GTGGTTGATGCTTCTTCAACATTCTCTAATGATGAAGATGAAGATGATCGAGGGACACAT 

GGAAGTGTTTCTTTGGGTTACGATGGAGGAGGAGGAGGTGGGGGAGGAGAAGGAGATGAA 

TC^GAGTCGAAAAGAAGGAAACTAGAAGCTTTTGCAGCAGAGATGAGTC 

GCCATACGTGAGCCAAGAGTTGTTGTGCAGACAACGAGTGATGTTGACATTCTTGATGAT 

GGTTATCGCTGGCGAAAATATGGTCAGAAAGTTGTCAAAGGCAATCCAAATCCAAGGAG^ 

TATTACAAATGCACAGCTCCAGGATGTACAGTGAGGAAACATGTTGAAAGAGCTTCTCAT 

GATCTCAAATCCGTTATAACAACTTACGAAGGCAAACATAACCATGACGTCCCCGCTGCA 

CGCAACAGCAGCCACGGAGGCGGTGGTGATAGTGGTAACGGTAACAGCGGCGGTTCAGCC 

GCAGTTTCTCACCAOT'ACCACAACGGTCATCACTCAGAGCCGCCACGTGGGAGATTCGAC 

AGACAAGTCACAACTAACAATCAGTCTCCTTTTAGCCGTCCCTTTAGCTTTCAGCCACAT 

TTGGGTCCTCCTTCTGGTTTCTCCTTCGGTTT^ 

ATGCCTGGTTTAGCGTATGGTCAAGGGAAAATGCCGGGTOT 

CAACCGGTTGGGATGAGTGAAGCAATGATGCAGAGAGGGATGGAACCAAAGGTTGAACCG 
GTTTCAGATTCAGGACAATCGGTATATAACCAQATCATGAGTAGA 
AATTTACTCTTCTTCTTCTTCTTCTGCATTTGGTCACTCCTTATAATAACTTT^ 
TGCTTCTTCTTCTTCTTTGATTT^ 

ATTGTTAAAAAAAAAAAAAAAAA 

>G877 Amino Acid Sequence (domain in AA coordinates: 272-328, 487-603) 



MAGFDENVAVMGEWVPRSPSPGTLFSSAIGEEKSSKRVLERELSLNHGQVIGLEEDTSSN 

HNKDSSQSNVFRGGLSERIAARAGFNAPRLNTENIRTNTDFSIDSNLRSPCLTISSPGLS 

PATLLESPVFLSNPLAQPSPTTGKFPFLPGVNGNALSSEKAKDEFFDDIGASFSFHPVSR 

SSSSFFQGTTEMMSVDYGNYNNRSSSHQSAEEVKPGSENIESSNLYGIETDNQNGQNKTS 

DVTTNTSLETVDHQEEEEEQRRGDSMAGGAPAEDGYNWRKYGQKLVKGSEYPRSYYKCTN 

PNCQVKKKVERSREGHITEIIYKGAHNHLKPPPNRRSGMQVDGTEQVEQQQQQRDSAATW 

VSCMNTQQQGGSNENNVEEGSTRFEYGNQSGSIQAQTGGQYESGDPVVVVDASSTFSNDE 

DEDDRGTHGSVSLGYDGGGGGGGGEGDESESKRRKLEAFAAEMSGSTRAIREPRVVVQTT 

SDVDILDDGYRWRKYGQKVVKGNPNPRSYYKCTAPGCTVRKHVERASHDLKSVITTYEGK 

HNHDVPAARNSSHGGGGDSGNGNSGGSAAVSHHYHNGHHSEPPRGRFDRQVTTNNQSPFS 

RPFSFQPHLGPPSGFSFGLGQTGLVNLSMPGLAYGQGKMPGLPHPYMTQPVGMSEAMMQR 

GMEPKVEPVSDSGQSVYNQIMSRLPQI* 

>G1000 (1..954) 

ATGGGAAGACCTCCTTGTTGTGACAAGTCCAATGTCAAGAAAGGTCTCTGGACCGAGGAA 

GAAGACGCTAAGATCCTTGCTTATGTTGCTATCCATGGTGTAGGAAACTGGAGCTTGATC 

CCCAAAAAAGCAGGTCTGAATCGATGTGGAAAGAGCTGTAGACTAAGATGGACTAATTAC 

TTAAGACCTGACCTTAAACATGACAGCTTCTCTACCCAAGAAGAAGAGCTTATCATTGAG 

TGTCATAGAGCCATTGGCAGCAGGTGGTCTTCCATTGCACGAAAGCTTCCAGGAAGAACG 

GATAATGATGTGAAGAATCACTGGAACACAAAGCTGAAGAAGAAGCTGATGAAAATGGGG 

ATAGACCCGGTGACTCATAAACCGGTTTCTCAACTCCTTGCAGAATTCAGAAACATTAGC 

GGCCATGGAAATGCATCCTTCAAAACAGAACCATCTAACAACTCTATACTCACACAATCC 

AACTCAGCTTGGGAAATGATGAGAAACACAACAACAAACCATGAGAGTTATTACACCAAC 

TCTCCAATGATGTTTACAAATTCCTCTGAGTACCAAACTACTCCATTTCATTTCTATAGC 

CATCCAAATCATCTGCTCAATGGAACCACATCTTCATGCTCTTCCTCATCATCTTCTACT 

AGTATCACTCAGCCAAACCAAGTACCTCAAACACCGGTTACTAACTTCTACTGGAGCGAT 

TTCCTTCTCTCGGACCCGGTTCCTCAAGTAGTGGGATCCTCAGCTACTAGCGACCTCACT 

TTTACGCAGAACGAACATCATTTCAACATCGAAGCCGAATACATCTCTCAAAACATCGAT 

TCAAAGGCCTCGGGAACATGTCATTCCGCGAGTTCCTTCGTTGACGAAATACTAGATAAA 

GACCAAGAGATGTTGTCACAGTTTCCTCAACTCTTGAATGATTTCGATTATTAG 

>G1000 Amino Acid Sequence (domain in AA coordinates: 14-117) 

MGRPPCCDKSNVKKGLWTEEEDAKILAYVAIHGVGNWSLIPKKAGLNRCGKSCRLRWTNY 

LRPDLKHDSFSTQEEELIIECHRAIGSRWSSIARKLPGRTDNDVKNHWNTKLKKKLMKMG 

IDPVTHKPVSQLLAEFRNISGHGNASFKTEPSNNSILTQSNSAWEMMRNTTTNHESYYTN 

SPMMFTNSSEYQTTPFHFYSHPNHLLNGTTSSCSSSSSSTSITQPNQVPQTPVTNFYWSD 
FLLSDPVPQVVGSSATSDLTFTQNEHHFNIEAEYISQNIDSKASGTCHSASSFVDEILDK 
DQEMLSQFPQLLNDFDY* 
>G1067 (436..1371) 

TCTCAAGCTTCTCTCTCCTTTTTTTCCCATAGCACATCAGAATCGCTAAATACGACTCCT 

ATGCAAAGAAGAAGCTACTTCTTTCTCTTGCCCTAATTAATCTACCTAACTAGGGTTTCC 

TCTTACCTTTCATGAGAGAGATCATTTAACATAAGTCACCTTTTTTATATCTTTTGCTTC 

GTCTTTAATTTAGTTCTGTTCTTGGTCTGTTTCTATATTTTGTCGGCTTGCGTAACCGAT 

C^CACCTTAATGCTTTAGCTATTGTTTCCTCAAAATCATGAGTTTO 

AGTTTTCTTTTTCTCTCTTTACGCTCTTCTTCACCTAGCTACCAATATATGAACGAGCAG 

GATCAAGAATCGAGAAATTGATTTGAGCTGGCGAATAAGCAGTGGTGGGATAGGGAATTA 

GTAGATGCGGCGGCGATGGAAGGCGGTTACGAGCAAGGCGGTGGAGCTTCTAGATACTTC 

CATAACCTCTTTAGACCGGAGATTCACCACCAACAGCTTCAACCGCAGGGCGGGATCAAT 

CTTATCGACCAGCATCATCATCAGCACCAGCAACATCAACAACAACAACAACCGTCGGAT 

GATTCAAGAGAATCTGACCATTCAAACAAAGATCATCATCAACAGGGTCGACCCGATTCA 

GACCCGAATACATCAAGCTCAGCACCGGGAAAACGTCCACGTGGACGTCCACCAGGATCT 

AAGAACAAAGCCAAGCCACCGATCATAGTAACTCGTGATAGCCCCAACGCGCTTAGATCT 

CACGTTCTTGAAGTATCTCCTGGAGCTGACATAGTTGAGAGTGTTTCCACGTACGCTAGG 

AGGAGAGGGAGAGGCGTCTCCGTTTTAGGAGGAAACGGCACCGTATCTAACGTCACTCTC 

CGTCAGCCAGTCACTCCTGGAAATGGCGGTGGTGTGTCCGGAGGAGGAGGAGTTGTGACT 

TTACATGGAAGGTTTGAGATTCTTTCGCTAACGGGGACTGTTTTGCCACCTCCTGCACCG 

CCTGGTGCCGGTGGTTTGTCTATATTTTTAGCCGGAGGGCAAGGTCAGGTGGTCGGAGGA 

AGCGTTGTGGCTCCCCTTATTGCATCAGCTCCGGTTATACTAATGGCGGCTTCGTTCTCA 

AATGCGGTTTTCGAGAGACTACCGATTGAGGAGGAGGAAGAAGAAGGTGGTGGTGGCGGA 
GGAGGAGGAGGAGGAGGGCCACCGCAGATGCAACAAGCTCCATCAGCATCTCCGCCGTCT 
GGAGTGACCGGTCAGGGACAGTTAGGAGGTAATGTGGGTGGTTATGGGTTTTCTGGTGAT 
CCTCATTTGCTTGGATGGGGAGCTGGAACACCTTCAAGACCACCTTTTTAATTGAATTTT 
AATGTCCGGAT^TTTATGTGTTTTTATCATCTTGAGGAGTCGTCTTTCCTTTGGGATATT 
TGGTGTTTAATGTTTAGTTGATATGCATATTTT 

>G1067 Amino Acid Sequence (domain in AA coordinates: 86-93) 

MEGGYEQGGGASRYFHNLFRPEIHHQQLQPQGGINLIDQHHHQHQQHQQQQQPSDDSRES 

DHSNKDHHQQGRPDSDPNTSSSAPGKRPRGRPPGSKNKAKPPIIVTRDSPNALRSHVLEV 

SPGADIVESVSTYARRRGRGVSVLGGNGTVSNVTLRQPVTPGNGGGVSGGGGVVTLHGRF 

EILSLTGTVLPPPAPPGAGGLSIFLAGGQGQVVGGSVVAPLIASAPVILMAASFSNAVFE 

RLPIEEEEEEGGGGGGGGGGGPPQMQQAPSASPPSGVTGQGQLGGNVGGYGFSGDPHLLG 

WGAGTPSRPPF* 

>G1075 (19..876) 

TTTGTGTTTGGTGCTGGCATGGCTGGTCTCGATCTAGGCACAACTTCTCGCTACGTCCAC 
AACGTCGATGGTGGCGGCGGCGGACAGTTCACCACCGACAACCACCACGAAGATGACGGT 
GGCGCTGGAGGAAACCACCATCATC^CCATCATAATCATAATCACCATCAAGGTTTAGAT 
TTAATAGCTTCTAATGATAACTCTGGACTAGGCGGCGGTGGAGGAGGAGGGAGCGGTGAC 
CTCGTCATGCGTCGGCCACGTGGCCGTCCAGCTGGATCGAAGAACAAACCGAAGCCGCCG 
GTGATTGTCACGCGCGAGAGCGCAAACACTCTTAGGGCTCACATTCTTGAAGTTGGAAGT 
GGCTGCGACGTTTTCGAATGTATCTCCACTTACGCTCGTCGGAGACAGCGCGGGATTTGC 
GTTTTATCCGGGACGGGAACCGTCACTAACGTCAGCATCCGTCAGCCTACGGCGGCCGGA 
GCTGTTGTGACTCTGCGGGGTACTTTTGAGATTCTTTCCCTCTCCGGATCTTTTCTTCCG 
CCACCTGCTCCTCCAGGGGCGACTAGCTTGACGATATTCCTCGCTGGAGCTCAAGGACAG 
GTCGTCGGAGGTAACGTAGTTGGTGAGTTAATGGCGGCGGGGCCGGTAATGGTCATGGCA 
GCGTCTTTTACAAACGTGGCTTACGAAAGGTTGCCTTTGGACGAGCATGAGGAGCACTTG 
CAAAGTGGCGGCGGCGGAGGTGGAGGGAATATGTACTCGGAAGCCACTGGCGGTGGCGGA 
GGGTTGCCTTTCTTTAATTTGCCGATGAGTATGCCTCAGATTGGAGTTGAAAGTTGGCAG 
GGGAATCACGCCGGCGCCGGTAGGGCTCCGTTTTAGCAATTTAAGAAACTTTAATTGTTT 
TTTCCACTTTTTTGTTTTTCTCCGAATTTTATGAAATTATGATTTAAGAAAAAAAACGAT 
ATTGTTCATGTATTGACCCTCTTACTGCATGGTTTCTTCTATTGGGTTAATTGGCTAGCT 
CATAAGAATTGTTTAATTTGGTTATTGTCATCAAATTTGCCCACATATAAAGCTTCTAGC 
AAAT 

>G1075 Amino Acid Sequence (domain in AA coordinates: 78-85) 
MAGLDLGTTSRYVHNVDGGGGGQFTTDNHHEDDGGAGGN 

NSGLGGGGGGGSGDLVMRRPRGRPAGSKNKPKPPVIVTRESANTLRAHILEVGSGCDVFE 
CISTYARRRQRGICVLSGTGTVTNVSIRQPTAAGAVVTLRGTFEILSLSGSFLPPPAPPG 
ATSLTIFLAGAQGQVVGGNVVGELMAAGPVMVMAASFTNVAYERLPLDEHEEHLQSGGGG 
GGGNMYSEATGGGGGLPFFNLPMSMPQIGVESWQGNHAGAGRAPF* 
>G1266 (62..718) 

CAATCCACTAACGATCCCTAACCGAAAACAGAGTAGTCAAGAAACAGAGTAl T r r rTCTA 

CATGGATCCATTTTTAATTCAGTCCCCATTCTCCGGCTTCTCACCGGAATATTCTATCGG 

ATCTTCTCCAGATTCTTTCTC^TCCTCTTCTTCTAACAATTACTCTCTTCCCTTCAACGA 

GAACGACTCAGAGGAAATGTTTCTCTACGGTCTAATCGAGCAGTCCACGCAACAAACCTA 

TATTGACTCGGATAGTCAAGACCTTCCGATCAAATCCGTAAGCTCAAGAAAGTCAGAGAA 

GTCTTACAGAGGCGTAAGACGACGGCCATGGGGGAAATTCGCGGCGGAGATAAGAGATTC 

GACTAGAAACGGTATTAGGGTTTGGCTCGGGACGTTCGAAAGCGCGGAAGAGGCGGCTTT 

AGCCTACGATCAAGCTGCTTTCTCGATGAGAGGGTCCTCGGCGATTCTCAATTTTTCGGC 

GGAGAGAGTTCAAGAGTCGCTTTCGGAGATTAAATATACCTACGAGGATGGTTGTTCTCC 

GGTTGTGGCGTTGAAGAGGAAACACTCGATGAGACGGAGAATGACCAATAAGAAGACGAA 

AGATAGTGACTTTGATCACCGCTCCGTGAAGTTAGATAATGTAGTTGTCTTTGAGGATTT 

GGGAGAACAGTACCTTGAGGAGCnTITGGGGTCTTCTGAAAATAGTGGGACTTGGTO 

GATTAGGATTTGTATTAGGGACCTTAAGTTTGAAGTGGTTGATTAATTTTAACCCT 

TGTTTTTTGTTTGCTTAAATATTTGATTCTATTGAGAAACATCGAAAACAGTTTGTATGT 

ACTTTTGTGATACTTGGCG 

>G1266 Amino Acid Sequence (domain in AA coordinates: 79-147) 

MDPFLIQSPFSGFSPEYSIGSSPDSFSSSSSNNYSLPFNENDSEEMFLYGLIEQSTQQTY 

IDSDSQDLPIKSVSSRKSEKSYRGVRRRPWGKFAAEIRDSTRNGIRVWLGTFESAEEAAL 
AYDQAAFSMRGSSAILNFSAERVQESLSEIKYTYEDGCSPVVALKRKHSMRRRMTNKKTK 

DSDFDHRSVKLDNVVVFEDLGEQYLEELLGSSENSGTW* 

>G1311 (41..757) 

AAGTATAATAACACAAAGAAACAGAGTAAAAGAAAGAAAAATGGATTTTAAGAAGGAAGA 
AACACTTCGTAGAGGGCCATGGCTCGAAGAAGAAGACGAACGGCTAGTGAAGGTCATTAG 
TCTTTTGGGAGAACGTCGTTGGGATTCTTTAGCAATAGTTTCCGGTTTGAAGAGGAGTGG 
TAAGAGTTGCAGGCTAAGGTGGATGAACTATCTGAATCCGACTCTGAAGCGTGGACCGAT 
GAGTCAAGAAGAAGAGAGAATCATCTTTCAGCTCCATGCTCTATGGGGTAACAAGTGGTC 
GAAGATTGCGAGAAGATTACCCGGTAGGACTGATAACGAGATAAAGAACTATTGGAGAAC 
TCATTATAGAAAGAAACAGGAAGCTCAAAACTATGGAAAGCTCTTTGAGTGGAGAGGAAA 
TACAGGAGAAGAATTGTTGCACAAGTATAAGGAAACAGAGATCACTAGGACAAAGACGAC 
GTCTCAAGAACATGGTTTTGTTGAAGTTGTGAGCATGGAAAGTGGTAAAGAAGCCAACGG 
TGGTGTTGGTGGAAGAGAAAGCTTCGGTGTTATGAAATCACCGTATGAAAATCGGATTTC 
GGATTGGATATCAGAGATTTCTACTGACCAGAGTGAAGCAAATCTTTCAGAAGATCACAG 
CAGCAATAGCTGCAGTGAGAACAATATTAACATTGGTACTTGGTGGTTTCAAGAGACTAG 
GGACTTTGAGGAGTTTTCATGTTCTCTATGGTCATAATTCTAAAGTTGGTTTATTTACTT 
TTTAAAAAAAAAAAAAAAAA 

>G1311 Amino Acid Sequence (domain in AA coordinates: 11-112) 
MDFKKEETLRRGPWLEEEDERLVKVISLLGERRWDSIAIVSGLKRSGKSCRLRWMNYLNP 
TLKRGPMSQEEERIIFQLHALWGNKWSKIARRLPGRTDNEIKNYWRTHYRKKQEAQNYGK 
LFEWRGNTGEELLHKYKETEITRTKTTSQEHGFVEVVSMESGKEANGGVGGRESFGVMKS 
PYENRISDWISEISTDQSEANLSEDHSSNSCSENNINIGTWWFQETRDFEEFSCSLWS* 
>G1321 (72..803) 

GTTCTTGTATTGGTTTGGATCGGTATACTTAGTTGATTACGTAATTAAATAGATCGGCGT 

GAAGAAGAAAAATGATCATGTGCAGCCGAGGCCATTGGAGACCAGCTGAAGACGAGAAGC 

TCAAGGATCTTGTCGAACAATACGGTCCTCACAATTGGAACGCCATTGCTCTCAAGCTTC 

CTGGTCGCTCTGGTAAGAGTTGTAGATTGAGATGGTTTAATCAATTGGATCCAAGGATCA 

ACCGAAACCCTTTCACGGAAGAAGAAGAAGAAAGACTTTTAGCGGCTCATCGGATCCATG 

GGAACAGATGGTCCATCATCGCAAGGCTTTTCCCTGGAAGAACTGATAACGCCGTCAAGA 

ACCATTGGCACGTCATCATGGCTCGTCGCACACGCCAAACCTCTAAGCCTCGTCTTCTTC 

CCTCGACGACTTCGTCTTCTTCTTTAATGGCGAGTGAACAAATCATGATGAGTTCTGGTG 

GTTATAATCATAATTATAGTTCCGATGATCGGAAGAAAATATTTCCAGCAGACTTTATAA 

ATTTCCCTTACAAATTCTCTCATATCAATCATCTTCACTTCCTAAAGGAGTTTTTCCCCG 

GAAAGATCGCTTTAAGTCACAAAGCAAATCAGAGTAAGAAGCCTATGGAGTTCTACAATT 

TTCTACAAGTAAACACAGATTCAAACAAGAGCGAGATTATAGATCAAGATTCAGGTCAAA 

GCAAACGCAGTGACTCGGACACCAAACATGAAAGTCATGTTCCATTCTTCGACTTTTTAT 

CCGTTGGAAACTCTGCCTCCTAGGATTAGTTTTTTTGCAGTAACTCCTAAATTTCTAGAT 

TAACTATTTAGTCCGTATACGTACGAGATTATCTAGGTCGTTAGdATGTATGCTTGATGT 

GTATAATCACTAACTAGTGAGCTATTACCTGCGAAAATTGTAAGAAAAATACATAATGTT 

GATGTATCACACATTCTCAATGTCTGTAAAATTTCC^TCGAGTTGTTAACTATCAAA 

ATCCGTTTGAAAAAAAAAAAA 

>G1321 Amino Acid Sequence (domain in AA coordinates: 4-106) 
MIMCSRGHWRPAEDEKLKDLVEQYGPHNWNAIALKLPGRSGKSCRLRWFNQLDPRINRNP 
FTEEEEERLLAAHRIHGNRWSIIARLFPGRTDNAVKNHWHVIMARRTRQTSKPRLLPSTT 
SSSSLMASEQIMMSSGGYNHNYSSDDRKKIFPADFINFPYKFSHINHLHFLKEFFPGKIA 
LSHKANQSKKPMEFYNFLQVNTDSNKSEIIDQDSGQSKRSDSDTKHESHVPFFDFLSVGN 
SAS* 

>G1326 (32..784) 

CGACGGTACGGTGGAGATAGAGATAGCATCCATGGAGATGTCTAGAGGAAGCAACAGTTT 

TGACAATAAGAAGCCTAGTTGCCAAAGAGGTCACTGGAGACCTGTTGAAGATGACAATCT 

CCGGCAACTCGTTGAACAATACGGTCCCAAGAACTGGAATTTTATTGCTCAACATCTCTA 

TGGAAGATCAGGGAAAAGCTGTAGATTAAGATGGTACAACCAACTTGATCCAAACATCAC 

CAAGAAACCCTTCACCGAGGAGGAAGAAGAGAGACTGCTTAAAGCTCATCGGATCCAAGG 

GAATCGTTGGGCCTCCATAGCCCGACTGTTCCCCGGGAGGACCGACAACGCTGTCAAAAA 

CCATTTTCATGTCATCATGGCTAGA 

TACGTTCAACCAAACTTGGC^TACTGTTITOAGCCCTA 

TAGATCCCATTTCGGGCTATGGAGGTATCGAAAGGATAAGAGTTGCGGTCTCTGGCCTTA 
CTCTTTTGTTTCACCACCTACGAATGGTCAATTTGGATCTTCATCTGTCTCTAACGTACA 
CCACGAAATTTATCTTGAGAGGAGAAAGTCGAAAGAGTTGGTGGATCCTCAGAATTACAC 
ATTTCATGCAGCCACACCAGATCATAAGATGACTTCAAATGAAGATGGACCATCCATGGG 
AGATGATGGTGAGAAGAACGATGTTACTTTCATTGATTTTCTTGGTGTTGGATTAGCTTC 
TTAGGTTATAACATCACAAGTCAAAGCTTTTAAGGGTTTCTATCATTAGGGTTAGGCATC 
ATTTTCAGCCTTTTGCTTCCTTAAACTCTCATATGGATCT 

>G1326 Amino Acid Sequence (domain in AA coordinates: 18-121) 

MEMSRGSNSFDNKKPSCQRGHWRPVEDDNLRQLVEQYGPKNWNFIAQHLYGRSGKSCRLR 

WYNQLDPNITKKPFTEEEEERLLKAHRIQGNRWASIARLFPGRTDNAVKNHFHVTMARRK 

RENFSSTATSTFNQTWHTVLSPSSSLTRLNRSHFGLWRYRKDKSCGLWPYSFVSPPTNGQ 

FGSSSVSNVHHEIYLERRKSKELVDPQNYTFHAATPDHKMTSNEDGPSMGDDGEKNDVTF 

IDFLGVGLAS* 

>G1367 (128..1567) 

TCCTTCCACAAAACTTTTTTAATTTTATCTGAAAAATTAAAACAACCGAAACAAAAAAAA 
AAAACTAAAAATCAAAAATCTCATCACCTTCCTTGCTCTGTATTTTTTCTCTCTCACTAA 
ATCCTCCATGGATCCTTCTCTCTCTGCAACCAATGATCCTCATCATCCTCCTCCTCCTCA 
GTTCACATCTTTCCCTCCTTTCACCAACACCAACCCCTTCGCCTCTCCAAACCACCCCTT 
CTTCACCGGACCCACCGCCGTCGCGCCGCCAAACAACATCCATCTCTATCAAGCAGCTCC 
TCCGCAGCAGCCACAAACATCTCCAGTTCCTCCTCATCCATCTATTTCCCACCCTCCTTA 
CTCTGACATGATTTGCACGGCGATTGCAGCGTTAAACGAACCAGATGGGTCAAGCAAGCA 
AGCTATTTCGAGGTACATAGAGAGAATTTACACTGGGATTCCTACTGCTCATGGAGCTTT 
GTTGACACACCATCTCAAGACTTTGAAGACCAGTGGGATTCTTGTCATGGTTAAGAAATC 
TTACAAGCTTGCTTCTACTCCTCCTCCTCCTCCTCCTACTAGTGTAGCTCCTAGTCTTGA 
ACCTCCCAGATCTGATTTCATAGTCAACGAGAACCAACCTTTACCTGATCCGGTTTTGGC 
TTCTTCTACTCCTCAGACTATTAAACGTGGTCGTGGTCGACCTCCAAAAGCTAAACCAGA 
TGTTGTTCAACCTCAACCTCTGACTAATGGAAAACTCACCTGGGAACAGAGTGAATTACC 
TGTCTCTCGACCAGAGGAGATACAGATACAGCCGCCACAGTTACCGTTACAGCCACAGCA 
GCCGGTTAAGAGACCGCCGGGTCGTCCTAGAAAAGATGGAACTTCGCCGACGGTGAAGCC 
AGCTGCTTCTGTTTCCGGTGGTGTGGAGACTGTGAAACGAAGAGGTAGACCTCCGAGTGG 
AAGAGCTGCTGGGAGGGAGAGAAAGCCTATAGTAGTCTC^GCTCCAGCTTCAGTGTTCCC 
GTATGTTGCTAATGGTGGTGTTAGACGCCGAGGGAGACCAAAGAGAGTTGACGCTGGTGG 
TGCTTCCTCTGTTGCTCCACCACCACCACCACCAACTAACGTAGAGAGTGGAGGAGAGGA 
GGTTGCAGTCAAGAAACGAGGAAGAGGACGGCCTCCTAAGATTGGAGGTGTTATCAGGAA 
GCCTATGAAGCCGATGAGAAGCTTTGCTCGTACTGGAAAACCCGTAGGAAGACCCAGAAA 
GAATGCGGTGTCAGTGGGAGCTTCTGGACGACAAGATGGTGACTATGGAGAACTGAAGAA 
GAAGTTTGAGTTGTTTCAAGCGAGAGCTAAGGATATTGTAATTGTGTTGAAATCCGAGAT 
AGGAGGAAGTGGAAATCAAGCAGTGGTTCAAGCCATACAGGACCTGGAAGGGATAGCAGA 
GACAACAAACGAGCCAAAGCACATGGAAGAAGTGCAGCTGCCAGACGAGGAACACCTTGA 
AACCGAACCAGAAGCAGAGGGTCAAGGACAGACAGAAGCAGAGGCAATGCAAGAAGCTCT 
GTTCTAAAGATAAAGCCTTGACATAAAAAGCTAGCAAGTGGTGGGTTTACTTGTTGTGTG 
TTACATGAAATTTTTAATCTTATAAGGGTC 

GATGAACTGATGATGATGATTGTGTCTCTAACCAAACAACAAGGAGAGGTAGGGTAATGT 
CTGTAAAGTGAATTAGGATGTTACCATTGTTCATC 

ATCTGTGTAGGCAGCTTTGTTCTTTGTTCCCTCGTGTTTTTTTTAGACTGTTG 

TATTCTATTTTGTCTCCTTAGGCTTTTTAGGAGT^ 

TATGTAATTTTTATGACCACTTCTACTTTTTATGATGGTTTCTT 

>G1367 Amino Acid Sequence (domain in AA coordinates: 179-201, 262-285, 298-319, 335-357) 

MDPSLSATNDPHHPPPPQFTSFPPFTNTNPFASPNHPFFTGPTAVAPPNNIHLYQAAPPQ 
QPQTSPVPPHPSISHPPYSDMICTAIAALNEPDGSSKQAISRYIERIYTGIPTAHGALLT 
HHLKTLKTSGILVMVKKSYKLASTPPPPPPTSVAPSLEPPRSDFIVNENQPLPDPVLASS 

TPQTIKRGRGRPPKAKPDVVQPQPLTNGKLTWEQSELPVSRPEEIQIQPPQLPLQPQQPV 

KRPPGRPRKDGTSPTVKPAASVSGGVETVKRRGRPPSGRAAGRERKPIVVSAPASVFPYV 

ANGGVRRRGRPKRVDAGGASSVAPPPPPPTNVESGGEEVAVKKRGRGRPPKIGGVIRKPM 

KPMRSFARTGKPVGRPRKNAVSVGASGRQDGDYGELKKKFELFQARAKDIVIVLKSEIGG 

SGNQAVVQAIQDLEGIAETTNEPKHMEEVQLPDEEHLETEPEAEGQGQTEAEAMQEALF* 

>G1386 (89..673) 
AATTTTATTTCCTTCTCTCAAATCTTCCCACCAAAAATTAACTCTTTCGTTCACACTAAG 
TCCCTTTTAAAAGAAAATATCCCAATTAATGGAACGTGACGACTGCCGGAGATTTCAGGA 
CTCGCCGGCGCAGACGACGGAGAGAAGAGTGAAATATAAACCAAAGAAGAAAAGAGCCAA 
AGATGATGATGATGAGAAAGTTGTTTCGAAGCATCCAAATTTTCGAGGTGTCAGAATGAG 
ACAATGGGGAAAATGGGTGTCCGAAATCAGAGAGCCAAAAAAGAAATCAAGAATCTGGCT 
CGGTACTTTCTCCACGGCGGAGATGGCGGCGCGTGCTCACGACGTGGCAGCTTTAGCCAT 
CAAAGGCGGTTCTGCACATCTCAACTTCCCGGAGCTCGCTTATCACCTCCCTAGACCAGC 
TAGTGCCGACCCTAAAGACATCCAAGCTGCCGCCGCCGCAGCTGCAGCCGCTGTGGCCAT 
TGACATGGATGTAGAGACGTCTTCGCCGTCGCCATCTCCCACAGTTACGGAAACGTCATC 
TCCGGCTATGATAGCACTCTCCGACGACGCGTTCTCCGATCTTCCTGATCTCTTGCTCAA 
CGTGAACCATAACATCGATGGCTTCTGGGACTCTTTTCCCTATGAAGAACCCTTCCTCTC 
TCAAAGTTACTAGAAACTCAAAACTATGTCGTTTTTGTATGTATTTTTGTCATGTGACCA 
TTTTTTGACGTCGAAAATCACCCGGATAATCCAAATTGTATGATTTATTAATGGTTGATG 
ATTTTCTTTGTGTGGAACAATGTGTATGATACGTAATCAAAAGTTCAAAAAAAAAATAAA 
AAAAA 

>G1386 Amino Acid Sequence (domain in AA coordinates: TBD) 

MERDDCRRFQDSPAQTTERRVKYKPKKKRAKDDDDEKVVSKHPNFRGVRMRQWGKWVSEI 

REPKKKSRIWLGTFSTAEMAARAHDVAALAIKGGSAHLNFPELAYHLPRPASADPKDIQA 

AAAAAAAAVAIDMDVETSSPSPSPTVTETSSPAMIALSDDAFSDLPDLLLNVNHNIDGFW 

DSFPYEEPFLSQSY* 

>G1421 (292..1155) 

GAAATTTCATCCCTAAATAAGAAAAAAGCATCTCCTTCTTTAGTGTCCTCCTTCACCAAA 

CTCTTGATTCCATAAGCATATATTAAAAAAGCTCTCTGCTTTCTTCAACTTTCCCGGGAA 

AATCTTCTTGTTACAAAGCATCAATCTCTTGTTTTACCAATTTTCTCTCTTTATTCCTTT 

TTTGCCCTTTACTTTTCCTAACTTTGGTCTTTATATATAAACACACGACACAAAGAAGAA 

CACACATAAGTTAAAACTATTAGAACAGTTTTAAAGAGAGAGATTTAAAAAATGGAGACA 

GAGAAGAAAGTTTCTCTCCCAAGAATCTTACGAATCTCTGTTACTGATCCTTACGCAACA 

GATTCGTCAAGCGACGAAGAAGAAGAAGTTGATTTTGATGCATTATCTACAAAACGACGT 

CGTGTTAAGAAGTACGTGAAGGAAGTGGTGCTTGATTCGGTGGTTTCTGATAAAGAGAAG 

CCGATGAAGAAGAAGAGAAAGAAGCGCGTTGTTACTGTTCCAGTGGTTGTTACGACGGCG 

ACGAGGAAGTTTCGTGGAGTGAGGCAAAGACCGTGGGGAAAATGGGCGGCGGAGATTAGA 

GATCCGAGTAGACGTGTTAGGGTTTGGTTAGGTACTTTTGACACGGCGGAGGAAGCTGCC 

ATTGTTTACGATAACGCAGCTATTCAGCTACGTGGTCCTAACGCAGAGCTTAACTTCCCT 

CCTCCTCCGGTGACGGAGAATGTTGAAGAAGCTTCGACGGAGGTGAAAGGAGTTTCGGAT 

TTTATCATTGGCGGTGGAGAATGTCTTCGTTCGCCGGTTTCTGTTCTCGAATCTCCGTTC 

TCCGGCGAGTCTACTGCGGTTAAAGAGGAGTTTGTCGGTGTATCGACGGCGGAGATTGTG 

GTTAAAAAGGAGCCGTCTTTTAACGGTTCAGATTTCTCGGCGCCGTTGTTCTCGGACGAC 

GACGTTTTTGGTTTCTCGACGTCGATGAGTGAAAGTTTCGGCGGCGATTTATTTGGAGAT 

AATCTTTTTGCGGATATGAGTTTTGGATCCGGGTTTGGATTCGGGTCTGGGTCTGGATTC 

TCCAGCTGGCACGTTGAGGACCATTTTCAAGATATTGGGGATTTATTCGGGTCGGATCCT 

GTCTTAACTGTTTAAGAAATAACTGGCCGTTTAACGGCGTTTAGTGAAGTTTTGTTACCG 

GCGACGGCGAGGATTAAAAAAAAACGGCGATTTATTTTTTGAATGAAGATTTGTTAAATA 

>G1421 Amino Acid Sequence (domain in AA coordinates: 74-151) 

METEKKVSLPRILRISVTDPYATDSSSDEEEEVDFDALSTKRRRVKKYVKEVVLDSVVSD 

KEKPMKKKRKKRVVTVPVVVTTATRKFRGVRQRPWGKWAAEIRDPSRRVRVWLGTFDTAE 

EAAIVYDNAAIQLRGPNAELNFPPPPVTENVEEASTEVKGVSDFIIGGGECLRSPVSVLE 
SPFSGESTAVKEEFVGVSTAEIVVKKEPSFNGSDFSAPLFSDDDVFGFSTSMSESFGGDL 
FGDNLFADMSFGSGFGFGSGSGFSSWHVEDHFQDIGDLFGSDPVLTV* 
>G1453 (39..917) 

CGTCGACGCGAAATAAATCCTAGAAAATAACTATCAATATGATGAAGGTTGATCAAGATT 
ATTCGTGTAGTATACCGCCTGGATTTAGGTTTCATCCGACAGATGAAGAACTTGTCGGAT 
ATTATCTCAAGAAGAAAATCGCCTCCCAGAGGATTGATCTCGACGTTATCAGAGAAATTG 
ATCTTTACAAGATCGAACCATGGGATCTACAAGAGAGATGTAGGATAGGGTACGAGGAGC 
AAACGGAGTGGTATTTCTTCAGCCATAGAGACAAGAAGTATCCGACTGGGACTAGGACAA 
ACCGAGCCACCGTGGCCGGTTTCTGGAAAGCAACGGGCCGGGACAAGGCGGTTTACCTCA 
ACTCCAAACTTATCGGTATGAGAAAAACGCTTGTCTTTTACCGAGGTCGAGCGCCTAATG 
GCCAAAAGTCCGATTGGATCATTCACGAATACTACAGCCTCGAGTCACACCAGAACTCTC 
CTCCACAGGAAGAAGGATGGGTAGTGTGTAGAGCATTTAAGAAACGAACGACCATCCCAA 
CAAAAAGGAGGCAACTTTGGGATCCGAACTGCTTATTCTACGACGACGCCACTCTCTTGG 
AACCTCTCGACAAGCGAGCCAGACATAATCCTGATTTTACCGCCACACCGTTCAAGCAAG 
AACTACTCTCCGAGGCCAGTCACGTCCAGGATGGAGATTTCGGATCTATGTACCTTCAAT 
GCATCGATGATGATCAATTCTCCCAGCTTCCTCAGCTCGAGAGCCCCTCTCTTCCGTCGG 
AAATAACTCCCCATAGTACTACTTTTTCTGAGAACAGTAGCCGGAAAGATGACATGAGCT 
CCGAGAAGAGGATCACTGACTGGAGATATCTAGATAAGTTCGTGGCGTCTCAATTTTTGA 
TGAGTGGAGAAGACTAAAAAAGGCTTTCCTATGCATGCATGCACTAGAAACGTCGTCGCA 
TTTTGGATTTACATGCGGCCGCT 

>G1453 Amino Acid Sequence (conserved domain in AA coordinates: 13-160) 

MMKVDQDYSCSIPPGFRFHPTDEELVGYYLKKKIASQRIDLDVIREIDLYKIEPWDLQER 

CRIGYEEQTEWYFFSHRDKKYPTGTRTNRATVAGFWKATGRDKAVYLNSKLIGMRKTLVF 

YRGRAPNGQKSDWIIHEYYSLESHQNSPPQEEGWVVCRAFKKRTTIPTKRRQLWDPNCLF 

YDDATLLEPLDKRARHNPDFTATPFKQELLSEASHVQDGDFGSMYLQCIDDDQFSQLPQL 

ESPSLPSEITPHSTTFSENSSRKDDMSSEKRITDWRYLDKFVASQFLMSGED* 

>G1560 (120..1340) 

ATCCTTTCAATTTCCACTCCTCTCTAATATAATTCACATTTTCCCACTATTGCTGATTCA 

TTTTTTTTTGTGAATTATTTCAAACCCACATAAAAAAATCTTTGTTTAAATTTAAAACCA 

TGGATCCTTCATTTAGGTTCATTAAAGAGGAGTTTCCTGCTGGATTCAGTGATTCTCCAT 

CACCACCATCTTCTTCTTCATACCTTTATTCATCTTCCATGGCTGAAGCAGCCATAAATG 

ATCCAACAACATTGAGCTATCCACAACCATTAGAAGGTCTCCATGAATCAGGGCCACCTC 

CATTTTTGACAAAGACATATGACTTGGTGGAAGATTCAAGAACCAATCATGTCGTGTCTT 

GGAGCAAATCCAATAACAGCTTCATTGTCTGGGATCCACAGGCCTTTTCTGTAACTCTCC 

TTCCCAGATTCTTCAAGCACAATAACTTCTCCAGTTTTGTCCGCCAGCTCAACACATATG 

GTTTCAGAAAGGTGAATCCGGATCGGTGGGAGTTTGCAAACGAAGGGTTTCTTAGAGGGC 

AAAAGCATCTCCTCAAGAACATAAGGAGAAGAAAAACAAGTAATAATAGTAATCA7VATGC 

AACAACCTCAAAGTTCTGAACAACAATCTCTAGACAATTTTTGCATAGAAGTGGGTAGGT 

ACGGTCTAGATGGAGAGATGGACAGCCTAAGGCGAGACAAGCAAGTGTTGATGATGGAGC 

TAGTGAGACTAAGACAGCAAC^^CAAAGCACCAAAATGTATCTCACATTGATTGAAGAGA 

AGCTCAAGAAGACCGAGTCAAAACAAAAACAAATGATGAGCTTCCTTGCCCGCGCAATGC 

AGAATCCAGATTTTATTCAGCAGCTAGTAGAGCAGAAGGAAAAGAGGAAAGAGATCGAAG 

AGGCGATCAGCAAGAAGAGACAAAGACCGATCGATCAAGGAAAAAGAAATGTGGAAGATT 

ATGGTGATGAAAGTGGTTATGGGAATGATGTTGCAGCCTCATCCTCAGCATTGATTGGTA 

TGAGTCAGGAATATACATATGGAAACATGTCTGAATTCGAGATGTCGGAGTTGGACAAAC 

TTGCTATGCACATTCAAGGACTTGGAGATAATTCCAGTGCTAGGGAAGAAGTCTTGAATG 

TGGAAAAAGGAAATGATGAGGAAGAAGTAGAAGATCAACAACAAGGGTACCATAAGGAGA 

ACAATGAGATTTATGGTGAAGGTTTTTGGGAAGATTTGTTAAATGAAGGTCAAAATTTTG 

ATTTTGAAGGAGATCAAGAAAATGTTGATGTGTTAATTCAGCAACTTGGTTATTTGGGTT 

CTAGTTCACACACTAATTAAGAAGAT^ATTGAAATGATGACTACTTTAAGCATTTGAATCA 

ACTTGTTTCCTATTAGTAATTTGGCTTTGTTTCAATCAAGTGAGTCGTGGACTAACTTGC 

>G1560 Amino Acid Sequence (domain in AA coordinates: 52-151) 

MDPSFRFIKEEFPAGFSDSPSPPSSSSYLYSSSMAEAAINDPTTLSYPQPLEGLHESGPP 

PFLTKTYDLVEDSRTNHVVSWSKSNNSFIVWDPQAFSVTLLPRFFKHNNFSSFVRQLNTY 

GFRKVNPDRWEFANEGFLRGQKHLLKNIRRRKTSNNSNQMQQPQSSEQQSLDNFCIEVGR 

YGLDGEMDSLRRDKQVLMMELVRLRQQQQSTKMYLTLIEEKLKKTESKQKQMMSFLARAM 

QNPDFIQQLVEQKEKRKEIEEAISKKRQRPIDQGKRNVEDYGDESGYGNDVAASSSALIG 

MSQEYTYGNMSEFEMSELDKLAMHIQGLGDNSSAREEVLNVEKGNDEEEVEDQQQGYHKE 

NNEIYGEGFWEDLLNEGQNFDFEGDQENVDVLIQQLGYLGSSSHTN* 

>G1594 (1..984) 

ATGGATGGAATGTACAATTTCCATTCGGCCGGTGATTATTCAGATAAGTCGGTTCTGATG 
ATGTCACCGGAGAGTCTCATGTTTCCTTCCGATTACCAAGCTTTGCTATGTTCCTCCGCC 
GGTGAAAATCGTGTCTCTGATGTTTTCGGATCCGACGAGCTACTCTCAGTAGCCGTCTCC 
GCTTTGTCGTCGGAGGCGGCTTCGATCGCTCCGGAGATCCGAAGAAATGATGATAACGTT 
TCTCTAACTGTCATCAAAGCTAAAATCGCTTGT 

GCTTACATCGATTGCCAAAAGGTCGGAGCACCACCGGAGATAGCGTGTTTACTAGAGGAG 

ATTCAACGGGAGAGTGATGTTTATAAGCAAGAGGTTGTTC 

GATCCTGAGCTTGATGAATTTATGGAAACGTACTGCGATAT^ 
GATCTAGCAAGACCGTTTGACGAGGCAACGTGTTTCTTGAACAAGATTGAGATGCAGCTA 
CGGAACCTATGTACTGGTGTCGAGTCTGCCAGGGGAGTTTCTGAGGATGGTGTAATATCA 
TCTGACGAGGAACTGAGTGGAGGTGATCATGAGGTAGCAGAGGATGGGAGACAAAGATGT 
GAAGACCGGGACCTCAAAGATAGGTTGCTACGCAAATTTGGAAGCCGTATTAGTACTTTA 
AAGCTTGAGTTCTCAAAGAAGAAGAAGAAAGGAAAGTTACCAAGAGAAGCAAGACAAGCT 
CTTCTTGATTGGTGGAATCTCCATTATAAGTGGCCTTACCCTACTGAAGGAGATAAGATA 
GCATTAGCTGATGCAACGGGGTTAGACCAAAAACAAATCAACAATTGGTTTATAAACCAA 
AGGAAACGTCATTGGAAGCCATCAGAGAATATGCCTTTCGCTATGATGGATGATTCTAGT 
GGATCATTCTTTACCGAGGAATGA 

>G1594 Amino Acid Sequence (conserved domain in AA coordinates : 343-3 08) 

MDGMYNFHSAGDYSDKSVLMMSPESLMFPSDYQALLCSSAGENRVSDVFGSDELLSVAVS 

ALSSEAASIAPEIRRNDDNVSLTVIKAKIACHPSYPRLLQAYIDCQKVGAPPEIACLLEE 

IQRESDVYKQEVVPSSCFGADPELDEFMETYCDILVKYKSDLARPFDEATCFLNKIEMQL 

RNLCTGVESARGVSEDGVISSDEELSGGDHEVAEDGRQRCEDRDLKDRLLRKFGSRISTL 

KLEFSKKKKKGKLPREARQALLDWWNLHYKWPYPTEGDKIALADATGLDQKQINNWFINQ 

RKRHWKPSENMPFAMMDDSSGSFFTEE* 

>G1750 (94..1101) 

CCCTTTTCCTCTCTTTCTCCAAATCTCTGAAAATTTTCACCAGAATCTCTGTTCTTTTTT 
TCACCAGAATCTCTCTGTTTAAAATAATAGGTGATGATGATGGATGAGTTTATGGATCTT 
AGACCAGTGAAGTACACAGAGCACAAGACTGTTATCAGAAAGTACACTAAAAAGTCGTCT 
ATGGAGAGGAAGACCAGTGTTCGTGACTCGGCCAGGTTGGTTCGGGTCTCAATGACGGAT 

AAGAGATTGATTAACGAGATCAGAGTCGAGCCTAGCAGCTCTTCCACCGGCGACGTCTCT 

GCTTCTCCGACGAAGGACCGGAAAAGAATCAACGTTGATTCTACGGTTCAAAAGCCCTCT 

GTTTCCGGCCAAAACCAGAAGAAGTACCGCGGCGTGAGACAGCGACCATGGGGAAAATGG 

GCGGCGGAGATTCGTGATCCTGAGCAACGCCGGAGAATCTGGCTCGGTACTTTTGCAACG 

GCGGAGGAAGCTGCCATCGTCTACGACAACGCAGCAATCAAACTTCGTGGCCCTGATGCT 

CTTACCAACTTCACCGTACAACCAGAACCAGAACCGGTACAAGAACAAGAACAAGAACCG 

GAGAGGAACATGTCGGTTTCGATATCAGAATCAATGGACGATTCTCAACATCTATCATCT 

CCGACATCGGTTCTCAACTACCAAACATATGTCTCGGAGGAACCAATCGATAGTCTTATC 

AAACCGGTTAAACAAGAGTTTCTTGAACCAGAACAAGAGCCAATAAGCTGGCATCTTGGA 

GAAGGTAATACTAATACTAATGATGATTCATTTCCATTGGACATTACATTTCTCGACAAC 

TATTTCAATGAATCATTACCAGACATCTCCATCTTCGATCAACCTATGTCTCCTATTCAA 

CCAACAGAGAATGATTTCTTCAACGACCTTATGTTATTCGATAGCAACGCAGAAGAATAC 

TACTCCTCCGAGATCAAAGAGATTGGTTCATCGTTCAACGATCTTGATGATTCTTTGATA 

TCCGATCTCTTACTTGTGTGATATTTTTGCCATTAACCAAACACCGGTTTGGTTGC 

>G1750 Amino Acid Sequence (domain in AA coordinates: 107-173) 

MMMDEFMDLRPVKYTEHKTVIRKYTKKSSMERKTSVRDSARLVRVSMTDRDATDSSSDEE 

EFLFPRRRVKRLINEIRVEPSSSSTGDVSASPTKDRKRINVDSTVQKPSVSGQNQKKYRG 

VRQRPWGKWAAEIRDPEQRRRIWLGTFATAEEAAIVYDNAAIKLRGPDALTNFTVQPEPE 

PVQEQEQEPESNMSVSISESMDDSQHLSSPTSVLNYQTYVSEEPIDSLIKPVKQEFLEPE 

QEPISWHLGEGNTNTNDDSFPLDITFLDNYFNESLPDISIFDQPMSPIQPTENDFFNDLM 

LFDSNAEEYYSSEIKEIGSSFNDLDDSLISDLLLV* 

>G1947 (70..918) 

GTTCACAAAATGGATTATAACCTTCCAATTCCATTAGAGGGTCTCAAAGAAACGCCACCA 

ACGGCTTTCTTGACGAAAACATACAACATAGTGGAGGATTCAAGCACAAACAACATAGTT 

TCATGGAGCAGAGAeAACAACAGCTTC^TTGTTTGGGAACCAGAGACTTTTGCCCT 

TGCCTCCCTAGATGCTTTAAGCACAATAATTTCTCCAGCTTTGTTAGACAGCTCAATACT 

TATGGGTTTAAGAAGATTGATACAGAGAGATGGGAATTTGCAAATGAGCATTTTCTGAAG 

GGAGAGAGGCATCTTCTTAAGAACATCAAGAGAAGAAAGACATCATCTCAAACGCAAACG 

CAGTCGCTAGAAGGAGAGATCCATGAGCTGCGAAGAGACAGAATGGCTTTAGAAGTAGAA 

CTGGTTAGACTGCGACGAAAACAAGAAAGCGTGAAGACATATCTGCATTTGATGGAAGAG 

AAACTGA^GTCACAGAAGTAAAGCAAGAAATGATGATGAATTTCTTGCTAAAGAAGATT 

AAGAAACCGAGTTTTTTACAGAGCTTAAGGAAACGTAATCTGCAAGGAATCAAGAATCGA 

GAGCAAAAGCAAGAGGTGATCTCAAGCCATGGTGTTGAGGAT^ 

GCTGAGCCAGAAGAGTATGGTGATGACATCGATGATCAATGTGGAGGTGTGTTTGATTAT 
GGTGATGAGCTTCACATAGCTTCAATGGAGCATCAAGGACAAGGGGAGGATGAAATTGAA 

ATGGATAGTGAAGGAATTTGGAAGGGTTTCGTGTTGAGTGAGGAGGAGATGTGTGATTTA 

GTGGAACATTTTATATAATAAACTAATGTATTATGAGAGGTTTTTTTTTGTTTTTTTGCT 

TTTTTTTTCCGAGTTTGTCATCAAGCATTGTATACAATTTGGGCCAAACTAAAAGCCCAA 

CAAAATATTTGGCCTTGGCATTTGTTAACAAATTGACTAATTCGGCCACACCTTCC 

>G1947 Amino Acid Sequence (domain in AA coordinates: 37-120) 

MDYNLPIPLEGLKETPPTAFLTKTYNIVEDSSTNNIVSWSRDNNSFIVWEPETFALICLP 

RCFKHNNFSSFVRQLNTYGFKKIDTERWEFANEHFLKGERHLLKNIKRRKTSSQTQTQSL 

EGEIHELRRDRMALEVELVRLRRKQESVKTYLHLMEEKLKVTEVKQEMMMNFLLKKIKKP 

SFLQSLRKRNLQGIKNREQKQEVISSHGVEDNGKFVKAEPEEYGDDIDDQCGGVFDYGDE 

LHIASMEHQGQGEDEIEMDSEGIWKGFVLSEEEMCDLVEHFI* 

>G2011 (309..1547) 

AATGTCGGTTGTACAATTATTTGTCACTAAAGTTTCCAAATTTCTTCTAAACTGATGAAT 

CAATGGAACATGATGACGAAAAAGATAAATCCACGGTGGCGGGAACTGACCCACCCATTT 

CCACCGCCTCTCTATTCCCCAGATTTTTTTCAATTATCTGACTACAGTTTGTCGGTTACT 

TCCTTCCCTAAACCTTTATAAACCATTAAACCTCTCATCCTTCTTCTCTTAAACCCCCTA 

ATTATCACACACACCCCAATTTCTCACTCTCTCTCTCACTAAAACCCGTAAATTTTCTAC 

TATATCAAATGAGCCCAAAAAAAGATGCTGTTTCTAAACCAACTCCAATTTCAGTACCCG 

TTTCGAGACGATCCGATATACCCGGGTCTCTCTACGTCGACACTGACATGGGTTTCTCTG 

GGTCACCACTTCCCATGCC^CTAGACATCTTACAAGGGAATCCAATTCCACCTTTTTTAT 

CCAAGACTTTTGATTTGGTTGATGACCCGACTCTTGACCCGGTCATCTCTTGGGGACTGA 

CCGGAGCTAGCTTCGTAGTTTGGGATCCTCTAGAGTTTGCCAGAATCATACTTCCAAGGA 

ATTTCAAACACAACAATTTCTCCAGCTTCGTCAGACAGCTTAACACTTATGGATTTCGAA 

AGATTGATACTGACAAGTGGGAATTCGCTAACGAGGCTTTCCTTAGAGGCAAGAAGCATC 

TTCTGAAGAACATTCATCGTCGTCGATCACCACAATCCAACCAAACTTGCTGCAGTAGCA 

CTAGCCAAAGCCAAGGGTCACCTACTGAGGTTGGAGGAGAGATTGAGAAGCTGAGGAAAG 

AGCGGCGTGCATTGATGGAGGAAATGGTTGAGCTTCAGCAGCAAAGCAGAGGCACAGCTC 

GACATGTGGACACTGTAAACCAGAGGCTGAAAGCTGCAGAGCAACGTCAGAAGCAATTGC 

TCTCTTTCTTGGCTAAGTTGTTTCAGAACCGGGGTTTCTTGGAACGCCTGAAGAACTTCA 

AAGGAAAAGAAAAAGGAGGAGCTCTTGGATTGGAAAAGGCGAGAAAGAAGTTCATCAAGC 

ACCACCAGCAGCCTCAAGATTCTCCAACAGGAGGGGAGGTGGTGAAGTATGAAGCTGATG 

ATTGGGAGAGATTGCTAATGTATGACGAAGAGACTGAGAACACCAAGGGTTTAGGAGGGA 

TGACTTCAAGCGATCCAAAAGGCAAGAACTTGATGTATCCATCAGAAGAAGAGATGAGCA 

AACCAGATTACTTGATGTCCTTCCCATCTCCTGAAGGACTTATTAAACAAGAAGAGACGA 

CATGGAGCATGGGTTTCGATACTACAATACCGAGTTTCAGCAACACCGATGCATGGGGAA 

ACACAATGGACTATAATGATGTCTCAGAGTTTGGTTTTGCTGCAGAAACAACAAGTGATG 

GTTTGCCTGATGTCTGCTGGGAACAATTTGCTGCAGGAATCACAGAGACTGGATTCAACT 

GGCCAACTGGTGATGATGATGATAATACGCCAATGAATGATCCTTAGGATCTTTTCATAT 

ATAGTTTAGACCAAAAACCCGTTTCTTATCGGGTGAACTATTAATTCATTATTCATTTTG 

AATGCACTCTTTATACATATATATAATATTGATGAGTTTGATTGTTCCAAAAAAAAA 

>G2011 Amino Acid Sequence (domain in AA coordinates: 56-147) 

MSPKKDAVSKPTPISVPVSRRSDIPGSLYVDTDMGFSGSPLPMPLDILQGNPIPPFLSKT 
FDLVDDPTLDPVISWGLTGASFVVWDPLEFARIILPRNFKHNNFSSFVRQLNTYGFRKID 

TDKWEFANEAFLRGKKHLLKNIHRRRSPQSNQTCCSSTSQSQGSPTEVGGEIEKLRKERR 
ALMEEMVELQQQSRGTARHVDTVNQRLKAAEQRQKQLLSFLAKLFQNRGFLERLKNFKGK 
EKGGALGLEKARKKFIKHHQQPQDSPTGGEVVKYEADDWERLLMYDEETENTKGLGGMTS 
SDPKGKNLMYPSEEEMSKPDYLMSFPSPEGLIKQEETTWSMGFDTTIPSFSNTDAWGNTM 

DYNDVSEFGFAAETTSDGLPDVCWEQFAAGITETGFNWPTGDDDDNTPMNDP* 
>G2094 (1..450) 

ATGCTAGATCCCACCGAGAAAGTAATCGATTCAGAATCAATGGAAAGCAAACTCACATCA 

GTAGATGCGATCGAAGAACACAGCAGCAGTAGCAGTAATGAAGCTATCAGCAACGAGAAG 

AAGAGTTGTGCCATTTGTGGTACCAGCAAAACCCCTCTTTGGCGAGGCGGTCCTGCCGGT 

CCCAAGTCGCTTTGTAACGCATGCGGGATCAGAAACAGAAAGAAAAGAAGAACACTGATC 

TCAAATAGATCAGAAGATAAGAAGAAGAAGAGTCATAACAGAAACCCGAAGTTTGGTGAC 

TCGTTGAAGCAGCGATTAATGGAATTGGGGAGAGAAGTGATGATGCAGCGATCAACGGCT 

GAGAATCAACGGCGGAATAAGCTTGGCGAAGAAGAGCAAGCCGCCGTGTTACTCATGGCT 
CTCTCTTATGCTTCTTCCGTTTATGCTTAA 
>G2094 Amino Acid Sequence (domain in AA coordinates: 43-68) 
MLDPTEKVIDSESMESKLTSVDAIEEHSSSSSNEAISNEKKSCAICGTSKTPLWRGGPAG 
PKSLCNACGIRNRKKRRTLISNRSEDKKKKSHNRNPKFGDSLKQRLMELGREVMMQRSTA 
ENQRRNKLGEEEQAAVLLMALSYASSVYA* 
>G2113 (90..590) 

ATAACAAACTCATCAAACTTCCTCAGCGTTTCTTTTTCTTACATAAACAATTTTTCTTAC 
ATAAACAAATCTTGTTGTTTGTTGTTGTCATGGCACCGACAGTTAAAACGGCGGCCGTCA 
AAACCAACGAAGGTAACGGAGTCCGTTACAGAGGAGTGAGGAAGAGACCATGGGGACGTT 
ACGCAGCCGAGATCAGAGATCCTTTCAAGAAGTCACGTGTCTGGCTCGGTACTTTCGACA 
CTCCTGAAGAAGCCGCTCGTGCCTACGACAAACGTGCTATTGAGTTTCGTGGAGCTAAAG 
CCAAAACCAACTTCCCTTGTTACAACATCAACGCCCACTGCTTGAGTTTGACACAGAGCC 
TGAGCCAGAGCAGCACCGTGGAATCATCGTTTCCTAATCTCAACCTCGGATCTGACTCTG 
TTAGTTCGAGATTCCCTTTTCCTAAGATTCAGGTTAAGGCTGGGATGATGGTGTTCGATG 
AAAGGAGTGAATCGGATTCTTCGTCGGTGGTGATGGATGTCGTTAGATATGAAGGACGAC 
GTGTGGTTTTGGACTTGGATCTTAATTTCCCTCCTCCACCTGAGAACTGATTAAGATTTA 
ATTATGATTATTAGATATAATTAAATGTTTCTGAATTGAG 

>G2113 Amino Acid Sequence (domain in AA coordinates: TBD) 

MAPTVKTAAVKTNEGNGVRYRGVRKRPWGRYAAEIRDPFKKSRVWLGTFDTPEEAARAYD 

KRAIEFRGAKAKTNFPCYNINAHCLSLTQSLSQSSTVESSFPNLNLGSDSVSSRFPFPKI 

QVKAGMMVFDERSESDSSSVVMDVVRYEGRRVVLDLDLNFPPPPEN* 

>G2115 (41.. 733) 

AATCACTCTACAAAGCCTGTACGTACACAACAACATTACCATGGTGAAACAAGAACGCAA 
GATCCAAACCAGCAGCACAAAAAAGGAAATGCCTTTGTCATCATCACCATCTTCTTCTTC 
TTCTTCATCTTCTTCCTCGTCTTCGTCTTCGTGTAAGAACAAGAACAAGAAGAGTAAGAT 
TAAGAAGTACAAAGGAGTGAGGATGAGAAGTTGGGGATCATGGGTCTCTGAGATTAGGGC 
ACCAAATCAAAAGACAAGGATTTGGTTAGGTTCTTACTCAACAGCTGAAGCAGCTGCTAG 
AGCTTACGATGTTGCACTCTTATGTCTCAAAGGCCCTCAAGCCAATCTCAACTTCCCTAC 
TTCTTCTTCTTCTCATCATCTTCTTGATAATCTCTTAGATGAAAATACCCTTTTGTCCCC 
CAAATCCATCCAAAGAGTAGCTGCTCAAGCTGCCAACTCATTTAACCATTTTGCCCCTAC 
TTCATCAGCCGTCTCGTCACCGTCCGATCATGATCATCACCATGATGATGGGATGCAATC 
TTTGATGGGATCTTTTGTGGACAATCATGTGTCTTTGATGGATTCAACATCTTCATGGTA 
TGATGATCATAATGGGATGTTCTTGTTTGATAATGGAGCTCCATTCAATTACTCTCCTCA 
ACTAAACTCGACGACGATGCTCGATGAATACTTCTACGAAGATGCTGACATTCCGCTTTG 
GAGTTTCAATTAATCCGACGGTCCATAATACATACTTTAATTAGT 

>G2115 Amino Acid Sequence (conserved domain in AA coordinates: 46-115) 

MVKQERKIQTSSTKKEMPLSSSPSSSSSSSSSSSSSSCKNKNKKSKIKKYKGVRMRSWGS 

WVSEIRAPNQKTRIWLGSYSTAEAAARAYDVALLCLKGPQANLNFPTSSSSHHLLDNLLD 

ENTLLSPKSIQRVAAQAANSFNHFAPTSSAVSSPSDHDHHHDDGMQSLMGSFVDNHVSLM 

DSTSSWYDDHNGMFLFDNGAPFNYSPQLNSTTMLDEYFYEDADIPLWSFN* 

>G2130 (41.. 988) 

CCTCTCTTCATTTTTTAACTCCCTCTCTCTCTCTCTCTCTATGGAGAGACGAACGAGACG 

AGTGAAGTTCACAGAGAATCGTACGGTCACAAACGTAGCAGCTACACCATCTAACGGGTC 

TCCGAGACTGGTCCGTATCACTGTTACTGATCCTTTCGCTACTGACTCGTCTAGCGACGA 

CGACGACAACAACAACGTCACGGTGGTTCCAAGAGTGAAACGATACGTGAAGGAGATTAG 

ATTCTGCCAAGGTGAATCTTCTTCCTCCACCGCGGCGAGGAAAGGTAAGCACAAGGAGGA 

GGAAAGCGTAGTGGTTGAAGATGACGTGTCGACGTCGGTGAAGCCTAAAAAGTACAGAGG 

CGTGAGACAGAGACCTTGGGGAAAATTCGCGGCGGAGATTAGAGATCCGTCGAGCCGTAC 

TCGGATTTGGCTTGGGACTTTTGTCACGGCGGAGGAAGCTGCTATAGCGTACGATAGAGC 

CGCGATTCATCTCAAAGGACCTAAAGCGCTCACGAATTTCCTAACTCCGCCGACGCCAAC 

GCCGGTTATCGATCTCCAAACGGTTTCCGCCTGCGATTACGGTAGAGATTCTCGGCAGAG 

CCTTCATTCACCGACCTCTGTTCTAAGATTCAACGTCAACGAGGAAACAGAGCATGAGAT 

TGAAGCGATCGAGCTATCTCCGGAGAGAAAGTCGACGGTTATAAAAGAAGAAGAAGAATC 

GTCGGCGGGTTTGGTGTTCCCGGATCCGTATCTGTTACCGGATTTATCTCTCGCCGGCGA 

ATGTTTTTGGGATACCGAAATTGCCCCTGACCTTTTGTTTCTCGATGAAGAAA 

CCAATCAACGTTGTTACCAAACACAGAGGTTTCGAAACAAGGAGAAAACGAAACT 

TTTCGAGTTTGGTTTGATTGATGATTTCGAGTCTTCTCCATGGGATGTGGATCATTTCTT 

CGACCATCATCATCACTC1TTCGATTAAAAATCTCTTCTTTTTTGGGGAAATTTTTGTG 
>G2130 Amino Acid Sequence (domain in AA coordinates: 93-160) 
MERRTRRVKFTENRTVTNVAATPSNGSPRLVRITVTDPFATDSSSDDDDNNNVTVVPRVK 
RYVKEIRFCQGESSSSTAARKGKHKEEESVVVEDDVSTSVKPKKYRGVRQRPWGKFAAEI 
RDPSSRTRIWLGTFVTAEEAAIAYDRAAIHLKGPKALTNFLTPPTPTPVIDLQTVSACDY 
GRDSRQSLHSPTSVLRFNVNEETEHEIEAIELSPERKSTVIKEEEESSAGLVFPDPYLLP 

DLSLAGECFWDTEIAPDLLFLDEETKIQSTLLPNTEVSKQGENETEDFEFGLIDDFESSP 
WDVDHFFDHHHHSFD* 

>G2147 (162.. 1262) 

CTGTGATTGTCAAGAGTTTGAACACACAAAGAAGAAAGAAGAACTCAACATTTCAAGCAA 
GAAGAAAGAGAGAAGAGAGAAGGTCCAATAATAGAGAGAACAAAAAAAAAGAGAGCTTAA 
TTGTCAGTTTATTCTCTGCAAACGTGCGGCCTAAGTAACACATGTCGAATTATGGAGTTA 
AAGAGCTCACATGGGAAAATGGGCAACTAACCGTTCATGGTCTAGGCGACGAAGTAGAAC 
CAACCACCTCGAATAACCCTATTTGGACTCAAAGTCTCAACGGTTGTGAGACTTTGGAGT 
CTGTGGTTCATCAAGCGGCTCTACAGCAGCCAAGCAAGTTTCAGCTGCAGAGTCCGAATG 
GTCCAAACCACAATTATGAGAGCAAGGATGGATCTTGTTCAAGAAAACGCGGTTATCCTC 
AAGAAATGGACCGATGGTTCGCTGTTCAAGAGGAGAGCCATAGAGTTGGCCACAGCGTCA 
CTGCAAGTGCGAGTGGTACCAATATGTCTTGGGCGTCTTTTGAATCCGGTCGGAGCTTGA 
AGACAGCTAGAACCGGAGACAGAGACTATTTCCGCTCTGGATCGGAAACTCAAGATACTG 
AAGGAGATGAACAAGAGACAAGAGGAGAAGCAGGTAGATCTAATGGACGACGGGGACGAG 
CAGCAGCGATTCACAACGAGTCCGAAAGGAGACGGCGTGATAGGATAAACCAGAGGATGA 
GAACACTTCAGAAGCTGCTTCCTACTGCAAGTAAGGCGGATAAAGTCTCAATCTTGGATG 
ATGTTATCGAACACTTGAAACAGCTACAAGCACAAGTACAGTTCATGAGCCTAAGAGCCA 
ACTTGCCACAACAAATGATGATTCCGCAACTACCTCCACCACAGTCAGTTCTCAGCATCC 
AACACCAACAACAACAACAACAACAGCAGCAGCAGCAGCAACAACAGCAGCAACAGTTTC 

AGATGTCGTTGCTTGCAACAATGGCAAGAATGGGAATGGGAGGTGGTGGAAATGGTTATG 

GAGGTTTAGTTCCTCCTCCTCCTCCTCCACCAATGATGGTCCCTCCTATGGGTAACAGAG 

ACTGCACCAACGGTTCTTCAGCCACATTATCTGATCCATACAGCGCCTTTTTCGCACAGA 

CAATGAATATGGATCTCTACAATAAAATGGCAGCAGCTATCTATAGACAACAGTCTGATC 

AAACAACAAAGGTAAATATCGGCATGCCTTCAAGTTCTTCGAATCATGAGAAAAGAGATT 

AGTCTAGCGACCTAGTATTATTGATCCATATATATAGTTCTTGAAAGATTGTTGTATCAT 

GATTGTAAAAACTGTTTTGAGTATGGAAAAAGACTTGCAGATAAAA 

>G2147 Amino Acid Sequence (domain in AA coordinates: 160-234) 

MSNYGVKELTWENGQLTVHGLGDEVEPTTSNNPIWTQSLNGCETLESVVHQAALQQPSKF 

QLQSPNGPNHNYESKDGSCSRKRGYPQEMDRWFAVQEESHRVGHSVTASASGTNMSWASF 
ESGRSLKTARTGDRDYFRSGSETQDTEGDEQETRGEAGRSNGRRGRAAAIHNESERRRRD 
RINQRMRTLQKLLPTASKADKVSILDDVIEHLKQLQAQVQFMSLRANLPQQMMIPQLPPP 
QSVLSIQHQQQQQQQQQQQQQQQQQFQMSLLATMARMGMGGGGNGYGGLVPPPPPPPMMV 

PPMGNRDCTNGSSATLSDPYSAFFAQTMNMDLYNKMAAAIYRQQSDQTTKVNIGMPSSSS 
NHEKRD* 

>G2156 (384.. 1292) 

TTTTTTTTCCCTTTCCTCGTTCAAAAAAAGTACTTGCAGAGTCACTCACTCTCAGTCTCA 
GCACATGAATTAATTTGAAGCTTCCCTAGAATTCTTTCAC^ 

CTCGGGTGAAGAATCTCTCCTCTCTTGCCCTAAAGCGAGTTAGGGTTTAACACACAAAGC 
ATACCCTTTAGATTTGTGTCTCTTAGCTCTGTTTTTGTCGGCTTGTGTAACCGATCAACT 
CAAGCTATTGGCTCCTCACCTCCTGAAATTTGACTTCTCCAATGGATCTCAAAGTTTCTC 
TTATATGAATTCTATCTTCACCCTCACAATATCTTTATATATATGAGCCACAAGAACAAG 
AAGAGTCAGTAGATGCGGCTGCCATGGACGGTGGTTACGATCAATCCGGAGGAGCTTCTA 
GATACTTTCACAACCTCTTCAGGCCTGAGCTTCATCACCAGCTTCAACCTCAGCCTCAAC 
TTCACCCTTTGCCTCAGCCTCAGCCTCAACCTCAGCCTCAGCAGCAGAATTCAGATGATG 
AATCTGACTCCAACAAGGATCCGGGTTCCGACCCAGTTACCTCTGGTTCAACCGGGAAAC 
GTCCACGTGGACGTCCTCCGGGATCCAAGAACAAGCCGAAGCCACCGGTGATAGTGACTA 
GAGATAGCCCCAACGTGCTTAGATCTCATGTTCTTGAAGTCTCATCTGGAGCCGACATAG 
TCGAGAGCGTTACCACTTACGCTCGCAGGAGAGGAAGAGGAGTCTCCATTCTCAGTGGTA 
ACGGCACGGTGGCTAACGTCAGTCTCCGGCAGCCGGCAACGACAGCGGCTCATGGGGCAA 
ATGGTGGAACCGGAGGTGTTGTGGCTCTACATGGAAGGTTTGAGATACTTTCCCTCACAG 
GTACGGTGTTGCCGCCCCCIXKXJCCGCCAGGATCCGGTC 

GCGTTCAAGGTCAGGTGATTGGAGGAAACGTGGTGGCTCCGCTTGTGGCTTCGGGTCCAG 
TGATACTAATGGCTGCATCGTTCTCTAATGCAACTTTCGAAAGGCTTCCCCTTGAAGATG 
AAGGAGGAGAAGGTGGAGAGGGAGGAGAAGTTGGAGAGGGAGGAGGAGGAGAAGGTGGTC 
CACCGCCGGCCACGTCATCATCACCACCATCTGGAGCCGGTCAAGGACAGTTAAGAGGTA 
ACATGAGTGGTTATGATCAGTTTGCCGGTGATCCTCATTTGCTTGGTTGGGGAGCCGCAG 
CCGCAGCCGCACCACCAAGACCAGCCTTTTAGAATTGAAAATTATGTCCGTAACATAGCT 
GTAACCAAATTTCATTTCTCAAAATTAAAAGAAAAAAAAAA 

>G2156 Amino Acid Sequence (domain in AA coordinates : 66-86) 

MDGGYDQSGGASRYFHNLFRPELHHQLQPQPQLHPLPQPQPQPQPQQQNSDDESDSNKDP 

GSDPVTSGSTGKRPRGRPPGSKNKPKPPVIVTRDSPNVLRSHVLEVSSGADIVESVTTYA 

RRRGRGVSILSGNGTVANVSLRQPATTAAHGANGGTGGVVALHGRFEILSLTGTVLPPPA 

PPGSGGLSIFLSGVQGQVIGGNVVAPLVASGPVILMAASFSNATFERLPLEDEGGEGGEG 

GEVGEGGGGEGGPPPATSSSPPSGAGQGQLRGNMSGYDQFAGDPHLLGWGAAAAAAPPRP 

AF* 

>G2294 (24.. 659) 

TCCTCCCTTAATTAGTATCAAAAATGGTGAAAACACTTCAAAAGACACCAAAGAGAATGT 
CATCTCCATCATCATCATCTTCATCATCCTCATCAACATCATCATCATCCATAAGGATGA 
AGAAGTACAAGGGAGTGAGAATGAGAAGTTGGGGTTCATGGGTTTCAGAGATCAGAGCTC 
CTAATCAAAAGACAAGGATCTGGCTTGGTTCTTACTCAACTGCTGAAGCCGCGGCTAGAG 
CCTACGACGCAGCACTCCTATGTCTTAAAGGATCCTCAGCTAATAATCTCAACTTCCCAG 
AGATCTCAACTTCTCTCTACCATATTATCAACAATGGTGATAACAACAATGACATGTCCC 
CTAAGTCTATACAAAGAGTAGCAGCTGCAGCTGCTGCTGCCAACACAGATCCTTCCTCAT 
CATCAGTCTCTACTTCATCTCCATTGCTTTCCTCTCCATCTGAAGATCTCTATGATGTTG 
TCTCCATGTCACAGTATGACCAACAAGTCTCCTTGTCTGAATCATCATCATGGTACAACT 
GCTTTGATGGTGATGATCAGTTCATGTTCATTAATGGAGTCTCCGCGCCGTATTTGACAA 
GATCACTTTCTGATGATTTCTTTGAGGAAGGAGATATCAGATTATGGAACTTCTGCTGAT 
TCTACTTTCATTATACCTTATTCTTTG 

>G2294 Amino Acid Sequence (conserved domain in AA coordinates: 32-102) 
MVKTLQKTPKRMSSPSSSSSSSSSTSSSSIRMKKYKGVRMRSWGSWVSEIRAPNQKTRIW 

LGSYSTAEAAARAYDAALLCLKGSSANNLNFPEISTSLYHIINNGDNNNDMSPKSIQRVA 
AAAAAANTDPSSSSVSTSSPLLSSPSEDLYDVVSMSQYDQQVSLSESSSWYNCFDGDDQF 
MFINGVSAPYLTRSLSDDFFEEGDIRLWNFC* 
>G2510 (16..594) 

ATAACAAACTCTTTAATGTCACCACAGAGAATGAAGCTATCATCACCACCAGTTACCAAC 
AACGAACCAACCGCCACCGCTTCTGCCGTTAAATCTTGCGGCGGAGGAGGTAAAGAAACC 
AGCTCATCGACCACGAGGCATCCAGTGTACCACGGAGTTCGCAAACGCCGATGGGGAAAA 
TGGGTTTCTGAGATCAGAGAGCCCCGGAAAAAGTCTCGGATTTGGCTCGGATCTTTTCCG 
GTGCCGGAGATGGCTGCTAAGGCCTACGACGTGGCAGCGTTTTGTCTAAAAGGTAGAAAA 
GCTCAGCTGAATTTCCCTGAAGAAATCGAGGATCTACCTCGACCGTCCACGTGTACTCCC 
AGAGATATCCAAGTCGCAGCGGCCAAAGCAGCCAACGCCGTGAAGATCATCAAAATGGGA 
GATGATGACGTGGCAGGAATAGACGACGGAGATGATTTCTGGGAAGGCATTGAGCTGCCT 
GAGCTTATGATGAGTGGAGGTGGGTGGTCGCCGGAGCCTTTTGTTGCCGGAGATGATGCC 
ACGTGGCTTGTCGACGGAGACTTGTATCAGTATCAGTTCATGGCGTGTCTGTGAGTGTTG 
CTGTCGATTGTGTCGTATTCGTTATACGTGTACGTTGTATCGTTATTGTGTTGGCTCACT 
TAATTTAATGCATATGCATGTATATTTTCATTTATTTGTTTCTAGTTTATTGTTTTACGC 
GATTAATAATTAGATACCTGTTTCTCAAGTTAGTTATCAGGTTTGTACGCATCTACAAAA 
ATACGTATAAGTGTATGTTCTTATATACAGTTTTTGTTTGCATAAGTATTGCTACTTATT 
CTAAAAAAAAAAAAAAAAAAAAAAAAA 

>G2510 Amino Acid Sequence (conserved domain in AA coordinates: 41-108) 
MSPQRMKLSSPPVTNNEPTATASAVKSCGGGGKETSSSTTRHPVYHGVRKRRWGKWVSEI 

REPRKKSRIWLGSFPYPEMAAKAYDVAAFCLKGRKAQLNFPEEIEDLPRPSTCTPRDIQV 
AAAKAANAVKIIKMGDDDVAGIDDGDDFWEGIELPELMMSGGGWSPEPFVAGDDATWLVD 
GDLYQYQFMACL* 
>G2893 (130.. 981) 

AAATCATAAAAGCCTCTCTCTTAGTCTATTTTTATCTCACGGCTCTCTCTCCCCTCTCTA 
CACACACAAACACAAATAAAGCGTAAAACTGAAATATTTTAATTACAATTAGAAA 
CATATTAATATGTCAAATATAACAAAGAAGAAGTGTAATGGAAATGAAGAGGGTGCAGAG 
CAGAGGAAAGGGCCTTGGACACTCGAGGAAGACACTCTTCTCACCAATTACATTTCCCAT 
AACGGTGAAGGCCGATGGAATCTGCTCGCTAAATCTTCTGGGCTAAAGAGAGCAGGAAAA 

AGTTGTAGATTGAGATGGTTGAATTACCTTAAACCCGACATAAAGCGTGGGAATCTCACT 

CCTCAAGAACAACTTTTAATCCTTGAGCTCCATTCTAAATGGGGTAATAGGTGGTCAAAA 

ATTTCGAAGTATTTACCAGGAAGAACAGACAACGATATCAAAAACTACTGGAGAACTAGA 

GTCCAGAAACAAGCACGCCAGCTCAACATAGATTCCAATAGCCACAAGTTCATAGAAGTT 

GTTCGTAGCTTTTGGTTTCCAAGACTGATCAACGAGATTAAAGACAACTCATACACCAAC 

AATATTAAAGCTAATGCTCCTGATTTACTTGGACCAATTTTACGAGACAGCAAAGATTTG 

GGTTTCAACAACATGGATTGTTCCACTTCCATGTCAGAAGATCTCAAGAAAACTTCACAA 

TTCATGGATTTTTCTGATCTTGAAACCACAATGTCCTTGGAAGGATCACGAGGGGGTAGT 

AGTCAATGTGTGAGTGAGGTTTATAGCTCCTTCCCTTGCCTAGAGGAGGAGTACATGGTG 

GCCGTTATGGGCAGTTCAGACATTTCAGCATTGCATGATTGTCACGTGGCTGATTCCAAG 

TACGAGGATGATGTGACACAAGATCTAATGTGGAACATGGATGACATTTGGCAGTTTAAC 

GAGTATGCACACTTTAATTAGGTTATATTATATTTATGTACTTCTTACAACTTGGAGGGG 

TTTATCGGTCTTTTATTAAATTTTGATTGTTTTGGATTCCTTAAAAATGTGTTCTTATTA 

TAGTTTTTAATGAAAAAAATGTTTAAGCGCAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 

>G2893 Amino Acid Sequence (conserved domain in AA coordinates : 19-120) 

MSNITKKKCNGNEEGAEQRKGPWTLEEDTLLTNYISHNGEGRWNLLAKSSGLKRAGKSCR 

LRWLNYLKPDIKRGNLTPQEQLLILELHSKWGNRWSKISKYLPGRTDNDIKNYWRTRVQK 
QARQLNIDSNSHKFIEVVRSFWFPRLINEIKDNSYTNNIKANAPDLLGPILRDSKDLGFN 

NMDCSTSMSEDLKKTSQFMDFSDLETTMSLEGSRGGSSQCVSEVYSSFPCLEEEYKVAVM 

GSSDISALHDCHVADSKYEDDVTQDLMWNMDDIWQFNEYAHFN* 

>G340 (97.. 834) 

ATGAAATCTCTGTAGTTTTTTTTTGTTCCTTTCTTAAATTTCGAAAGAAAGACATTTATT 

AAACCAAAATAACTCTTTAGATCATTGCAAGGAAAAATGTTGAAAAGTGCAAGTCCAATG 

GCATTCTACGATATCGGAGAGCAGCAATACTCTACTTTCGGGTACATTTTAAGCAAACCT 

GGGAACGCAGGAGCTTACGAGATTGACCCTTCGATCCCAAACATCGACGATGCGATCTAC 

GGCTCAGATGAGTTCCGTATGTACGCTTACAAAATCAAACGGTGTCCTCGTACTCGTAGC 

CACGACTGGACGGAGTGTCCCTACGCTCACCGTGGCGAGAAAGCCACACGCCGTGATCCT 

CGCCGTTACACTTACTGTGCAGTCGCATGCCCGGCTTTCCGAAATGGCGCATGCCACCGT 

GGCGACTCATGCGAATTCGCACATGGCGTATTCGAGTACTGGCTCCACCCGGCGCGTTAC 

CGAACACGCGCATGTAACGCCGGGAACTTGTGTCAGAGGAAAGTGTGTTTCTTTGCCCAC 

GCGCCGGAGCAGCTAAGGCAGTCTGAAGGAAAGCACAGGTGCAGGTACGCATATAGGCCG 

GTGAGGGCTAGAGGTGGTGGAAACGGCGATGGAGTGACGATGAGAATGGACGACGAGGGT 

TACGACACGTCACGGTCTCCGGTGAGAAGCGGGAAAGATGATTTAGATAGTAACGAGGAG 

AAGGTGTTGTTGAAGTGTTGGAGTCGGATGAGCATTGTGGATGATCATTATGAGCCGTCC 

GATTTGGATTTGGATTTGTCACACTTTGATTGGATCTCAGAGTTGGTCGATTAAATTTGG 

GAAATCAAAGCAGAGAACAAAAGAAACCCGATAAATAAAGTGGATTTTGTTAAAATCCAC 

AAGATCAAGATTCAAGATGAGAGATCTTGTCATGTATATGGTAAATTTAATTGTAATGAT 

TTATTGCAATGTCGCAAAAGAAGTTACTTCTCT 

TATAAGTCTTTGTATTAA 

>G340 Amino Acid Sequence (domain in AA coordinates: 37-154) 

MLKSASPMAFYDIGEQQYSTFGYILSKPGNAGAYEIDPSIPNIDDAIYGSDEFRMYAYKI 

KRCPRTRSHDWTECPYAHRGEKATRRDPRRYTYCAVACPAFRNGACHRGDSCEFAHGVFE 

YWLHPARYRTRACNAGNLCQRKVCFFAHAPEQLRQSEGKHRCRYAYRPVRARGGGNGDGV 

TMRMDDEGYDTSRSPVRSGKDDLDSNEEKVLLKCWSRMSIVDDHYEPSDLDLDLSHFDWI 

SELVD* 

>G39 (75.. 638) 

GTTTCCACAGTCCCTOTACTTGTGCATAAAACTGTAAAAC^ 
TCTGTTAGGATATAATGCCACCCTCTCCTCCT^ 

AAGGAGCTCATGAAGATCGCAAATTTAAATGCTATAGGGGTGTCCGAAAGAGGTCTTGGG 

GCAAATGGGTGTCTGAAATCAGAGTTCCAAAGACTGGACGACGAATATGGCTAGGTTCAT 

ACGATGCTCCAGAGAAGGCAGCTAGAGCOTATGATGCTGCTTTGTTCTGTATTAGGGGTG 

AGAAGGGAGTTTACAATTTTCCCACTGATAAAAAGCCGCAGCTTCCAGAAGGTTCTGTCC 

GGCCTCTGTCCAAGCTCGACATACAGACAATAGCAACAAACTATGCTTCATCAGTTGTGC 

ATGTACCTTCCCATGCCACCACACTCCCGGCAACAACCCAGGTTCCCTCTGAAGTTCCTG 

CTTCCTCTGATGTTTCTGCTTCTACTGAGATTACAGAGATGGTCGATGAATATTAT 

CAACCGATGCAACTGCAGAATCAATATTCTCAGTTGAAGACTTACAACTGGACAGTTTCC 
TCATGATGGACATTGATTGGATAAACAATCTAATCTGATGTGTAACGTCACTTGCAGTGA 
CATTTAATATGGTTTANCTATCAGTTACCTGTCTGCTTCTTGTAAGGGTATACTTGGATC 
CTTGTCTTTGAACTTGTTTTATTTAGCATGCAAA 

>G39 Amino Acid Sequence (domain in AA coordinates: 24-90) 
MPPSPPKSPFISSSLKGAHEDRKFKCYRGVRKRSWGKWVSEIRVPKTGRRIWLGSYDAPE 
KAARAYDAALFCIRGEKGVYNFPTDKKPQLPEGSVRPLSKLDIQTIATNYASSVVHVPSH 
ATTLPATTQVPSEVPASSDVSASTEITEMVDEYYLPTDATAESIFSVEDLQLDSFLMMDI 

DWINNLI* 

>G439 (128.. 967) 

TATAAATCTTCGTTTCTACTTTTTTTTCTTCCATAATATAGTCAATTCGTTTTCTTAATT 
AGGGCTTCTTCTCTTTGTTTCTCCAATCTTTATTAGTTTATTTATTTATTTTGGTTATTG 
TATACAAATGGCAATGGCTTTAAACATGAATGCTTACGTAGACGAGTTCATGGAAGCTCT 
TGAACCATTCATGAAGGTAACTTCATCTTCTTCTACTTCGAATTCATCAAATCCAAAACC 
ATTAACTCCTAATTTCATCCCTAATAATGACCAAGTCTTACCGGTATCTAACCAAACCGG 
TCCGATTGGGCTAAACCAGCTCACTCCAACACAAATCCTCCAAATTCAGACAGAGTTACA 
TCTCCGGCAAAACCAATCTCGTCGTCGCGCTGGTAGTCATCTTCTCACCGCTAAACCAAC 
CTCAATGAAGAAAATCGACGTAGCAACTAAACCGGTTAAACTATACCGAGGCGTAAGACA 
GAGGCAATGGGGTAAATGGGTAGCTGAGATTCGGCTACCTAAAAACCGAACCCGGTTATG 
GCTCGGTACGTTCGAAACGGCTCAAGAAGCTGCATTAGCTTACGATCAAGCAGCTCATAA 
GATCAGAGGAGACAACGCTCGTCTCAATTTCCCAGACATTGTTCGTCAAGGACACTATAA 
ACAGATATTGTCTCCGTCTATCAACGCAAAGATCGAATCCATCTGCAATAGTTCTGATCT 
TCCACTGCCTCAGATCGAGAAACAGAACAAAACAGAGGAGGTGCTCTCTGGTTTTTCCAA 
ACCGGAGAAAGAACCGGAATTTGGGGAGATATACGGATGCGGATACTCGGGCTCATCTCC 
TGAGTCGGATATAACGTTGTTGGATTTCTCAAGCGACTGTGTGAAAGAAGATGAGAGTTT 
CTTGATGGGTTTGCACAAGTATCCTTCTTTGGAGATTGATTGGGACGCTATAGAGAAACT 
CTTCTGAATCCATTTTATCTTTTTGATTCATTTGTCTCTAAATTGTAGAATTTTATTTTC 
AGAGCTTTGTAAGGGAAGTTCTTGAATGAGAGTTGCAGAGGACTAGTGGAACCTAACTCT 
GTTTTCTTTTGTAAGTATTGTTTATAATGGGCCGTTGAATGGGCCTTATTGATTTAAACA 
GCCCAAGTTTTTAAAAAAAAAAAAAAAAAAAAAAAAA 

>G439 Amino Acid Sequence (domain in AA coordinates: 110-177) 

MAMALNMNAYVDEFMEALEPFMKVTSSSSTSNSSNPKPLTPNFIPNNDQVLPVSNQTGPI 

GLNQLTPTQILQIQTELHLRQNQSRRRAGSHLLTAKPTSMKKIDVATKPVKLYRGVRQRQ 

WGKWVAEIRLPKNRTRLWLGTFETAQEAALAYDQAAHKIRGDNARLNFPDIVRQGHYKQI 
LSPSINAKIESICNSSDLPLPQIEKQNKTEEVLSGFSKPEKEPEFGEIYGCGYSGSSPES 
DITLLDFSSDCVKEDESFLMGLHKYPSLEIDWDAIEKLF* 
>G470 (1..2580) 

ATGGCGAGTTCGGAGGTTTCAATGAAAGGTAATCGTGGAGGAGATAACTTCTCCTCCTCT 
GGTTTTAGTGACCCTAAGGAGACTAGAAATGTCTCCGTCGCCGGCGAGGGGCAAAAAAGT 
AATTCTACCCGATCCGCTGCGGCTGAGCGTGCTTTGGACCCTGAGGCTGCTCTTTACAGA 
GAGCTATGGCACGCTTGTGCTGGTCCGCTTGTGACGGTTCCTAGACAAGACGACCGAGTC 
TTCTATTTTCCTCAAGGACACATCGAGCAGGTGGAGGCTTCGACGAACCAGGCGGCAGAA 
CAACAGATGCCTCTCTATGATCTTCCGTCAAAGCTTCTCTGTCGAGTTATTAATGTAGAT 
TTAAAGGCAGAGGCAGATACAGATGAAGTTTATGCGCAGATTACTCTTCTTCCTGAGGCT 
AATCAAGACGAGAATGCAATTGAGAAAGAAGCGCCTCTTCCTCCACCTCCGAGGTTCCAG 
GTGCATTCGTTCTGCAAAACCTTGACTGCATCCGACACAAGTACACATGGTGGATTTTCT 
GTTCTTAGGCGACATGCGGATGAATGTCTCCCACCTCTGGATATGTCTCGACAGCCTCCC 
ACTCAAGAGTTAGTTGCAAAGGATTTGC^TGCAAATGAGT^ 

CGGGGTCAACCACG8AGGCATTTGCTACAGAGTGGGTGGAGTGTGTTTGTTAGCTCCAAA 

AGGCTAGTTGCAGGCGATGCGTTTATATTTCTAAGGGGCGAGAATGGAGAATTAAGAGTT 

GGTGTAAGGCGTGCGATGCGACAACAAGGAAACGTGCCGTCTTCTGTTATATCTAGCCAT 

AGCATGCATCTTGGAGTACTGGCCACCGCATGGCATGCCATTTCAACAGGGACTATGTTT 

ACAGTCTACTACAAACCCAGGACGAGCCCATCTGAGTTTATTGTTCCGTTCGATCAGTAT 

ATGGAGTCTGTTAAGAATAACTACTCTATTGGCATGAGATTCAAAATGAGATTTG 

GAAGAGGCTCCTGAGCAGAGGTTTACTGGCACAATCGTTGGGATTGAAGAGTCTGATCCT 

ACTAGGTGGCCAAAATCAAAGTGGAGATCCCTCAAGGTGAGATGGGATGAGACTTCTAGT 

ATTCCTCGACCTGATAGAGTATCTCCGTGGAAAGTAGAGCCAGCTCTTGCTCCTCCTGCT 

TTGAGTCCTGTTCCAATGCCTAGGCCTAAGAGGCCCAGATCAAATATAGCACCTTCATCT 
CCTGACTCTTCGATGCTTACCAGAGAAGGTACAACTAAGGCAAACATGGACCCTTTACCA 
GCAAGCGGACTTTCAAGGGTCTTGCAAGGTCAAGAATACTCGACCTTGAGGACGAAACAT 
ACTGAGAGTGTAGAGTGTGATGCTCCTGAGAATTCTGTTGTCTGGCAATCTTCAGCGGAT 
GATGATAAGGTTGACGTGGTTTCGGGTTCTAGAAGATATGGATCTGAGAACTGGATGTCC 
TCAGCCAGGCATGAACCTACTTACACAGATTTGCTCTCCGGCTTTGGGACTAACATAGAT 
CCATCCCATGGTCAGCGGATACCTTTTTATGACCATTCATCATCACCTTCTATGCCTGCA 
AAGAGAATCTTGAGTGATTCAGAAGGCAAGTTCGATTATCTTGCTAACCAGTGGCAGATG 
ATACACTCTGGTCTCTCCCTGAAGTTACATGAATCTCCTAAGGTACCTGCAGCAACTGAT 
GCGTCTCTCCAAGGGCGATGCAATGTTAAATACAGCGAATATCCTGTTCTTAATGGTCTA 
TCGACTGAGAATGCTGGTGGTAACTGGCCAATACGTCCACGTGCTTTGAATTATTATGAG 
GAAGTGGTCAATGCTCAAGCGCAAGCTCAGGCTAGGGAGCAAGTAACAAAACAACCCTTC 
ACGATACAAGAGGAGACAGCAAAGTCAAGAGAAGGGAACTGCAGGCTCTTTGGCATTCCT 
CTGACCAACAACATGAATGGGACAGACTCAACCATGTCTCAGAGAAACAACTTGAATGAT 
GCTGCGGGGCTTACACAGATAGCATCACCAAAGGTTCAGGACCTTTCAGATCAGTCAAAA 
GGGTCAAAATCAACAAACGATCATCGTGAACAGGGAAGACCATTCCAGACTAATAATCCT 
CATCCGAAGGATGCTCAAACGAAAACCAACTCAAGTAGGAGTTGCACAAAGGTTCACAAG 
CAGGGAATTGCACTTGGCCGTTCAGTGGATCTTTCAAAGTTCCAAAACTATGAGGAGTTA 
GTCGCTGAGCTGGACAGGCTGTTTGAGTTCAATGGAGAGTTGATGGCTCCTAAGAAAGAT 
TGGTTGATAGTTTACACAGATGAAGAGAATGATATGATGCTTGTTGGTGACGATCCTTGG 
CAGGAGTTTTGTTGCATGGTTCGCAAAATCTTCATATACACGAAAGAGGAAGTGAGGAAG 
ATGAACCCGGGGACTTTAAGCTGTAGGAGCGAGGAAGAAGCAGTTGTTGGGGAAGGATCA 
GATGCAAAGGACGCCAAGTCTGCATCAAATCCTTCATTGTCCAGCGCTGGGAACTCTTAA 
>G470 Amino Acid Sequence (domain in AA coordinates: 61-393) 

MASSEVSMKGNRGGDNFSSSGFSDPKETRNVSVAGEGQKSNSTRSAAAERALDPEAALYR 
ELWHACAGPLVTVPRQDDRVFYFPQGHIEQVEASTNQAAEQQMPLYDLPSKLLCRVINVD 
LKAEADTDEVYAQITLLPEANQDENAIEKEAPLPPPPRFQVHSFCKTLTASDTSTHGGFS 
VLRRHADECLPPLDMSRQPPTQELVAKDLHANEWRFRHIFRGQPRRHLLQSGWSVFVSSK 
RLVAGDAFIFLRGENGELRVGVRRAMRQQGNVPSSVISSHSMHLGVLATAWHAISTGTMF 

TVYYKPRTSPSEFIVPFDQYMESVKNNYSIGMRFKMRFEGEEAPEQRFTGTIVGIEESDP 
TRWPKSKWRSLKVRWDETSSIPRPDRVSPWKVEPALAPPALSPVPMPRPKRPRSNIAPSS 
PDSSMLTREGTTKANMDPLPASGLSRVLQGQEYSTLRTKHTESVECDAPENSVVWQSSAD 
DDKVDVVSGSRRYGSENWMSSARHEPTYTDLLSGFGTNIDPSHGQRIPFYDHSSSPSMPA 
KRILSDSEGKFDYLANQWQMIHSGLSLKLHESPKVPAATDASLQGRCNVKYSEYPVLNGL 

STENAGGNWPIRPRALNYYEEVVNAQAQAQAREQVTKQPFTIQEETAKSREGNCRLFGIP 
LTNNMNGTDSTMSQRNNLNDAAGLTQIASPKVQDLSDQSKGSKSTNDHREQGRPFQTNNP 

HPKDAQTKTNSSRSCTKVHKQGIALGRSVDLSKFQNYEELVAELDRLFEFNGELMAPKKD 

WLIVYTDEENDMMLVGDDPWQEFCCMVRKIFIYTKEEVRKMNPGTLSCRSEEEAVVGEGS 
DAKDAKSASNPSLSSAGNS* 

>G652 (1..606) 

atgagcggaggaggagacgtgaacatgagtggtggagacagacgcaagggaacggtgaag 

tggtttgatacacagaaggggtttggtttcatcacacctagcgacggtggtgacgatctc 

ttcgttcaccagtcttccatcagatctgaaggatttcgtagcctcgcagctgaggaatct 

gttgagttcgacgttgaggttgacaactccggccgtcccaaggctattgaagtgtctgga 

cccgacggtgctcccgttcagggtaacagcggtggtggtggttcatctggtggacgcggt 

ggttttggcggcggtggtggaagaggagggggacgtggtggaggaagctacggaggaggt 

tatggtggaagaggaagcggtggccgtggaggaggtggtggtgataattcttgctttaag 

tgcggtgaaccaggtcacatggcgagagaatgctctcaaggtggtggaggatacagcgga 

ggcgggggtggtggaaggtacgggtctggcggcggcggaggaggaggtggtggtggctta 

agctgctacagctgtggagagtctgggcactttgcaagggattgcactagcggtggtgct 
cgttga 

>G652 Amino Acid Sequence (domain in AA coordinates: 28-49, 137-151, 182-196) 

MSGGGDVNMSGGDRRKGTVKWFDTQKGFGFITPSDGGDDLFVHQSSIRSEGFRSLAAEES 

VEFDVEVDNSGRPKAIEVSGPDGAPVQGNSGGGGSSGGRGGFGGGGGRGGGRGGGSYGGG 

YGGRGSGGRGGGGGDNSCFKCGEPGHMARECSQGGGGYSGGGGGGRYGSGGGGGGGGGGL 

SCYSCGESGHFARDCTSGGAR* 

>G671 (61.. 1119) 

TTCACTTGAGAACAACCCCCTTTGAACTCGATCAAGAAAGCTAAGTTTGAAGAATCAAGA 
ATGGTGCGGACACCGTGTTGCAAAGCCGAACTAGGGTTAAAGAAAGGAGCTTGGACTCCC 
GAGGAAGATCAGAAGCTTCTCTCTTACCTTAACCGCCACGGTGAAGGTGGATGGCGAACT 
CTCCCCGAAAAAGCTGGACTCAAGAGATGCGGCAAAAGCTGCAGACTGAGATGGGCCAAT 
TATCTTAGACCTGACATCAAAAGAGGAGAGTTCACTGAAGACGAAGAACGTTCAATCATC 
TCTCTTCACGCCCTTCACGGCAACAAATGGTCTGCTATAGCTCGTGGACTACCAGGAAGA 
ACCGATAACGAGATCAAGAACTACTGGAACACTCATATCAAAAAACGTTTGATCAAGAAA 
GGTATTGATCCAGTTACACACAAGGGCATAACCTCCGGTACCGACAAATCAGAAAACCTC 
CCGGAGAAACAAAATGTTAATCTGACAACTAGTGACCATGATCTTGATAATGACAAGGCG 
AAGAAGAACAACAAGAATTTTGGATTATCATCGGCTAGTTTCTTGAACAAAGTAGCTAAT 
AGGTTCGGAAAGAGAATCAATCAGAGTGTTCTGTCTGAGATTATCGGAAGTGGAGGCCCA 
CTTGCTTCTACTAGTCACACTACTAATACTACAACTACAAGTGTTTCCGTTGACTCTGAA 
TCAGTTAAGTCAACGAGTTCTTCCTTCGCACCAACCTCGAATCTTCTCTGCCATGGGACC 
GTTGCAACAACACCAGTTTCATCGAACTTTGACGTTGATGGTAACGTTAATCTGACGTGT 
TCTTCGTCCACGTTCTCTGATTCCTCCGTTAACAATCCTCTAATGTACTGCGATAATTTC 
GTTGGTAATAACAACGTTGATGATGAGGATACTATCGGGTTCTCCACATTTCTGAATGAT 
GAAGATTTCATGATGTTGGAGGAGTCTTGTGTTGAAAACACTGCGTTCATGAAAGAACTT 
ACGAGGTTTCTTCACGAGGATGAAAACGACGTCGTTGATGTGACGCCGGTCTATGAACGT 
CAAGACTTGTTTGACGAAATTGATAACTATTTTGGATGAGTGAAACTCATAATCGATGAA 
TCCCACGTGACCATGTCAATATGATGTCTATGGATATGTTACCTTGATGATGTTGATGGT 
AATAATAATAAATAATAGATGGTGATGATGACCATGCATGAATCATGAATGTAGTTCGTG 



TGTAAATGGATTATAAATGGTGATGTAATAATTATAATGTTAAAAAAAAAAAAAAAAAAA 
AAAA 

>G671 Amino Acid Sequence (domain in AA coordinates: 15-115) 
MVRTPCCKAELGLKKGAWTPEEDQKLLSYLNRHGEGGWRTLPEKAGLKRCGKSCRLRWAN 
YLRPDIKRGEFTEDEERSIISLHALHGNKWSAIARGLPGRTDNEIKNYWNTHIKKRLIKK 
GIDPVTHKGITSGTDKSENLPEKQNVNLTTSDHDLDNDKAKKNNKNFGLSSASFLNKVAN 

RFGKRINQSVLSEIIGSGGPLASTSHTTNTTTTSVSVDSESVKSTSSSFAPTSNLLCHGT 
VATTPVSSNFDVDGNVNLTCSSSTFSDSSVNNPLMYCDNFVGNNNVDDEDTIGFSTFLND 
EDFMMLEESCVENTAFMKELTRFLHEDENDVVDVTPVYERQDLFDEIDNYFG* 
>G779 (110.. 712) 

GACATGCATGTAAGCATTCGGTTAATTAATCGAGTCAAAGATATATATCAGTAAATACAT 

ATGTGTATATTTCTGGAAAAAGAATATATATATTGAGAAATAAGAAAAGATGAAAATGGA 

AAATGGTATGTATAAAAAGAAAGGAGTGTGCGACTCTTGTGTCTCGTCCAAAAGCAGATC 

CAACCAC^GCCCOWU^GAAGC^TGATGGAGCCTC^GCCTCACCATCTCCTC^TGGATTG 

GAACAAAGCTAATGATCTTCTCACACAAGAACACGCAGCTTTTCTCAATGATCCTCACCA 

TCTCATGTTAGATCCACCTCCCGAAACCCTAATTCACTTGGACGAAGACGAAGAGTACGA 

TGAAGACATGGATGCGATGAAGGAGATGCAGTACATGATCGCCGTCATGCAGCCCGTAGA 

CATCGACCCTGCCACGGTCCCTAAGCCGAACCGCCGTAACGTAAGGATAAGCGACGATCC 

TCAGACGGTGGTTGCTCGTCGGCGTCGGGAAAGGATCAGCGAGAAGATCCGAATTCTCAA 

GAGGATCGTGCCTGGTGGTGCGAAGATGGACACAGCTTCCATGCTCGACGAAGCCATACG 

TTACACCAAGTTCTTGAAACGGCAGGTGAGGATTCTTCAGCCTCACTCTCAGATTGGAGC 

TCCTATGGCTAACCCCTCTTACCTTTGTTATTACCACAACTCCCAACCCTGATGAACTAC 

ACAGAAGCTCGCTAGCTAGACATTTGGTGTCATCCTCTCAACCTTT 

>G779 Amino Acid Sequence (domain in AA coordinates: 126-182) 

MKMENGMYKKKGVCDSCVSSKSRSNHSPKRSMMEPQPHHLLMDWNKANDLLTQEHAAFLN 

DPHHLMLDPPPETLIHLDEDEEYDEDMDAMKEMQYMIAVMQPVDIDPATVPKPNRRNVRI 

SDDPQTVVARRRRERISEKIRILKRIVPGGAKMDTASMLDEAIRYTKFLKRQVRILQPHS 

QIGAPMANPSYLCYYHNSQP* 

>G962 (148.. 1392) 

CGTCGACTCTCTACTCAACACC^CTCAATTTCATCTCTCTTTTTCCCTTCCATTGTTAGT 

ATAAAAACCAAGCAAACCCTTAATCACTTTTCATGATCATATATCACCT^ 

CATACACATATCTAGTCTTTTTGATATATGGCAATTGTATCCTCCACAACAAGCATCATT 

CCCATGAGTAACCAAGTCAACAATAACGAAAAAGGTATAGAAGACAATGATCATAGAGGC 

GGCCAAGAGAGTCATGTCCAAAATGAAGATGAAGCTGATGATCATGATCATGACATGGTC 

ATGCCCGGATTTAGATTCCATCCTACCGAAGAAGAACTCATAGAGTTTTACCTTCGCCGA 

AAAGTTGAAGGCAAACGCTTTAATGTAGAACTCATCACTTTCCTCGATCTTTATCGCTAT 
GATCCTTGGGAACTTCCTGCTATGGCGGCGATAGGAGAGAAAGAGTGGTACTTCTATGTG 

CCAAGAGATCGGAAATATAGAAATGGAGATAGACCGAACCGAGTAACGACTTCAGGATAT 

TGGAAAGCCACCGGAGCTGATAGGATGATCAGATCGGAGACTTCTCGGCCTATCGGATTA 

AAGAAAACCCTAGTTTTCTACTCTGGTAAAGCCCCTAAAGGCACTCGTACTAGTTGGATC 

ATGAACGAGTATCGTCTTCCGCACCATGAAACCGAGAAGTACCAAAAGGCTGAAATATCA 

TTGTGCCGAGTGTACAAAAGGCCAGGAGTAGAAGATCATCCATCGGTACCACGTTCTCTC 

TCCACAAGACATCATAACCATAACTCATCGACATCATCCCGTTTAGCCTTAAGACAACAA 

CAACACCATTCATCCTCCTCTAATCATTCCGACAACAACCTTAACAACAACAACAACATC 

AACAATCTCGAGAAGCTCTCCACCGAATATTCCGGCGACGGCAGCACAACAACAACGACC 

ACAAACAGTAACTCTGACGTTACCATTGCTCTAGCCAATCAAAACATATATCGTCCAATG 

CCTTACGACACAAGCAACAACACATTGATAGTCTCTACGAGAAATCATCAAGACGATGAT 

GAAACTGCCATTGTTGACGATCTTCAAAGACTAGTTAACTACCAAATATCAGATGGAGGT 

AACATCAATCACCAATACTTTCAAATTGCTCAACAGTTTCATCATACTCAACAACAAAAT 

GCTAACGCAAACGCATTACAATTGGTGGCTGCGGCGACTACAGCGACAACGCTAATGCCT 

CAAACTCAAGCGGCGTTAGCTATGAACATGATTCCTGCAGGAACGATTCCAAACAATGCT 

TTGTGGGATATGTGGAATCCAATAGTACCAGATGGAAACAGAGATCACTATACTAATATT 

CCTTTTAAGTAATTTAATTAGATCATGATTATTATCCATGACAATAATTAATGCTGCTTT 
GCGC 

>G962 Amino Acid Sequence (domain in AA coordinates: 53-175) 
MAIVSSTTSIIPMSNQVNNNEKGIEDNDHRGGQESHVQNEDEADDHDHDMVMPGFRFHPT 
EEELIEFYLRRKVEGKRFNVELITFLDLYRYDPWELPAMAAIGEKEWYFYVPRDRKYRNG 

DRPNRVTTSGYWKATGADRMIRSETSRPIGLKKTLVFYSGKAPKGTRTSWIMNEYRLPHH 
ETEKYQKAEISLCRVYKRPGVEDHPSVPRSLSTRHHNHNSSTSSRLALRQQQHHSSSSNH 
SDNNLNNNNNINNLEKLSTEYSGDGSTTTTTTNSNSDVTIALANQNIYRPMPYDTSNNTL 

IVSTRNHQDDDETAIVDDLQRLVNYQISDGGNINHQYFQIAQQFHHTQQQNANAMALQLV 
AAATTATTLMPQTQAALAMNMIPAGTIPNNALWDMWNPIVPDGNRDHYTNIPFK* 
>G977 (46.. 591) 

CACCAAACTCACCTGAAACCCTATTTCCATTTACCATTCACACTAATGGCACGACCACAA 

CAACGCTTTCGAGGCGTTAGACAGAGGCATTGGGGCTCTTGGGTCTCCGAAATTCGTCAC 

CCTCTCTTGAAAACAAGAATCTGGCTAGGGACGTTTGAGACAGCGGAGGATGCAGCAAGG 

GCCTACGACGAGGCGGCTAGGCTAATGTGTGGCCCGAGAGCTCGTACTAATTTCCCATAC 

AACCCTAATGCCATTCCTACTTCCTCTTCCAAGCTTCTATCAGCAACTCTTACCGCTAAA 

CTCCACAAATGCTACATGGCTTCTCTTCAAATGACCAAGCAAACGCAAACACAAACGCAA 

ACGCAGACCGCAAGATCACAATCCGCGGACAGTGACGGTGTGACGGCTAACGAAAGTCAT 

TTGAACAGAGGAGTAACGGAGACGACAGAGATCAAGTGGGAAGATGGAAATGCGAATATG 

CAACAGAATTTTAGGCCATTGGAGGAAGATCATATCGAGCAAATGATTGAGGAGCTGCTT 

CACTACGGTTCCATTGAGCTTTGCTCTGTTTTACCAACTCAGACGCTGTGAGAAATGGCC 

TTGTCGTTTTAGCGTATTCTTTTC^TTTTTATTTTTGTTTCCACAAAAA^ 

GTGATGAGAGTAGTAGTGAGAGAAGGCTAATTTCAAGACATTTTGATCTGAATTGGCCTC 

TTTTGAAACACTGATTCTAGTTTCTATAAGAGCAATCGATCATATGCTATGTTATGTATA 

GTATTATAAAAAAATGTTATTTTCTGATTNAAAAAAAAAAAAAAAAAAAAAAA 

>G977 Amino Acid Sequence (domain in AA coordinates: 5-72) 

MARPQQRFRGVRQRHWGSWVSEIRHPLLKTRIWLGTFETAEDAARAYDEAARLMCGPRAR 

TNFPYNPNAIPTSSSKLLSATLTAKLHKCYMASLQMTKQTQTQTQTQTARSQSADSDGVT 

ANESHLNRGVTETTEIKWEDGNANMQQNFRPLEEDHIEQMIEELLHYGSIELCSVLPTQT 
L* 

>G1063 (241.. 966) 

CTATTGCTTGAGTTCTGATTGGGCACAGTAGTACCATTGCCATTTCTCTCACACATACCG 
TCTCTTTCTCTCATCATC^T^ 

GTAAGCTTTTCACC^GTTTCTCTCCATACCCATTTTATC^GCTTCTCCATATCrTTCTCT 

ATGGATTCTGACATAATGAACATGATGATG(^TC^GATGGAGAAGCTTCCTGAGTTTTGT 

AACCCTAATTCCTCTTTCTTCTCTCCCGACCAC^CAACACTTACCCTTTTCT 

TC(^CTC^TTACCAGTCCGATCACTC^TGACCAACGAACC^GGTTTCCGCTACGGTTCC 

GGTTTACTCACTAACCCTTCTTCTATCTCTCCCAACACAGCT^ 

GACAAAAGAAACAACAGTAACAACAACAATAATGGCACGAACATGGCAGCTATGCGAGAG 
ATGATCTTCCGTATCGCCGTGATGCAACCGATCCATATCGATCCCGAGGCGGTTAAGCCA 
CCGAAGAGGAGGAACGTCAGGATCTCTAAAGATCCTCAAAGCGTGGCGGCTAGGCATAGA 
AGGGAGAGAATAAGCGAGAGGATTCGGATTTTGCAACGGCTTGTTCCTGGTGGGACGAAG 
ATGGATACAGCTTCGATGCTCGATGAAGCAATTCATTATGTGAAGTTTTTAAAGAAACAG 
GTGCAGTCTCTGGAGGAGCAGGCGGTGGTTACTGGCGGAGGGGGAGGAGGAGGAGGAAGG 
GTTTTGATCGGTGGAGGTGGAATGACGGCGGCGAGTGGTGGTGGTGGCGGCGGGGGAGTG 
GTTATGAAAGGGTGTGGAACAGTGGGGACTCATCAGATGGTGGGCAATGCACAGATTCTT 
AGATGATGATGATTTTTAATTTTATTATTATTATATTAATGTTGGAGAAAAAGAGAAAAA 
TGATTCTGGAGAGGGAAGCCAAGTAATTTATGTGAGAGTCTTTAATTTAACTTTATTTTC 
TTGTTTAGATAATGTGTAATGATGGTTTTTAAAGCCAAAGACTCTCCATGGTTGTTGGAG 
CGAGTTTG 

>G1063 Amino Acid Sequence (domain in aa coordinates: 131-182) 

MDSDIMNMMMHQMEKLPEFCNPNSSFFSPDHNNTYPFLFNSTHYQSDHSMTNEPGFRYGS 

GLLTNPSSISPNTAYSSVFLDKRNNSNNNNNGTNMAAMREMIFRIAVMQPIHIDPEAVKP 

PKRRNVRISKDPQSVAARHRRERISERIRILQRLVPGGTKMDTASMLDEAIHYVKFLKKQ 
VQSLEEQAVVTGGGGGGGGRVLIGGGGMTAASGGGGGGGVVMKGCGTVGTHQMVGNAQIL 
R* 

>G1140 (67.. 729) 

ATCCAAGATCCTCCAACTCACAGAAAGGCAGATTCAAGAACAGTAGTGAAGGAGAGATCT 

GGTAAAATGGCGAGAGAGAAGATAAGGATAAAGAAGATTGATAACATAACAGCGAGACAA 

GTTACTTTCTCAAAGAGAAGAAGAGGAATCTTCAAGAAAGCCGATGAACTTTCAGTTCTT 

TGCGATGCTGATGTTGCTCTCATCATCTTCTCTGCCACCGGAAAGCTCTTCGAGTTCTCC 

AGCTCAAGAATGAGAGACATATTGGGAAGGTATAGTCTTCATGCAAGTAACATCAACAAA 

TTGATGGATCCACCTTCTACTCATCTCCGGCTTGAGAATTGTAACCTCTCCAGACTAAGT 

AAGGAAGTCGAAGACAAAACCAAGCAGCTACGGAAACTGAGAGGAGAGGATCTTGATGGA 

TTGAACTTAGAAGAGTTGCAGCGGCTGGAGAAACTACTTGAATCCGGACTTAGCCGTGTG 

TCTGAAAAGAAGGGCGAGTGTGTGATGAGCCAAATTTTCTCACTTGAGAAACGGGGATCG 

GAATTGGTGGATGAGAATAAGAGACTGAGGGATAAACTAGAGACGTTGGAAAGGGCAAAA 

CTGACGACGCTTAAAGAGGCTTTGGAGACAGAGTCGGTGACCACAAATGTGTCAAGCTAC 

GACAGTGGAACTCCCCTTGAGGATGACTCCGACACTTCCCTGAAGCTTGGGCTTCCATCT 

TGGGAATGAATCTGAGAGAGAGAAAGATCCAGCAGAGTTGACTTCGATGGAAGCCCACAA 

ATATTAAGTCTACCTTTTCCCTTTCTTTTCTTTGAATAAGTGTTGAAAAAGAATTGAGAT 

GGGAAGGATGAATTCTCATTGCATTGCAGAGAAGCAAGTTTCAGATATTGTACGTGTTAT 

TGGGTCTTTATAACTATTTTTCTCCCCAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 

>G1140 Amino Acid Sequence (conserved domain in AA coordinates:2-57) 

MAREKIRIKKIDNITARQVTFSKRRRGIFKKADELSVLCDADVALIIFSATGKLFEFSSS 

RMRDILGRYSLHASNINKLMDPPSTHLRLENCNLSRLSKEVEDKTKQLRKLRGEDLDGLN 

LEELQRLEKLLESGLSRVSEKKGECVMSQIFSLEKRGSELVDENKRLRDKLETLERAKLT 

TLKEALETESVTTNVSSYDSGTPLEDDSDTSLKLGLPSWE* 

>G1425 (43.. 1005) 

ACTCTCTCAAACCATAAAAAATATTCTCCGATCATCATTTTAATGGAGAGTACAGATTCT 

TCCGGTGGTCCTCCGCCGCCGCAACCAAACCTCCCTCCAGGATTCCGGTTTCATCCAACA 

GACGAAGAACTTGTAATTCATTACCTCAAACGCAAAGC^GATTCTGTTCCTTTACCAGTC 

GCGATCATCGCCGACGTTGATCTTTACAAATTTGATCCATGGGAACTTCCCGCGAAAGCT 

TCGTTTGGAGAACAAGAATGGTATTTTTTCAGTCCAAGAGATCGGAAATATCCCAACGGA 

GCTAGACCTAACCGAGCTGCGACTTCCGGTTATTGGAAAGCGACTGGTACAGATAAACCG 

GTGATTTCAACCGGCGGTGGTGGTAGTAAAAAAGTGGGAGTTAAAAAGGCTCTAGTGTTT 

TACAGTGGTAAACCACCAAAAGGAGTTAAATCAGATTGGATTATGCATGAATATCGGTTA 

ACTGATAATAAACCIACTCACATTTGTGACTTCGGCAACAAGAAAA^ 

GATGATTGGGTGTTGTGTCGTATCTACAAGAAAAACAATAGTACAGCATCTAGACATCAT 

CATCATCTTCATCATATTCATCTAGATAATGATCATCATCGTCATGATATGATGATTGAT 

GATGATCGATTCCGTCATGTTCCTCCTGGTCTTCACTTCCCGGCGATTTTTTCTGACAAT 

AATGATCCGACGGCTATATATGATGGTGGCGGCGGCGGATACGGAGGTGGAAGTTACTCG 

ATGAATCATTGTTTCGCATCTGGATCAAAGCAGGAGCAGTTGTTTCCACCGGTGATGATG 

ATGACTAGTCTAAATCAAGATTCCGGTATTGGATCGTCGTCGTCACCTAGCAAGAGATTT 

AACGGCGGCGGCGTTGGAGATTGTTCGACTTCTATGGCGGCGACGCCGTTAATGCAGAAC 

CAAGGTGGGATTTACCAATTGCCTGGTTTGAATTGGTAT^ 

AAGAATTTTTAAAATTTGTGTATATATATACGGTTTGAGTGATTAGGGGGCATTGGGGGA 
TTTATTTACGGTTGATTATTATTGTAGTGTTATAGAACTAAGGAGATTAAATTAAATAGA 
TTGGAGGAAAAAAAAAAAAAAAAA 

>G1425 Amino Acid Sequence (domain in AA coordinates: 20-173) 

MESTDSSGGPPPPQPNLPPGFRFHPTDEELVIHYLKRKADSVPLPVAIIADVDLYKFDPW 

ELPAKASFGEQEWYFFSPRDRKYPNGARPNRAATSGYWKATGTDKPVISTGGGGSKKVGV 

KKALVFYSGKPPKGVKSDWIMHEYRLTDNKPTHICDFGNKKNSLRLDDWVLCRIYKKNNS 

TASRHHHHLHHIHLDNDHHRHDMMIDDDRFRHVPPGLHFPAIFSDNNDPTAIYDGGGGGY 

GGGSYSMNHCFASGSKQEQLFPPVMMMTSLNQDSGIGSSSSPSKRFNGGGVGDCSTSMAA 

TPLMQNQGGIYQLPGLNWYS* 

>G1449 (105..581) 

TAGACAGAGAGAAATAGAAATAGAGAGAGAGAGACATGAAGAGCACTCTCAATAGAGAAG 
AGAAGGAAGCATGAAGCTAGCTCTGCAGCTTCAAGGTCTCATTAATGGAGGTCTCTAACT 
CTTGTTCTTCATTTTCTTCATCCTCTGTCGACAGTACTAAACCTTCTCCTTCTGAATCTT 
CTGTTAATCTCTCCCTTAGTCTCACATTTCCTTCTACTTCTCCACAAAGAGAAGCAAGAC 
AAGATTGGCCACCGATAAAGTCTAGATTAAGAGATACACTAAAGGGTCGTCGTCTTCTTC 
GTCGTGGTGATGACACTTCTCTCTTTGTTAAGGTTTATATGGAAGGTGTTCCCATTGGAA 
GAAAACTCGACCTTTGCGTATTCTCAGGCTACGAGAGTCTATTAGAAAATCTCTCTCACA 
TGTTCGATACTTCAATCATCTGCGGTAATCGAGATCGAAAACATCATGTTTTGACATATG 
AAGACAAGGATGGAGATTGGATGATGGTCGGAGATATTCCATGGGATATGTTTCTTGAAA 
CCGTGAGAAGACTAAAGATCACGAGACCGGAGAGGTATTAAAACTTGGATCGGTCAAGGC 
TGTGATTGCGCAGTTACGAGACGTGTAAGATTTAGGCATTGATGAAGAGACTTGAGGCGG 
GACGGAGCTATTGCTGCATATTGCAACAAAGGCCTTGAAGAAGTTGGAGAATTGATTGAT 
GCATATATTTATTTATATGACACCTTTGAGTGTGTTTTTTCTTATAAATAAATCACAATA 
TCCAAGACTTCTCTTTAAA 

>G1449 Amino Acid Sequence (domain in AA coordinates: 48-53,74-107,122-152) 
MEVSNSCSSFSSSSVDSTKPSPSESSVNLSLSLTFPSTSPQREARQDWPPIKSRLRDTLK 
GRRLLRRGDDTSLFVKVYMEGVPIGRKLDLCVFSGYESLLENLSHMFDTSIICGNRDRKH 
HVLTYEDKDGDWMMVGDIPWDMFLETVRRLKITRPERY* 
>G1897 (1..678) 

ATGCCTTCTGAATTCAGTGAATCTCGTCGGGTTCCTAAGATTCCCCACGGCCAAGGAGGA 
TCTGTTGCGATTCCGACGGATCAACAAGAGCAGCTTTCTTGTCCTCGCTGTGAATCAACC 
AACACCAAGTTCTGTTACTAGAAGAACTAGAACTTCTCACAACCTCGTCATTTCTGCAAG 
TCTTGTCGCCGTTACTGGACTCATGGAGGTACTCTCCGTGACATTCCCGTCGGTGGTGTT 
TCCCGTAAAAGCTCAAAACGTTCCCGGACTTATTCCTCTGCCGCTACCACCTCCGTTGTC 
GGAAGCCGGAACTTTCCCTTACAAGCTACGCCTGTTCTTTTCCCTCAGTCGTCTTCCAAC 
GGCGGTATCACGACGGCGAAGGGAAGTGCTTCGTCGTTCTATGGCGGTTTCAGCTCTTTG 
ATCAACTACAACGCCGCCGTGAGCAGAAATGGGCCTGGTGGCGGGTTTAATGGGCCAGAT 
GCTTTTGGTCTTGGGCTTGGTCACGGGTCGTATTATGAGGACGTCAGATATGGGCAAGGA 
ATAACGGTCTGGCCGTTTTCAAGTGGCGCTACTGATGCTGCAACTACTACAAGCCAC^TO 
GCTCAAATACCCGCCACGTGGCAGTTTGAAGGTCAAGAGAGCAAAGTCGGGTTCGTGTCT 
GGAGACTACGTAGCGTGA 

>G1897 Amino Acid Sequence (domain in AA coordinates: 34-62)

MPSEFSESRRVPKIPHGQGGSVAIPTDQQEQLSCPRCESTNTKFCYYNNYNFSQPRHFCK 

SCRRYWTHGGTLRDIPVGGVSRKSSKRSRTYSSAATTSVVGSRNFPLQATPVLFPQSSSN 

GGITTAKGSASSFYGGFSSLINYNAAVSRNGPGGGFNGPDAFGLGLGHGSYYEDVRYGQG 

ITVWPFSSGATDAATTTSHIAQIPATWQFEGQESKVGFVSGDYVA* 

>G2143 (89..784)

TCTTCTTCTTCCTCOVTACCTTATCT 

AACCCTATAAATTCCACAAAAAAGGAGGATGGATAACTCCGACATTCTAATGAACATGAT 
GATGCAGCAGATGGAGAAGCTTCCTGAACACTTCTCTAACTCAAACCCTAACCCTAATCC 
CCATAACATTATGATGCITTCTGAATCCAACACCC^ 

TTCTCATCTCCCATTTGACCAAACCATGCCTCACCACCAACCCGGTTTAAATTO 

CGCCCCCTCCCCGTCATCATCTCTCCCGGAGAAGAGAGGAGGCTGCAGCGACAACGCCAA 

CATGGCGGCGATGAGAGAGATGATCTTTCGAATAGCCGTGATGCAGCCTATACATATTGA 

TCCGGAATCCGTAAAGCCACCAAAGAGAAAGAACGTGAGGATCTCTAAGGATCCACAGAG 

CGTGGCAGCTCGGCATCGAAGGGAGAGGATAAGCGAGCGGATTCGGATTCTTCAGCGGCT 

TGTTCCO^TGGGACrAAGATGGATACGGCGTCGATGCTCGATGAGGCTATCCATTACGT 
TAAGTTTCTCAAGAAGCAAGTGCAGTCGCTGGAGGAACATGCGGTGGTTAACGGCGGAGG
AATGACGGCGGTGGCCGGAGGAGCACTTGCGGGTACTGTTGGTGGAGGATATGGAGGAAA 
AGGGTGTGGCATTATGCGGTCTGATCATCACCAGATGCTTGGAAATGCACAGATTCTTAG 
ATGATGATGATGTTGATTTTTAAATATATATCATATGTTTATTAATATGACGGGAAAAAA 
TATTATCGAGGGAGTTGAATTTAGTATCATGAAACTATGAGAGCATTTTTTTTAAATGTT 
TTTATCTTTCCGGGTTTCGATAATGTTTGGGATGGfTAATTAACAATTTAAAAGTCAGAC 
AACTTGGTTGTAAAGACTAAAGAATAAGCATAGTTTATCAATTTATCATTACTAAATGAA 
ATAG 

>G2143 Amino Acid Sequence (domain in aa coordinates: 128-179) 

MDNSDILMNMMMQQMEKLPEHFSNSNPNPNPHNIMMLSESNTHPFFFNPTHSHLPFDQTM 

PHHQPGLNFRYAPSPSSSLPEKRGGCSDNANMAAMREMIFRIAVMQPIHIDPESVKPPKR 

K^^^ISKDPQSVAARHRRERISERIRILQRLVPGGTKMDTASMLDEAIHYVKFLKKQVQS 

LEEHAWNGGGMTAVAGGALAGTVGGGYGGKGCGIMRSDHHQMLGNAQILR* 

>G2535 (1..1005) 

ATGAACATATCAGTAAACGGACAGTCACAAGTACCTCCTGGCTTTAGGTTTCACCCAACC 
GAGGAAGAGCTCTTGAAGTATTACCTCCGCAAGATUU^TCTCTAACATCAAGATCGATCTC 
GATGTTATTCCTGACATTGATCTCAACAAGCTCGAGCCTTGGGATATTCAAGAGATGTGT 
AAGATTGGAACGACGCCGCAAAACGATTGGTACTTTTATAGCCATAAGGACAAGAAGTAT 
CCCACCGGGACTAGAACCAACAGAGCCACCACGGTCGGATTTTGGAAAGCGACGGGACGT 
GACAAGACCATATATACCAATGGTGATAGAATCGGGATGCGAAAGACGCTTGTCTTCTAC 
AAAGGTCGAGCCCCTCATGGTCAGAAATCCGATTGGATCATGCACGAATATAGACTCGAC 
GAGAGTGTATTAATCTCCTCGTGTGGCGATCATGACGTCAACGTAGAAACGTGTGATGTC 
ATAGGAAGTGACGAAGGATGGGTGGTGTGTCGTGTTTTCAAGAAAAATAACCTTTGCAAA 
AACATGATTAGTAGTAGCCCGGCGAGTTCGGTGAAAACGCCGTCGTTCAATGAGGAGACT 
ATCGAGCAACTTCTCGAAGTTATGGGGCAATCTTGTAAAGGAGAGATAGTTTTAGACCCT 
TTCTTAAAACTCCCTAACCTCGAATGCCATAACAACACCACCATCACGAGTTATCAGTGG 
TTAATCGACGACCAAGTCAACAACTGCCACGTCAGCAAAGTTATGGATCCCAGCTTCATC 
ACTAGCTGGGCCGCTTTGGATCGGCTCGTTGCCTCACAGTTAAATGGGCCCAACTCGTAT 
TCAATACCAGCCGTTAATGAGACTTCACAATCACCGTATCATGGACTGAACCGGTCCGGT 
TGTAATACCGGTTTAACACCAGATTACTATATACCGGAGATTGATTTATGGAACGAGGCA 
GATTTCGCGAGAACGACATGCCACTTGTTGAACGGTAGTGGATAA 

>G2535 Amino Acid Sequence (conserved domain in AA coordinates: 11-114)
MNISVNGQSQVPPGFRFHPTEEELLKYYLRKKISNIKIDLDVIPDIDLNKLEPWDIQEMC 
KIGTTPQNDWYFYSHKDKKYPTGTRTNRATTVGFWKATGRDKTIYTNGDRIGMRKTLVFY
KGRAPHGQKSDWIMHEYRLDESVLISSCGDHDVNVETCDVIGSDEGWVVCRVFKKNNLCK

NMISSSPASSVKTPSFNEETIEQIiLEVMGQSCKGEIVLDPFLKLPNLECHNNTTITSYQW 
LIDDQVKNCHVSKVMDPSFITSWAALDRLVASQLNGPNSYSIPAWETSQSPYH 
CNTGLTPDYYIPEIDLWNEADFARTTCHLLNGSG*
>G2557 (94..1215)

TCGACTTCCTGTGAACTCATCTGTTTGTTCTCTTCTTCCGGTTTCACTTTTTCATGTCCT 

GCCGTTATTACAACGAGGATTGTGTTTGATCCGATGGAAGGATTGGAATCTGTGTACGCT 

CAAGCTATGTATGGAATGACACGAGAGAGCAAAATCATGGAGCATCAAGGATCAGATTTG 

ATTTGGGGAGGAAATGAGCTAATGGCTCGAGAACTCTGTTCTTCTTCTTCTTATCACCAC

CAACTCATTAATCCGAATCTTAGCAGCTGTTTCATGTCTGATCTTGGAGTCTTAGGTGAG 

ATTCAACAGCAGCAACATGTTGGCAACAGAGCTAGCTCGATAGATCCATCATCACTCGAT 

TGTTTGTTATCTGCGACGTCGAATAGCAACAACACCTCGACGGAGGACGATGAAGGAATA 

TCTGTGCTTTTCTCAGATTGTCAGACTCTTTGGAGCTTTGGTGGAGTCTCATCTGCAGAG 

TCTGAGAACAGAGAGATCACTACTGAGACGACAACAACGATAAAGCCTAAGCCTTTGAAG 

AGAAACAGAGGAGGAGATGGAGGAACTACTGAGACTACAACAACAACAACAAAACCTAAG 

TCTTTGAAGAGAAACAGAGGAGACGAGACAGGAAGTCACTTTAGTCTTGTTCATCCT 

GATGATTCGGAGAAAGGAGGTTTCAAGCTTATATACGATGAGAATCAATCGAAATCAAAG

AAACCAAGAACAGAGAAAGAACGAGGCGGTTCTTCGAACATTAGTTTCCAACATTCAACT 

TGTTTGTCTGACAATGTCGAGCCCGATGCTGAGGCGATTGCACAAATGAAGGAGATGATA 

TACAGAGCGGCTGCATTTAGACCGGTGAATTTCGGGTTAGAGATTGTGGAGAAGCCTAAG 

AGGAAGAACGTCAAGATATCGACGGATCCTCAAACGGTTGCAGCGAGACAGAGAAGGGAG 

AGGATAAGTGAGAAGATTAGGGTTTTACAAACATTGGTTCCAGGTGGGACGAAGATGGAT 

ACTGCATCAATGCTTGATGAAGCTGCTAATTATCTCAAGTTCCTTAGAGCACAAGTAAAA 
GCTTTAGAAAACTTGAGACCCAAGCTTGACCAAACCAATCTCTCTTTCTCTTCTGCTCCT

ACATCGTTTCCATTATTCCACCCATCTTTTCTTCCATTGCAAAATCCTAATCAAATCCAT 

CATCCAGAGTGTTGACAGATTATAAACTTTTGAGTTTCATCATCATCAACAGAATCATGG 

CGTCTTGATTGTTTTAGCAGTTCTCAAGAAAGGCAACTTCTGTGACAAGGGTGGTGTCGG 

GCAGTGTTGTTTACACTTTCCAGTCTTTGTTTTGCATTTCTTTTTATATAAAGTTTGTAT 

TTTATATAGAATCTGTGGAATTCGAGGGTTGAAATATTGTGAAAAACAGAGCCGCAAGAG 

GTTAATTACAGTCTCTGCAATATTTTCAACCTTTTATTACTTTATTAGAGTAAAGATAGC 
GT 

>G2557 Amino Acid Sequence (domain in aa coordinates: 278-328)

MEGLESVYAQT^MYGMTRESKIMEHQGSDLIWGGNELMARELCSSSSYHHQLINPNLSSCF 

MSDLGVLGEIQQQQHVGNRASSIDPSSLDCLLSATSNSNNTSTEDDEGISVLFSDCQTLW 

SFGGVSSAESENREITTETTTTIKPKPLKRNRGGDGGTTETTTTTTKPKSLKRNRGDETG 

SHFSLVHPQDDSEKGGFKLIYDENQSKSKKPRTEKERGGSSNISFQHSTCLSDNVEPDAE 

AIAQMKEMIYRAAAFRPVNFGLEIVEKPKRKNVKISTDPQTVAARQRRERISEKIRVLQT

LVPGGTKMDTASMLDEAANYLKFLRAQVKALENLRPKLDQTNLSFSSAPTSFPLFHPSFL 

PLQNPNQIHHPEC* 

>G259 (52..786)

GAGATCTTCTACTACTTGTTTTCTTCAAGAATAATAATTTTCGTTTTATATATGGAAGAT 

GCTGGTGAACATTTACGGTGTAACGATAACGTTAACGACGAGGAGCGTTTGCCATTGGAG 

TTTATGATCGGAAACTCAACATCCACGGCGGAGCTACAGCCGCCTCCACCGTTCTTGGTA 

AAGACATACAAAGTGGTGGAGGATCCGACGACGGACGGGGTTATATCTTGGAACGAATAC

GGAACTGGTTTCGTCGTGTGGCAGCCGGCAGAATTCGCTAGAGATCTGTTACCAACACTT 

TTCAAGCATTGCAACTTCTCTAGCTTCGTTCGCCAGCTCAATACTTACGGTTTTCGAAAA 

GTAACGACGATAAGATGGGAATTTAGTAATGAGATGTTTCGAAAGGGGCAAAGAGAGCTT 

ATGAGCAATATCCGAAGAAGGAAGAGCCAACATTGGTCACACAACAAGTCTAATCACCAG 

GTTGTACCAACAACAACGATGGTGAATCAAGAAGGTCATCAACGGATTGGGATTGATCAT 

CACCATGAGGATCAACAGTCTTCCGCCACTTCATCCTCTTTCGTATACACTGCATTACTC 

GACGAAAACAAATGCTTGAAGAATGAAAACGAGTTATTAAGCTGCGAACTTGGGAAAACC 

AAGAAGAAATGCAAGCAGCTTATGGAGTTGGTGGAGAGATACAGAGGAGAAGACGAAGAT 

GCAACTGATGAAAGTGATGATGAAGAAGATGAAGGGCTTAAGTTGTTCGGAGTAAAACTT 

GAATGAAACTAGATTGCTAGATTGATATTCGTAATATACCAGTTTCTTCATATTCTTAGA 

AGTTTTGCATAACTATATATAGTACTCTTTTAAGACATGCAAGATCAGAACATATG 

>G259 Amino Acid Sequence (domain in AA coordinates: 27-131) 

MEDAGEHLRCNDNVNDEERLPLEFMIGNSTSTAELQPPPPFLVKTYKOTEDPTTD 

NEYGTGFVWQPAEFARDLLPTLFKHCNFSSFVRQLNTYGFRKVTTIRWEFSNEMFRKGQ 

RELMSNIRRRKSQHWSHNKSNHQVVPTTTMVNQEGHQRIGIDHHHEDQQSSATSSSFVYT 

ALLDENKCLKNENELLSCELGKTKKKCKQLMELVERYRGEDEDATDESDDEEDEGLKLFG 
VKLE* 

>G353 (82..570)

ACCAAACTCAAAAAACACAAACCACAAGAGGATCATTTCATTTTT^^ 
ATCATOVTCATCAGAAGAAAAATGGTTGCGATATCGGAGATCAAGTCGACGGTGGATGTC 
ACGGCGGCGAATTGTTTGATGCTTTTATCTAGAGTTGGACAAGAAAACGTTGACGGTGGC 
GATCAAAAACGCGTTTTCACATGTAAAACGTGTTTGAAGC^^ 

TTAGGAGGTCACCGTGCGAGTCACAAGAAGCCTAACT^CGACGCITTGTCGTCTGGATTG 

ATGAAGAAGGTGAAAACGTCGTCGCATCCTTGTCCCATATGTGGAGTGGAGTTTCCGATG 

GGACAAGCTTTGGGAGGACACATGAGGAGACACAGGAACGAGAGTGGGGCTGCTGGTGGC 

GCGTTGGTTACACGCGCTTTGTTGCCGGAGCCCACGGTGACTACGTTGAAGAAATCTAGC 

AGTGGGAAGAGAGTGGCTTGTTTGGATCTGAGTCTAGGGATGGTGGACAATTTGAATCTC 

AAGTTGGAGCTTGGAAGAACAGTTTATTGATTTTATTTATTTTCCTT 

ATATTTGTTTCTCTCATTCTTTGAATTTTTCCT 

AGATTTAGGAAACTTTC^TAGAGTGTAATCTTTTCTTTCTGTAAAAATATATTTTACTTG 
TAGCAAA 

>G353 Amino Acid Sequence (domain in aa coordinates: 41-61, 84-104) 

MVAISEIKSTVDVTAANCLMLLSRVGQENVDGGDQKRVFTCKTCLKQFHSFQALGGHRAS 

HKKPNl^ALSSGLMKKVKTSSHPCPICGVEFPMGQALGGHMRRHRWESGAAGGAL^ 

LPEPTVTTLKKSSSGKRVACLDLSLGMVDNLNLKLELGRTVY* 

>G354 (27..533)
CCTAGAAGTCACTAAGTCGATTCAAAATGGTTGCGAGAAGTGAGGAAATTGTGATAGTGG

AAGAAGATACGACTGCGAAATGTTTGATGTTGTTATCAAGAGTCGGAGAATGCGGCGGCG 

GCTGCGGGGGAGATGAACGTGTTTTCCGATGCAAGACTTGTCTTAAAGAGTTCTCATCGT 

TTCAAGCTTTGGGAGGTCATCGTGCAAGCCACAAGAAACTTATCAACAGTGACAATCCAT 

CACTTCTTGGATCCTTGTCCT^ACAAGAAAACTAAAACGTCTCATCCTTGTCCGATATGTG 

GAGTGAAGTTTCCGATGGGACAAGCTCTTGGTGGTCACATGAGGAGACATAGGAACGAGA 

AAGTCTCAGGCTCGTTGGTTACACGTTCTTTTCTACCGGAGACGACGACGGTGACGGCTT 

TGAAGAAATTTAGTAGTGGGAAGAGAGTGGCTTGTTTGGATTTGGACTTAGATTCGATGG 

AGAGTTTGGTCAATTGGAAGTTGGAGTTGGGAAGAACGATTTCTTGGAGTTAAGTTTTTG 

GGTTGTATACAGTTTCACATGATTTTGTAATCTTTGTTGATCCAATTATCGTACCGATCG 

ATGTGAATATTATTTTGATACAATAAAA 

>G354 Amino Acid Sequence (domain in aa coordinates: 42-62, 88-109) 
MVARSEEIVIVEEDTTAKCLMLLSRVGECGGGCGGDERVFRCKTCLKEFSSFQALGGHRA
SHKKLINSDNPSLLGSLSNIOCTKTSHPCPICGVKFPMGQALGGHMRRHRNEKVSGSLVTR 
SFLPETTTVTALKKFSSGKRVACLDLDLDSMESLVNWKLELGRTISWS*
>G638 (86..1861)

GAATTAAAAGGTTTAACCTTTACCTTTTTTTCCCTTCACTATCGATAATTGATCTTCTCT 
TTCGGCTGAATATAAATCTGAAAAAATGGATCAAGATCAGCATCCTCAGTACGGTATACC 
GGAGCTCCGGCAGCTCATGAAAGGCGGAGGAAGGACGACTACTACAACACCGTCTACTTC 
TTCTCATTTTCCCTCTGATTTCTTCGGTTTTAACCTTGCTCCGGTGCAGCCACCGCCACA 
CCGTCTTCATCAGTTCACTACTGATCAAGATATGGGTTTCTTGCCACGTGGCATACATGG 
ATTGGGTGGAGGTTCTTCAACGGCTGGAAATAACAGTAACTTAAACGCGAGTACTAGTGG 
TGGAGGAGTTGGGTTTAGTGGGTTTCTTGACGGTGGTGGTTTCGGCAGCGGAGTAGGAGG 
AGACGGTGGAGGAACTGGAAGGTGGCCGAGACAAGAAACCCTAACTCTGTTGGAAATTAG 
ATCTCGTCTTGATCATAAATTCAAAGAAGCTAATCATAAAGGACCTCTTTGGGATGAAGT 
TTCTAGGATTATGTCCGAGGAACATGGATACCAAAGGAGTGGGAAGAAATGCAGAGAGAA 
GTTTGAGAATCTGTACAAATACTATAGTAAGACTAAAGAAGGCGAAGCCGGAAGACAAGA 
CGGAAAACATCACAGATTTTTCCGCCAGCTCCAAGCGCTATACGGGGATTCTAATAACTT 
GGTTTCTTGTCCCAATCATAACACGCAGTTCATGAGCAGTGCTCTTCATGGTTTCCATAC 
TCAAAACCCTATGAACGTTGCTACAACAACGTCCAACATCCATAACGTTGATAGTGTTCA 
TGGTTTTCATCAAAGCCTTAGTCTTTCTAACAACTACAACTCCTCCGAGCTTGAGCTGAT 
GACTTCCTCTTCGGAAGGGAATGATTCTAGTAGTAGAAGGAAAAAGAGGAGTTGGAAAGC 
GAAGATAAAGGAGTTCATTGATACGAACATGAAAAGGTTGATAGAGAGGCAAGATGTTTG 
GCTTGAGAAGTTGACAAAGGTTATTGAAGACAAAGAGGAACAACGGATGATGAAAGAAGA 
GGAATGGAGGAAGATTGAAGCTGCAAGGATTGATAAAGAGCATTTGTTTTGGGCTAAAGA 
GAGGGCGAGGATGGAAGCTAGGGATGTTGCGGTGATTGAGGCATTGCAATACTTGACAGG 
AAAGCCATTGATAAAGCCGCTGTGTTCATCCCCGGAAGAGAGGACAAATGGTAATAATGA 
GATCCGAAACAATAGTGAGACACAGAATGAGAATGGAAGCGATCAAACGATGACTAACAA 
TGTTTGTGTTAAAGGAAGTAGTAGCTGCTGGGGTGAGCAAGAGATTTTAAAGCTTATGGA 
GATAAGAACGAGCATGGACTCGACCTTTCAAGAGATATTAGGAGGGTGCTCGGATGAGTT 
TCTATGGGAGGAAATCGCAGCGAAGTTGATTCAGTTAGGGTTTGATCAGAGAAGTGCCTT 
ATTATGCAAGGAAAAGTGGGAATGGATAAGCAATGGAATGAGGAAAGAAAAGAAGCAAAT 
CAACAAGAAAAGAAAGGATAATTCGTCCAGCTGCGGCGTGTACTACCCGAGAAACGAAGA
AAATCCAATCTACAATAATCGAGAAAGTGGATATAATGATAATGATCCGCATCAAATCAA 
CGAACAAGGCAATGTAGGTTCTTCAACATCAAACGCAAACGCAAACGCAAACGTAACCAC 
TGGAAATCCGAGCGGTGCAATGGCTGCTAGTACAAACTGCTTCCCGTTCTTCATGGGAGA 
TGGAGATCAGAATTTGTGGGAGAGTTATGGTTTGAGGCTCAGTAAAGAAGAGAATCAGTA 
AGTAATTTCTCTTAATGAAGAAGAAGAAGGTAATCATGTGGTTAACTAATTCTTTTGAGT 
TAGCTATATATGAGATAAACCTTGACTTAGCTATTATATGTCACATGCTGCTTAGAATTA 
AGAAATATTTGTTGGQGCTTAACGAATTATATATCAGCATATATAAGATGAGAGTCTAAG 
AATTATATCAAATTAGGCTTTAACCAACGTACGATTATATATTATGTTTTCATGTATTTA 
TTCTGTAAGACTTTTTAATATCAATCTTTCTCTAAA 

>G638 Amino Acid Sequence (domain in AA coordinates: 119-206)

MDQDQHPQYGIPELRQLMKGGGRTTTTTPSTSSHFPSDFFGFNLAPVQPPPHRLHQFTTD 

QDMGFLPRGIHGLGGGSSTAGNNSNLNASTSGGGVGFSGFLDGGGFGSGVGGDGGGTGRW

PRQETLTLLEIRSRLDHKFKEANHKGPLWDEVSRIMSEEHGYQRSGKKCREKFENLYKYY 

SKTKEGEAGRQDGKHHRFFRQLQALYGDSNNLVSCPNHNTQFMSSALHGFHTQNPMNVAT 
TTSNIHNVDSVHGFHQSLSLSNNYNSSELELMTSSSEGNDSSSRRKKRSWKAKIKEFIDT
NMKRLIERQDVWLEKLTKVIEDKEEQRMMKEEEWRKIEAARIDKEHLFWAKERARMEARD
VAVIEALQYLTGKPLIKPLCSSPEERTNGNNEIRNNSETQNENGSDQTMTNNVCVKGSSS
CWGEQEILKLMEIRTSMDSTFQEILGGCSDEFLWEEIAAKLIQLGFDQRSALLCKEKWEW
ISNGMRKEKKQINKKRKDNSSSCGVYYPRNEENPIYNNRESGYNDNDPHQINEQGNVGSS
TSNANANANVTTGNPSGAMAASTNCFPFFMGDGDQNLWESYGLRLSKEENQ*

>G869 (428..1402)

AGGAACAGTGAAAGGTTCGGTTTTTTGGGTTTCGATCTGATAATCAACAAGAAAAAAGGG 
TTTGATTTATGTCGGCTGGGTTTGAATCGACTGTGATTTTGTCTTTGATTCATATCTCTT 
CTCCGATTTCATCATCATCTTCCCCATCATCGTCGTCTTTGAAATCTTGTCTTCTCAACG 
CTCTTCACTTCTGCTGTAATAAGCAGAGGCTTGTTCTGGAGACTCCTTCTCTTTCCATGC 
GCTTAAGACCCAAAAGGACTTGTTCTAGTGTTGAAGTCTTTGGGGGTTTTCACATAAAGC 
AGCAAAAGTTTTCTTTTTTCATAGTTCGCTGAGAGTTTTGAGTTTTGATACCAAAAAAGT
TTTGACCTTTTAGAGTGATTTTTTGTTCTTTCTGTTTTCTGGGTATTTTTGAGGAGTGGG 
TTTAACAATGGTTGCGATTAGAAAGGAACAGTCTTTGAGTGGTGTTAGTAGCGAGATTAA 
GAAGAGAGCTAAGAGAAACACTCTATCGTCCCTTCCTCAAGAAACCCAACCTTTGAGGAA 
AGTCCGTATTATTGTGAATGATCCTTATGCTACTGATGATTCCTCTAGTGATGAGGAAGA 
GCTTAAGGTTCCTAAGCCAAGGAAAATGAAACGTATCGTTCGTGAGATTAACTTTCCTTC 
TATGGAAGTTTCTGAACAGCCTTCTGAGAGTTCTTCTCAGGACAGTACTAAAACTGATGG 
CAAGATAGCTGTGTCAGCTTCTCCTGCTGTTCCTAGGAAGAAGCCTGTTGGTGTTAGGCA 
AAGGAAATGGGGGAAATGGGCTGCTGAGATTAGAGATCCTATTAAGAAAACTAGGACTTG 
GTTGGGTACTTTTGATACTCTTGAAGAAGCTGCTAAAGCTTATGATGCTAAGAAGCTTGA 
GTTTGATGCTATTGTTGCTGGAAATGTGTCCACTACTAAACGTGATGTTTCTTCATCTGA 
GACTAGCCAATGCTCTCGTTCTTCACCTGTTGTTCCTGTTGAGCAAGATGACACTTCTGC 
ATCAGCTCTCACTTGTGTCAACAACCCTGATGACGTCTCGACCGTTGCTCCAACTGCTCC 
AACTCCAAATGTTCCTGCTGGTGGAAACAAGGAAACGTTGTTCGATTTCGACTTTACTAA 
TCTACAGATCCCTGATTTTGGTTTCTTGGCAGAGGAGCAAC^GACCTAGACTTCGATTG 
TTTCCTCGCGGATGATCAGTTTGATGATTTCGGCTTGCTTGATGACATTCAAGGATTCGA 
AGATAACGGTCCAAGTGCGTTACCAGATTTCGACTTTGCGGATGTTGAAGATCTTCAGCT 
AGCTGACTCTAGTTTCGGTTTCCTTGATCAACTTGCTCCTATCAACATCTCTTGCCCATT 
AAAAAGTTTTGCAGCTTCATAGGATCTTGCTTAGTAATGTTAAGTGAGAAGAGTGTTTTG 
TTTTTTCGTTTATGCTTTAGTAATTTAAGACATACAAAAGTGTGTGTTCCGGATTGTAGT 
AAGATCTTAAGACATAAAGCCGGGTTTTGCAATTAGGAATCGAGTTTTAATGAAGTTTTA 
GTTTATGTTTG 

>G869 Amino Acid Sequence (domain in AA coordinates: 109-177) 
MVAIRKEQSLSGVSSEIKKRAKRNTLSS 

VPKPRKMKRIVREINFPSMEVSEQPSESSSQDSTKTDGKIAVSASPAVPRKKPVGVRQRK 

WGKWAAEIRDPIKKTRTWLGTFDTLEEAAKAYDAKKLEFDAIVAGNVSTTKRDVSSSETS 

QCSRSSPWPVEQDOTSASALTCVNNPDDVSTVAPTAPTPNVPAGGNKETLFDFDFTNLQ 

IPDFGFLAEEQQDLDFDCFLADDQFDDFGLLDDIQGFEDNGPSALPDFDFADVEDLQLAD 

SSFGFLDQLAPINISCPLKSFAAS* 

>G1645 (25..1104)

CGTCGACCTCCCAACACTAACTCCATGTTTATAACGGAAAAACAAGTGTGGATGGATGAG 

ATCGTCGCAAGAAGAGCTTCTTCTTCTTGGGACT^ 

CAGCATCATCATCGTCACTGCAACACAAGTCATG 

GGAGATGTAGCGGTTCACGAAGAAGAGAGTAATAATAATAACCCTAATTTCAGTAACAGC 

GAGAGTGGTAAGAAGGAGACAACAGATAGTGGTC^GTCTTGGTCCTCGTCGTCTTCAAAA 

CCATCGGTCTTGGGGAGAGGACATTGGAGACCAGCTGAAGATGTTAAACTCAAAGAGCTT 

GTCTCCATTTACGGCCCACAAAACTGGAACCTCATAGCTGAAAAGCTTCAAGGAAGATCT 

GGGAAGAGCTGTAGACTACGATGGTTTAACCAATTGGACCCGAGGATAAACCGAAGAGCT 

TTCACAGAAGAAGAAGAGGAGAGGCTGATGCAAGCACATAGGCTTTATGGTAACAAATGG 

GCAATGATTGCGAGGCTTTTCCCTGGTAGAACTGATAATTCAGTGAAGAACCATTGGCAT 

GTTGTCATGGCTCGTAAGTATAGAGAACACTCnTCTGCTTACCGTAGGAGAAAGCTTATG 

AGTAATAATCCACTTAAACCTCACCTCACCAATAATCATCATCCTAACCCTAACCCTAAT 

TACCACTCTTTTATCTCCACTAATCATTACraCGCTCAGCCTTTCCCCGAGTT^ 

ACTCATCACCTGGTTAATAATGCCCCTATCACGAGTGACCATAACCAGCTTGTGTTGCCT 

TTCCATTGCTTTCAAGGTTATGAGAACAATGAACCTCCGATGGTTGTGAGTATGTTTGGC 
AACCAAATGATGGTCGGCGATAACGTTGGTGCCACGTCAGACGCGTTATGCAATATTCCG
CACATTGACCCTAGTAACCAAGAGAAACCGGAGCCAAATGATGCAATGCATTGGATCGGA 
ATGGACGCGGTAGATGAGGAGGTGTTCGAAAAGGCTAAGCAGCAACCACATTTTTTCGAT 
TTTCTTGGCTTGGGGACGGCGTGAATGTTGAACAAATTGGTGTTAATCAGATAACGACAG 

TGGC 

>G1645 Amino Acid Sequence (domain in AA coordinates: 90-210) 

MFITEKQVWMDEIVARRASSSWDFPFNDINIHQHHHRHCNTSHEFEILKSPLGDVAVHEE 

ESNNNNPNFSNSESGKKETTDSGQSWSSSSSKPSVLGRGHWRPAEDVKLKELVSIYGPQN 

WNLIAEKLQGRSGKSCRLRWFNQLDPRINRRAFTEEEEERLMQAHRLYGNKWAMIARLFP 

GRTDNSVKNHWHWMARKYREHSSAYRRRKLMSNNPLKPHLTNNHHPNPNPNYHS 

HYFAQPFPEFNLTHHLTONAPITSDHNQIiATLPFHCFQGYENNEPPMWSMFGNQMMVGDN 

VGATSDALCNIPHIDPSNQEKPEPNDAMHWIGMDAVDEEVFEKAKQQPHFFDFLGLGTA* 

>G1038 (240..1574)

GCTCGTTTTCAAATTAAAAACAGGGAGAAATTTGGAAATTCCAGTACGACGGGAGATAAA 

ACCTAACATACGCCATGGTGACCGTTATCTAAACTACGCCAAAATATTTGAAGTGTCGTC 

GTTTCATAATAAAACGCAAACAAAAACCCACTCCCACTTTCTCCTTTCCAAAAAAAGAAC 

TCTCGCCACTTTCTCTGCTCTTTTCTTTCTCTCTCTCTTTCTTGTTTTCGCCGGCGATCA 

TGGAGAAAAGCGGCTTCTCTCCCGTCGGTCTAAGGGTTCTTGTCGTAGACGATGATCCAA 

CTTGGCTCAAGATTCTCGAGAAAATGCTCAAGAAGTGTTCTTACGAAGTAACGACCTGTG 

GATTAGCTAGAGAGGCTTTGAGGTTGCTGAGGGAGCGTAAAGATGGATATGATATCGTGA 

TCAGCGATGTGAACATGCCTGACATGGATGGTTTCAAGCTTCTTGAGCATGTTGGTCTTG 

AATTAGACCTCCCTGTAATAATGATGTCGGTGGACGGCGAAACAAGCCGAGTGATGAAGG 

GAGTGCACACGGGAGCTTGTGATTACCTCTTGAAGCCGATAAGAATGAAGGAGTTAAAGA 

TTATATGGCAACATGTTCTGAGAAAGAAGCTTCAAGAAGTGAGAGATATCGAAGGCTGTG 

GATACGAAGGAGGAGCGGATTGGATCACTCGATACGATGAAGCACATTTTCTTGGAGGTG 

GTGAAGATGTTTCTTTTGGGAAAAAGAGAAAAGACTTTGACTTTGAGAAGAAGCTTCTTC 

AAGATGAGAGTGATCCATCATCTTCTTCTTCCAAGAAAGCTAGAGTTGTTTGGTCTTTTG 

AGCTTCATCATAAGTTTGTCAACGCCGTTAACCAAATCGGATGCGATCACAAAGCTGGTC 

CCAAGAAGATATTGGATCTCATGAATGTTCCATGGCTCACTAGAGAAAATGTTGCAAGCC 

ACCTTCAGAAATATAGACTTTACCTGAGCAGATTAGAGAAAGGAAAGGAGCTCAAGTGTT 

ATTCAGGTGGCGTGAAGAATGCGGATTCATCTCCAAAAGATGTCGAAGTGAATTCAGGCT 

ACCAAAGCCCTGGGAGGAGCAGCTATGTATTCTCTGGAGGAAATTCTCTGATCCAAAAAG 

CAACAGAGATTGATCC7^AAGCCACTTGCTTCAGCTTCTTTGTCTGACCCCAACACCGATG 

TGATCATGCCTCCGAAAACAAAAAAGACGCGTATAGGATTTGATCCTCCCATTTCCTCCT 

CTGCGTTTGACTCTCTGCTTCCTTGGAATGATGTTCCAGAGGTCCTTGAATCGAAGCCGG 

TTCTGTATGAGAATAGCTTTCTCCAGCAACAACCATTGCCAAGTCAAAGTTCCTATGTTG 

CAATTTCTGCACCATCTCTCATGGAGGAGGAAATGAAGCCTCCTTATGAGACACCAGCAG 

GAGGCAGTAGTGTGAATGCAGATGAGTTTCTCATGCCACAAGACAAGATCCCTACTGTAA 

CCCTTCAAGATTTGGATCCCTCTGCCATGAAGCTGCAGGAGTTCAACACAGAAGGCGATT 

CTGAAGAAGCTTGAACTGGGGAACTTCCAGAATCACATCATTCTGTTTCTTTAGACACTG 

ACTTAGACTTGACTTGGCTTCAAGGCGAGCGTTTCTTGCAAACACCGACTCCAGTTTCAA 

GATACAGTAGTAGCCCATCACTCCTATCTGAGCTCCCAGCCCACCITAATTGGTATGGAA 

ATGAGCGGCTGCCTGACCCTGACGAGTATTCCTTCATGGTAGACCAAGGTTTATTCATAT 

CTTAACCTTGTTCCAATAACTTCrTTTCGTATATTGGTTGGTGTAATGCAGAAAGATTTT 

GTGGGTATACCTGAAAATAATCTTGCTTTCCC^GAACCTTCCATGATCGGATGCATTGT 

ACAATAATCCACGAGTGTCGTAGGCTAATTACACCAAACAGGTTGATGACAGTGATAAGG 

CCACATGTTTCACACCGTCGCTTAAGATCTTTACTGTCACCTGGAAGGAAA 

>G1038 Amino Acid Sequence (domain in AA coordinates: 198-247) 

MEKSGFSPVGLRVLVVDDDPTWLKILEKMLKKCSYEVTTCGLAREALRLLRERKDGYDIV 

ISDVNMPDMDGFKLLEHVGLELDLPVIMMSVDGETSRVMKGVHTGACDYLLKPIRMKELK

IIWQHVLRKKLQEVRDIEGCGYEGGADWITRYDEAHFLGGGEDVSFGKKRKDFDFEKKLL

QDESDPSSSSSKKARVVWSFELHHKFVNAVNQIGCDHKAGPKKILDLMNVPWLTRENVAS

HLQKYRLYLSRLEKGKELKCYSGGVKNADSSPKDVEVNSGYQSPGRSSYVFSGGNSLIQK 

ATEIDPKPLASASLSDPNTDVIMPPKTKKTRIGFDPPISSSAFDSLLPWNDVPEVLESKP

VLYENSFLQQQPLPSQSSYVAISAPSLMEEEMKPPYETPAGGSSVNADEFLMPQDKIPTV 

TLQDLDPSAMKLQEFNTEGDSEEA* 

>G1073 (62..874)
CCCCCCGACCTGCCTCTACAGAGACCTGAAGATTCCAGAACCCCACCTGATCAAAAATAA
CATGGAACTTAACAGATCTGAAGCAGACGAAGCAAAGGCCGAGACCACTCCCACCGGTGG 
AGCCACCAGCTCAGCCACAGCCTCTGGCTCTTCCTCCGGACGTCGTCCACGTGGTCGTCC 
TGCAGGTTCCAAAAACAAACCCAAACCTCCGACGATTATAACTAGAGATAGTCCTAACGT 
CCTTAGATCACACGTTCTTGAAGTCACCTCCGGTTCGGACATATCCGAGGCAGTCTCCAC
CTACGCCACTCGTCGCGGCTGCGGCGTTTGCATTATAAGCGGCACGGGTGCGGTCACTAA 
CGTCACGATACGGCAACCTGCGGCTCCGGCTGGTGGAGGTGTGATTACCCTGCATGGTCG 
GTTTGACATTTTGTCTTTGACCGGTACTGCGCTTCCACCGCCTGCACCACCGGGAGCAGG 
AGGTTTGACGGTGTATCTAGCCGGAGGTCAAGGACAAGTTGTAGGAGGGAATGTGGCTGG 
TTCGTTAATTGCTTCGGGACCGGTAGTGTTGATGGCTGCTTCTTTTGCA7\ACGCAGTTTA 
TGATAGGTTACCGATTGAAGAGGAAGAAACCCCACCGCCGAGAACCACCGGGGTGCAGCA 
GCAGCAGCCGGAGGCGTCTCAGTCGTCGGAGGTTACGGGGAGTGGGGCCCAGGCGTGTGA 
GTCAAACCTCCAAGGTGGAAATGGTGGAGGAGGTGTTGCTTTCTACAATCTTGGAATGAA 
TATGAACAATTTTCAATTCTCCGGGGGAGATATTTACGGTATGAGCGGCGGTAGCGGAGG 
AGGTGGTGGCGGTGCGACTAGACCCGCGTTTTAGAGTTTTAGCGTTTTGGTGACACCTTT 
TGTTGCGTTTGCGTGTTTGACCTCAAACTACTAGGCTACTAGCTATAGCGGTTGCGAAAT 
GCGAATATTAGGTT 

>G1073 Amino Acid Sequence (domain in AA coordinates: 33-42, 78-175) 

MELNRSEADEAKAETTPTGGATSSATASGSSSGRRPRGRPAGSKNKPKPPTIITRDSPNV 

LRSHVLEVTSGSDISEAVSTYATRRGCGVCIISGTGAVTNVTIRQPAAPAGGGVITLHGR

FDILSLTGTALPPPAPPGAGGLTVYLAGGQGQVVGGNVAGSLIASGPVVLMAASFANAVY

DRLPIEEEETPPPRTTGVQQQQPEASQSSEVTGSGAQACESNLQGGNGGGGVAFYNLGMN 

MNNFQFSGGDIYGMSGGSGGGGGGATRPAF* 

>G1146 (129..3095)

cttctctagcgtcactcttcttcttcattggtcggtagaataaggccaaggaagggatca 
gttttaagttttgtttcattctttttgtagtggagaaaaagagtttttgaaaatcaaaac 
aacaaaaaatgccgattaggcaaatgaaagatagctctgagactcacttagttatcaaaa 
cccaacctttaaagcaccacaatccaaaaaccgttcaaaacggtaaaatccctcctcctt 
ctccttctccggtgacggtgactactccggcgacggttactcagagtcaagcttcttcac 
cttcaccaccgtcaaagaatcgtagccggaggagaaaccgtggtggaagaaaatctgatc 
aaggagatgtttgtatgagacctagctctcgtcctcgtaaaccgccaccgccaagtcaaa 
ccacttcctccgccgtctccgtcgccaccgccggtgagattgtcgctgtgaatcatcaga 
tgcagatgggtgttcgtaaaaactcaaactttgctccaagacctggatttggaacacttg 
gaactaaatgcattgttaaagctaaccactttctcgctgatttgcctaccaaggatttga 
atcagtatgatgttacaattactcctgaagtgtcatcaaagagtgttaacagagctataa 
ttgctgagttagttagactttacaaagagtctgatctcgggaggagacttccggcttacg 
atggccggaaaagtctttacactgctggagaacttccttttacttggaaggagttcagtg 
ttaagattgttgatgaagatgacggtatcatcaatggccctaaaagggagagatcatata 
aggtggcaatcaagtttgttgcacgggcaaatatgcatcacttaggcgagtttctagctg
gtaaacgggcagattgtccgcaagaggcggtgcagattcttgatattgtactcagggagt 
tgtcggttaagaggttttgtcccgttggaagatctttcttttcgcctgatattaaaacac 
cgcagcgactcggtgaagggttagagtcatggtgtgggttttaccagagtattagaccaa 
ctcaaatgggtttatcactaaatatcgatatggcttcagctgcattcatcgagcctcttc 
cagtgatagagtttgtagcacagcttcttggaaaggatgtcttgtcgaagccattgtcgg 
attctgatcgcgtcaagattaagaagggtcttagaggagtgaaagtagaggttactcaca 
gagcgaatgtaagaaggaaataccgtgttgcgggtttaacaactcaaccaacaagagagc 
taatgtttccagtagatgagaactgtactatgaagtcagttattgagtatttccaagaga 
tgtatggattcacgatccagcacacgcatttgccatgtctccaagttggaaaccaaaaga 
aggcaagctatttgccgatggaggcatgcaaaattgtcgagggacaacggtacacgaaaa 
ggttgaatgagaagcagattactgctctcttgaaagttacatgccaaagggccgagggac 
agagaaacgatattttgcggactgtccaacacaacgcatatgatcaagatccatatgcaa 
aggagtttggcatgaacataagcgaaaagttagcttctgttgaagctcgtattcttccag
ctccatggcttaagtatcacgagaacgggaaagaaaaagattgtctcccgcaagttggtc 
agtggaatatgatgaacaagaaaatgatcaacgggatgactgtgagcagatgggcctgtg 
ttaacttctcacgcagcgttcaagaaaacgttgctcgtggattttgtaatgaacttggtc 
agatgtgtgaagtctcaggcatggagtttaatccagaacccgtgataccaatatatagtg 
cgaggcccgatcaagtcgagaaagctctaaagcatgtttatcacacttcaatgaacaaaa 
ccaaaggcaaagagttagagcttctgctggcaatattacctgataacaacggttcacttt

atggtgatcttaagagaatctgtgaaaccgagcttggtttgatatctcaatgttgtctca 

caaaacatgtgttcaagattagcaaacagtatctggcagatgtatcccttaaaatcaacg 

taaagatgggaggaaggaacacagttctagtagacgccataagctgtagaattccactgg 

ttagcgatataccgacaatcatttttggcgcagacgtgactcacccagagaacggggaag 

agtcaagcccttcaatcgctgctgttgttgcttctcaagactggcctgaagtgacaaaat 

atgcgggtttagtttgtgctcaagctcacaggcaagaacttatacaagatttgtataaaa 

catggcaagatcctgttcgcggtactgttagtggcggtatgatcagggaccttcttatct 

catttagaaaagcaacagggcaaaaaccgcttcgaattatcttttatcgtgatggagtaa 

gcgaagggcaattctatcaagttttactctatgagttggatgcaattcgaaaggcttgtg 

catcgcttgaaccgaattatcagccaccggtgacattcatagttgtacagaagcgtcacc 

acactcgtttgtttgctaataatcaccgagacaaaaacagtactgaccgaagcggaaata 

tcttaccaggtactgtagttgacactaaaatatgtcatccaactgaattcgacttctacc 

tttgtagccatgcgggtattcagggaacaagcaggcctgcacattaccatgttctttggg 

acgagaacaatttcacagcagatggtattcaatctctgactaacaatctctgttatacct 

atgcgcggtgcactcggtcggtctctatagttcctccagcgtattatgctcatcttgcag 

catttcgagcacgtttctacctggaacctgagataatgcaagacaacggatcaccgggta 

aaaagaacacgaaaacaacaactgtcggagacgtaggtgtgaagcctttaccagccttga 

aggagaatgtgaagagagtaatgttctactgctaaaaatccaaacattccttaatcagtt 

ttaataagtagtttggttgtttgcttgtagttcggctttagatttaccaatgtttttctt 

atgtaaattttgtcggtttggtttaagcctttaggaattagtgtattagggtttttctaa 

agttgtactttagctgatgataacgttgatgcagtgactttgttaaaacctcctcttcta 

cagtagtgtttacgtcgttcctc 

>G1146 Amino Acid Sequence (domain in AA coordinates: 886-896) 

MPIRQMKDSSETHLVIKTQPLKHHNPKTVQNGKIPPPSPSPVTVTTPATVTQSQASSPSP 

PSKNRSRRRNRGGRKSDQGDVCMRPSSRPRKPPPPSQTTSSAVSVATAGEIVAVNHQMQM 

GVRKNSNFAPRPGFGTLGTKCIVKANHFLADLPTKDLNQYDVTITPEVSSKSVNRAIIAE 

LVRLYKESDLGRRLPAYDGRKSLYTAGELPFTWKEFSVKIVDEDDGIINGPKRERSYKVA 

IKFVARANMHHLGEFLAGKRADCPQEAVQILDIVLRELSVKRFCPVGRSFFSPDIKTPQR 

LGEGLESWCGFYQSIRPTQMGLSLNIDMASAAFIEPIiPVIEFVAQLLGKDVLSKPLSDSD 

RVKIKKGLRGVKVEVTHRANVRRKYRVAGM 

FTIQHTHLPCLQVGNQKKASYLPMEACKIVEGQRYTKRLNEKQITALLKVTCQRAEGQRN 
DILRTVQHNAYDQDPYAKEFGMNISEKLASVEARILPAPWLKYHENGKEKDCLPQVGQWN
MMNKKMINGMTVSRWACVNFSRSVQENVARGFCNELGQMCEVSGMEFNPEPVIPIYSARP
DQVEKALKHVYHTSMNKTKGKELELLLAILPDNNGSLYGDLKRICETELGLISQCCLTKH
VFKISKQYLADVSLKINVKMGGRNTVLVDAISCRIPLVSDIPTIIFGADVTHPENGEESS
PSIAAVVASQDWPEVTKYAGLVCAQAHRQELIQDLYKTWQDPVRGTVSGGMIRDLLISFR
KATGQKPLRIIFYRDGVSEGQFYQVLLYELDAIRKACASLEPNYQPPVTFIVVQKRHHTR
LFANNHRDKNSTDRSGNILPGTVVDTKI^^ 

NFTADGIQSLTIWLCYTTARCTRSVSIVPPAYYAHI^FRARFYLEPEIMQDNGSPGKKN 

TKTTTVGDVGVKPLPALKENVKRVMFYC* 

>G1267 (152..967)

AAGTAGAGAATAATAATCACATCAAGATTGTTTATAACCCTCCCCNTAATCACCTTCTTA 
NTNACCACCCTCTCCGGCTCTCAACAGAACAACAACAAA 

TTCCGGCGAAATCGGACGGTCGAGATCAATCATGCATCGTAGAGCAGC^ATTCAAGAATC 
GGATGACGAAGAAGATGAGACTTACAACGACGTCGTTCCTGAATCTCCTTCTTCTTGTGA 
AGACTCAAAGATCTCAAAACCAACTCCAAAGAAAAGGAGGAACGTAGAGAAGAGAGTTGT 
CTCAGTTCCGATAGCTGACGTGGAAGGATCTAAGAGCAGAGGCGAAGTATATCCACCGTC 
CGATTCATGGGCCTGGAGAAAGTACGGACAAAAACCGATCAAAGGCTCGCCTTATCCCAG 
GGGATATTACAGATGTAGTAGCTCAAAAGGATGTCCGGCGAGGAAGCAGGTGGAGAGAAG 
CCGTGTGGACCCTTCTAAGCTTATGATTACTTACGCCTGCGACCACAATCACCCTTTCCC 
TTCCTCCTCCGCTAACACCA^TCCCACCACCGCTCCTCCGTCGTCCTCAAAACCGCAAA 
GAAAGAGGAAGAATACGAAGAGGAGGAAGAAGAACTAACCGTCACCGCCGCAGAGGAACC 
ACCGGCGGGACTTGATCTAAGCCACGTAGACTCACCGTTGCTATTAGGCGGCTGCTACAG 
CGAAATCGGAGAGTTCGGGTGGTTCTACGACGCGTCGATCTCATCATCATCTGGTTCTTC 
GAATTTCCTCGACGTAACTCTAGAGAGAGGTTTTTCAGTAGGCCAAGAGGAAGATGAGTC 
TTTGTTCGGTGATCTCGGTGATTTACCTGATTGCGCCTCCGTGTTCCGCCGTGGGACTGT 
TGCGACGGAGGAGCAACATCGAAGATGTGATTTTGGCGCCATTCCTTTCTGTGATAGTTC

TAGATGAGTTTGTGTGTGTAGCCAAAACCAAAAGAAAAAAACACAATTTTTTTATTTTCC 

ACTGTAAAGGTGTATCAATGGTGGATTCATTTTTTTAAAAAAAAAAAAAAAA 

>G1267 Amino Acid Sequence (domain in AA coordinates: 70-127) 

MHRRAAIQESDDEEDETYNDWPESPSSCEDSKISKPTPKKRRNVEKRWSVPIADVEGS 

KSRGEVYPPSDSWAWRKYGQKPIKGSPYPRGYYRCSSSKGCPARKQVERSRVDPSKLMIT 

YACDHNHPFPSSSAWTKSHHRSSWLKTAKKEEEYEEEEEELTVTAAEEPPAGLDLSHVD 

SPLLLGGCYSEIGEFGWFYDASISSSSGSSNFLDVTLERGFSVGQEEDESLFGDLGDLPD 

CASVFRRGTVATEEQHRRCDFGAIPFCDSSR* 

>G1269 (88..951)

AACAATTCTCTCTCTCTTTATTCTTCTTCTTCAGCTTCAGATTTCAGATCTTAAATCTTC 
AAGTCTTCTTCTTCTTCTTCTGCAACCATGGCTATGCAGGAACGTTGTGAGAGTTTATGT
TCTGATGAACTTATATCTTCCTCAGATGCCTTTTACCTCAAGACAAGAAAGCCTTATACC 
ATCACTAAACAAAGAGAGAAATGGACAGAAGCAGAGCATGAGAAGTTTGTAGAAGCATTG 
AAACTCTATGGCAGAGCTTGGAGACGAATCGAAGAACATGTTGGAACAAAAACTGCAGTT 
CAGATTCGAAGCCATGCGCAGAAGTTCTTTACTAAGGTTGCTCGCGATTTTGGTGTTAGC 
TCTGAGTCCATTGAGATCCCGCCTCCAAGGCCAAAGAGAAAGCCGATGCATCCTTACCCT 
AGAAAGCTTGTGATTCCTGATGCAAAAGAGATGGTATACGCTGAACTAACCGGATCCAAG 
CTGATTCAGGATGAAGATAACCGATCTCCAACATCGGTTTTATCAGCTCATGGCTCAGAT 
GGATTAGGTTCCATTGGTTCAAATTCACCTAACTCTTCTTCAGCTGAGTTATCATCTCAC 
ACAGAGGAATCATTGTCTCTAGAAGCAGAGACCAAACAGAGCCTTAAGCTCTTTGGAAAA 
ACTTTTGTAGTTGGTGATTACAACTCTTCAATGAGTTGTGATGATTCTGT^AGATGGC/^AG 
AAGAAGCTATACTCAGAAACACAGTCTCTTCAATGTTCTTCTTCTACTTCAGAAAACGCT 
GAAACAGAAGTGGTAGTGTCGGAGTTCAAAAGAAGTGAGAGATCAGCTTTCTCTCAGTTA 
AAATCGTCGGTGACTGAGATGAACAACATGAGAGGGTTCATGCCTTACAAAAAGAGAGTA 
AAGGTGGAAGAAAACATTGACT^ATGTAAAATTATCATATCCTTTGTGGTGAAGTGTTCGT 
TTGTGTCAAGTCAGTTGTGTAAACTCTTTTGATCTCAACATCAGATTATGTGTATAATGT 

CTTTCCATATAACCAGTTAGAAATTGAGATCCTTGTACTTAAACATTTTTATTTGATCAA 
TCAAATCTTCTTGATGAAAAAAAAAA 

>G1269 Amino Acid Sequence (domain in AA coordinates: 27-83) 

MAMQERCESLCSDELISSSDAFYLKTRKPYTITKQREKWTEAEHEKFVEALKLYGRAWRR

IEEHVGTKTAVQIRSHAQKFFTKVARDFGVSSESIEIPPPRPKRKPMHPYPRKLVIPDAK 

EMVYAELTGSKLIQDEDNRSPTSVLSAHGSDGLGSIGSNSPNSSSAELSSHTEESLSLEA 

ETKQSLKLFGKTFWGDYWSSMSCDDSEDGKKKLYSETQSLQCSSSTSENAETEVVVSEF 

KRSERSAFSQLKSSVTEMNNMRGFMPYKKRVKTOENIDNVKLSYPLW* 

>G1452 (175..1296)

ATTTATTAAGCATCAATGAGAGAACTTCAGAGCTGGGTTTGAGTTCTGTCCAATAATACA 

TAACCACGTTATCZATTTTTGTCCTTTACTATCTCATTACACTCTTCTGTTATTCGCCCAA 

TTCTTACAGTCATTACTCTCTATAGGGCTCGAGCGGCCGCCCGGGCAGGTTTCTATGCAG 

ATGGTTC^CACTTCCCGCTCCATTGCCCAGATTGGGTTCGGTGTTAAGTCGCAATTAGTA 

CTCACTATAGGGCTCGAGCGGCCGCCCGGGCAGGTATU^GATCAAACAATGTCTAAAGAA 

GCTGAGATGTCGATCGCGGTGTCGGCTTTGTTCCCTGGTTTTAGATTCTCTCCTACTGAT 

GTTGAACTTATCTCGTACTATCTTCGTCGTAAAATCGATGGTGATGAGAACTCTGTTGCT 

GTGATTGCTGAGGTCGAGATTTACAAGTTCGAGCCGTGGGACTTGCCAGAGGAATCGAAA 

CTGAAATCGGAGAACGAGTGGTTTTACTTCTGCGCGAGGGGGAGGAAGTACCCGCACGGG 

TCACAAAGCCGGCGAGCCACACAGCTAGGATATTGGAAAGCGACCGGTAAAGAGCGGAGT

GTTAAATCCGGGAACCAAGTTGTTGGAACCAAGAGAACGCTTGTATTTCATATCGGTCGG 

GCTCCTCGTGGCGAGAGAACGGAGTGGATTATGCATGAATACTGCATCCATGGAGCCCCA 

CAGGATGCATTAGTGGTGTGCCGGTTAAGAAAAAATGCTGATTTTCGGGCTAGTTCGACC 

CAAAAAATTGAGGATGGTGTTGTGCAAGACGATGGCTACGTTGGCCAAAGAGGTGGTTTG 

GAC^GGAGGAC^^TCCTACTATGAATCTGAGCATCAGATACCAAATGGTGAC^TCG^ 

GAATCATCAAATGTTGTTGAGGATCAGGCCGATACCGATGATGATTGTTACGCCGAGATT 

CTGAACGATGATATAATAAAGCTCGACGAAGAAGCGTTGAAAGCTAGCCAAGCGTTTCGA 

CCAACTAATCCAACTCATCAAGAAACAATATCAAGCGAGTCATCGAGTT^AGAGGTCAAAA 

TGTGGTATAAAAAAAGAATCAACGGAAACAATGAATTGTTACGCTTTGTTCAGGATCAAG 

AACGTTGCCGGAACCGAOTCmGCTGGAGATTCCCGAACCCGTTCAAAATCAAGAAAGAT 
GATAGCCAGAGATTGATGAAGAATGTTCTGGCCACTACTGTTTTCTTGGCTATCTTATTT

TCTTTCTTTTGGACTGTATTAATAGCTAGGAACTAAAGCTAGTTACGACATACATATTAT 

TTATACATAAATAAATATAGTATTTTGTCTATGGCAAAAAAAAAAAAAAAA

>G1452 Amino Acid Sequence (domain in AA coordinates: 30-177) 

MQMVHTSRSIAQIGFGVKSQLVLTIGLERPPGQVKDQTMSKEAEMSIAVSALFPGFRFSP

TDVELISYYLRRKIDGDENSVAVIAEVEIYKFEPWDLPEESKLKSENEWFYFCARGRKYP

HGSQSRRATQLGYWKATGKERSVKSGNQWGTKRTLVFHIGRAPRGERTEWIMHEYCIHG 

APQDALWCRLRKNADFRASSTQKIEDGWQDDGYVGQRGGLDKEDKSYYESEHQIPNGD 

IAESSNVVEDQADTDDDCYAEILNDDIIKLDEEALKASQAFRPTNPTHQETISSESSSKR

SKCGIKKESTETMNCYALFRIKOTAGTDSSWRFPNPFKIK^ 

LFSFFWTVLIARN* 

>G1494 (114..1406)

TCGACAGAGTTGTGTTGGGCGTGGAACTTGGACTAGTTCCACATATCAGGTTATATAGAT 
CTTCTCTTTCAACTTCTGATTCGTCCAGAAGCTTTCCTAATCTGAGATCTGACATGGAAC 
ACCAAGGTTGGAGTTTTGAGGAGAATTATAGTTTGTCCACTAATAGAAGATCTATCAGGC 
CACAAGATGAACTAGTGGAGTTATTATGGCGAGATGGACAAGTGGTTCTGCAGAGCCAAA 
CTCATAGAGAACAAACCCAAACCCAGAAACAAGATCATCATGAAGAAGCCCTAAGATCCA
GCACCTTTCTTGAAGATCAAGAAACTGTCTCTTGGATCCAATACCCTCCAGATGAAGACC 
CATTCGAACCCGACGACTTCTCCTCCCACTTCTTCTCAACCATGGATCCCCTCCAGAGAC 
CAACCTCAGAGACGGTTAAGCCTAAGTCCAGTCCTGAACCTCCTCAAGTCATGGTTAAGC 
CTAAGGCCTGTCCTGACCCTCCTCCTCAAGTCATGCCTCCTCCAAAATTTAGGTTAACAA 
ATTCATCATCGGGGATTAGGGAAACAGAAATGGAACAGTACTCGGTAACGACCGTTGGAC 
CTAGCCATTGCGGAAGCAACCCATCACAGAACGATCTCGATGTCTCAATGAGTCATGATC 
GAAGCAAAAACATAGAAGAAAAGCTTAATCCGAACGCAAGTTCCTCATCAGGTGGCTCCT 
CTGGTTGCAGCTTTGGCAAAGATATCAAAGAAATGGCTAGTGGAAGATGCATCACAACCG 
ACCGTAAGAGAAAACGTATAAATCACACTGACGAATCTGTATCTCTATCAGATGCAATCG 
GTAACAAGTCGAACCAACGATCAGGATCAAACCGAAGGAGTCGAGCAGCTGAAGTTCATA 
ATCTCTCCGAAAGGAGGAGGAGAGATAGGATCAATGAGAGAATGAAGGCTTTGCAAGAAC 
TAATACCTCACTGCAGTAAAACTGATAAAGCTTCGATTTTAGACGAAGCCATAGATTATT 
TGAAATCACTTCAGTTACAGCTTCAAGTGATGTGGATGGGGAGTGGAATGGCGGCGGCGG 
CGGCTTCGGCTCCGATGATGTTCCCCGGAGTTCAACCTCAGCAGTTCATACGTCAGATAC 
AGAGCCCGGTACAGTTACCTCGATTTCCGGTTATGGATCAGTCTGC^ATTCAGAACAATC 
CCGGTTTAGTTTGCCAAAACCCGGTACAAAACCAGATCATCTCCGACCGGTTTGCTAGAT 
ACATCGGTGGGTTCCCACACATGCAGGCCGCGACTCAGATGCAGCCGATGGAGATGTTGA 
GATTTAGTTCACCGGCGGGACAGCAT^AGTCT^ACAACCGTCGTCTGTGCCGACGAAGACCA 
CCGACGGTTCTCGTTTGGACCACTAGGTTGGTGAGCCACTTTGC 

>G1494 Amino Acid Sequence (domain in aa coordinates: 261-311) 

MEHQGWSFEENYSLSTNRRSIRPQDELVELLWRDGQVVLQSQTHREQTQTQKQDHHEEAL 

RSSTFLEDQETVSWIQYPPDEDPFEPDDFSSHFFSTMDPLQRPTSETVKPKSSPEPPQVM 

WPKACPDPPPQVMPPPKFRLTNSSSGIRETEMEQYSVTTVGPSHCGSNPSQNDLDVSMS 

HDRSKNIEEKLNPNASSSSGGSSGCSFGKDIKEMASGRCITTDRKRKRINHTDESVSLSD 

AIGNKSNQRSGSNRRSRAAEVHNLSERRRRDRINERM 

DYLKSLQLQLQVMWMGSGMAAAAASAPMMFPGVQPQQFIRQIQSPVQLPRFPVMDQSAIQ
NNPGLVCQNPVQNQIISDRFARYIGGFPHMQAATQMQPMEMLRFSSPAGQQSQQPSSVPT
KTTDGSRLDH* 
>G1548 (1..2511) 

ATGGCAATGTCTTGCAAGGATGGTAAGTTGGGATGTTTGGATAATGGGAAGTATGTGAGG 

TATACACCTGAACAAGTTGAAGCACTTGAGAGGCTTTATCATGACTGTCCTAAACCGAGT 

TCTATTCGCCGTCAGCAGTTGATCAGAGAGTGTCCTATTCTCTCTAACATTGAGCCTAAA 

CAGATCAAAGTGTGGTTTCAGAACCGAAGATGTAGAGAGAAACAAAGGAAAGAGGCTTCA 

CGGCTTCAAGCTGTGAATCGGAAGTTGACGGCAATGAACAAGCTCTTGATGGAGGAGAAT 

GACAGGTTGCAGAAGCAAGTGTCACAGCTGGTCCATGAAT^CAGCTACTTCCGTCAACAT 

ACTCCAAATCCTTCACTCCCAGCTAAAGACACAAGCTC 

CAGCACCAATTGGCATCTGAAAATCCTCAGAGAGA 

ATTGCAGAAGAAACTTTAGCAGAGTTTCTTTCAAAGGCAACTGGAACCGCTGTTGAGTGG 
GTTC^GATGCCTGGAATGAAGCCTGGTCCGGATTCCATTGGAATCATCGCTATTTCTCAT 
GGTTGCACTGGTGTGGCAGCACGCGCCTGTGGCCTAGTGGGTCTTGAGCCTACAAGGGTT 
GCAGAGATTGTCAAGGATCGTCCTTCGTGGTTCCGCGAATGTCGAGCTGTTGAAGTTATG

AACGTGTTGCCAACTGCCAATGGTGGAACCGTTGAGCTGCTTTATATGCAGCTCTATGCA 

CCAACTACATTGGCCCCACCACGCGATTTCTGGCTGTTACGTTACACCTCTGTTTTAGAA 

GATGGCAGCCTTGTGGTGTGCGAGAGATCTCTTAAGAGCACTCAAAATGGTCCTAGTATG 

CCACTGGTTCAGAATTTTGTGAGAGCAGAGATGCTTTCCAGTGGGTACTTGATACGGCCT 

TGTGATGGTGGTGGCTCAATCATACACATAGTGGATCATATGGATTTGGAGGCTTGTAGC 

GTGCCTGAGGTCTTGCGCCCGCTCTATGAGTCACCCAAAGTACTTGCACAGAAGACAACA 

ATGGCGGCACTGCGTCAGCTCAAGCAAATAGCTCAGGAGGTTACTCAGACTAATAGTAGT 

GTTAATGGGTGGGGACGGCGTCCTGCTGCCTTAAGAGCTCTCAGCCAGAGGCTAAGCAGA 

GGCTTCAATGAAGCTGTAAATGGTTTCACTGATGAAGGATGGTCAGTGATAGGAGATAGC 

ATGGATGATGTCACAATCACTGTAAACTCTTCTCCAGACAAGCTAATGGGTCTAAATCTT 

ACATTTGCCAATGGCTTTGCTCCTGTAAGCAATGTTGTTTTATGCGCAAAAGCATCAATG 

CTTTTACAGAATGTTCCTCCGGCGATCCTGCTTCGGTTTCTGAGGGAGCATAGGTCAGAA 

TGGGCTGACAACAACATTGATGCGTATCTAGCAGCAGCAGTTAAAGTAGGGCCTTGTAGT 

GCCCGAGTTGGAGGATTTGGAGGGCAGGTTATACTTCCACTTGCTCATACTATTGAGCAT 

GAAGAGTTTATGGAAGTCATCAAATTGGAAGGTCTTGGTCATTCCCCTGAAGATGCAATC 

GTTCCAAGAGATATCTTCCTTCTTCAACTTTGTAGCGGAATGGATGAAAATGCTGTAGGA

ACCTGTGCGGAACTTATATTTGCTCCAATCGATGCTTCGTTTGCGGATGATGCACCTCTG 

CTTCCTTCTGGTTTTCGTATTATCCCTCTTGATTCCGCAAAGGAAGTATCTAGCCCAAAC 

CGAACCTTGGATCTTGCTTCGGCACTGGAAATTGGTTCAGCTGGAACAAAAGCCTCAACT 

GATCAATCAGGAAACTCCACATGTGCAAGATCTGTGATGACAATAGCATTTGAGTTTGGT 

ATCGAGAGCCATATGCAAGAACATGTAGCATCCATGGCTAGGCAGTATGTTCGAGGTATC 

ATATCATCGGTGCAGAGAGTAGCATTGGCTCTTTCTCCTTCTCATATCAGCTCACAAGTT 

GGTCTACGCACTCCTTTGGGTACTCCTGAAGCCCAAACACTTGCTCGTTGGATTTGCCAG 

AGTTACAGGGGCTACATGGGTGTTGAGCTACTTAAATCAAACAGTGACGGCAATGAATCT 

ATTCTTAAGAATCTTTGGCATCACACTGATGCTATAATCTGCTGCTCAATGAAGGCCTTG 

CCCGTCTTCACATTTGCAAACCAGGCGGGACTTGACATGCTGGAGACTACATTAGTTGCT 

CTTCAAGACATCTCTTTAGAGAAGATATTTGATGACAATGGAAGAAAGACTCTTTGCTCT 

GAGTTCCCACAGATCATGCAACAGGGCTTCGCGTGCCTTCAAGGCGGGATATGTCTCTCA 

AGCATGGGGAGACCAGTTTCGTATGAGAGAGCAGTTGCTTGGAAAGTACTCAATGAAGAA 

GAAAATGCTCATTGCATCTGCTTTGTGTTCATCAATTGGTCCTTTGTGTGA

>G1548 Amino Acid Sequence (domain in AA coordinates: 17-77) 

MAMSCKDGKLGCLDNGKYVRYTPEQVEALERLYHDCPKPSSIRRQQLIRECPILSNIEPK 

QIKVWFQNRRCREKQRKEASRLQAVNRKLTAMNKLLMEENDRLQKQVSQLVHENSYFRQH

TPNPSLPAKDTSCESWTSGQHQLASQNPQRDASPAGLLSIAEETLAEFLSKATGTAVEW 

VQMPGMKPGPDSIGIIAISHGCTGVAARACGLVGLEPTRVAEIVKDRPSWFRECRAVEVM 

NVLPTANGGTVELLYMQLYAPTTLAPPRDFWLLRYTSVLEDGSLVVCERSLKSTQNGPSM

PLVQNFVRAEMLSSGYLIRPCDGGGSIIHIVDHMDLEACSVPEVLRPLYESPKVLAQKTT

MAALRQLKQIAQEVTQTNSSVNGWGRRPAALRALSQRLSRGFNEAVNGFTDEGWSVIGDS

MDDVTITVNSSPDKLMGLNLTFANGFAPVSNVVLCAKASMLLQNVPPAILLRFLREHRSE

WADNNIDAYLAAAVKVGPCSARVGGFGGQVILPLAHTIEHEEFMEVIKLEGLGHSPEDAI 
VPRDIFLLQLCSGMDENAVGTCAELIFAPIDASFADDAPLLPSGFRIIPLDSAKEVSSPN
RTLDLASALEIGSAGTKASTDQSGNSTCARSVMTIAFEFGIESHMQEHVASMARQYVRGI
ISSVQRVALALSPSHISSQVGLRTPLGTPEAQTLARWICQSYRGYMGVELLKSNSDGNES
ILKNLWHHTDAIICCSMKALPVFTFANQAGLDMLETTLVALQDISLEKIFDDNGRKTLCS
EFPQIMQQGFACLQGGICLSSMGRPVSYERAVAWKVLNEEENAHCICFVFINWSFV* 
>G1574 (1..1962) 

ATGGATGATACAATGGACATGAGTTCAGGTAGTGATGAAGAAGTACAAGAAGAGAAGACC
ACTGTTAACGAGAGGGTCATCTATCAGGCTGC^TTACAAGATCTGAAGCAACCC^AGACC 
GAAAAGGATCTACCTCCTGGTGTTCTTACAGTTCCrrCTTATGAGGCATCAGAAAATTGCA 
TTGAACTGGATGCGTAAGAAAGAAAAAAGAAGCAGGCACTGTTTGGGAGGGATATTAGCA
GATGATCAGGGACTTGGTAAAACGATCTCGACGATCTCTCTTATCCTGTTACAAAAGTTG
AAGTCACAATCAAAGCAGAGAAAGCGAAAAGGTCAAAACTCTGGTGGTACATTGATTGTT 
TGTCCAGCAAGTGTTGTAAAAC^TGGGCAAGAGAAGTTAAAGAGAAGGTTTCTGATGAA 
CACAAACTCTCTGTTTTAGTCCACCATGGATCTC^CAGAACCAAAGATCCAACAGAAAT^ 
GCAATATATGATGTGGTCATGACAACTTACGCCATTGTTACAAATGAAGTTCCACAAAAC
CCTATGCTGAATCGTTATGATAGTATGAGAGGCAGAGAAAGCCTTGACGGATCGAGTTTG 
ATTCAGCCTCACGTTGGTGCACTAGGAAGAGTTAGGTGGTTGAGAGTAGTATTAGATGAA
GCTCATACAATTAAAAACCATAGAACCCTAATTGCAAAAGCTTGTTTTAGCCTTAGAGCC 
AAAAGGAGATGGTGTTTGACTGGAACGCCGATAAAGAACAAAGTAGACGATCTTTATAGC 
TATTTCAGATTTCTTAGATATCATCCATATGCCATGTGCAATTCATTTCACCAAAGAATC
AAAGCTCCAATTGATAAAAAGCCTCTTCATGGTTACAAGAAGCTTCAAGCTATTCTAAGG
GGTATAATGTTGCGCCGCACCAAAGAATGGTCTTTCTACAGGAAGCTTGAATTGAATTCA
CGTTGGAAGTTTGAGGAATATGCTGCTGATGGGACTTTGCATGAACACATGGCTTATCTT
TTGGTGATGCTTTTGCGACTACGCCAAGCTTGTAACCATCCACAACTTGTTAACGGATAT 
AGTCACTCAGATACTACAAGAAAAATGTCAGATGGAGTTCGAGTAGCCCCTAGAGAGAAT 
CTAATCATGTTCCTCGATCTCTTGAAATTATCCTCAACCACCTGCTCTGTTTGTAGTGAT 
CCACCAAAAGACCCTGTTGTTACTTTGTGTGGCCATGTGTTTTGTTATGAGTGTGTGTCT 
GTAAACATTAACGGGGATAACAATACGTGCCCTGCACTTAATTGCCACAGCCAGCTTAAA 
CATGATGTTGTTTTCACTGAATCTGCAGTTAGAAGTTGCATCAACGATTATGATGATCCT 
GAAGATAAAAATGCTTTAGTTGCATCAAGGCGAGTTTATTTCATCGAAAATCCGAGCTGT 
GATAGAGATTCTTCAGTCGCTTGCAGAGCAAGGCAGTCCAGACACTCCACCAATAAAGAC 
AATAGTATCAGTGGACTGAATCTCATTTTTACGTTTCTCAAAGACAAATGTAATGATTAT 
GAAACAGGTGCGATGTTGATGTCTCTTAAAGCTGGAAACCTTGGATTGAATATGGTAGCT 
GCAAGTCATGTCATTCTACTGGACCTATGGTGGAATCCAACAACAGAGGATCAAGCTATT 
GATCGAGCTCATCGTATCGGACAAACTCGAGCTGTTACGGTCACTCGTATTGCCATCAAA 
AATACCGTTGAGGAACGAATTTTGACTCTTCATGAACGTAAAAGGAACATTGTTGCATCT 
GCATTGGGTGAAAAAAACTGGCAAAAGTTCTGCGATTCAACTAACACTAGAAGATCTCGA 
ATATCTGTTTTTTGGTGTGTAGAATATCCCAGAGTTTTTATTGATAAGAGGAATAAAACC 
TTTAGCTATTTAATAAGTCACAAGTGTGAATGTAATGAATAA 

>G1574 Amino Acid Sequence (domain in AA coordinates: 28-350) 
MDDTMDMSSGSDEEVQEEKTTVNERVIYQAALQDLKQP

LNWMRKKEKRSRHCLGGILADDQGLGKTISTISLILLQKLKSQSKQRKRKGQNSGGTLIV 

CPASVVKQWAREVKEKVSDEHKLSVLVHHGSHRTKDPTEIAIYDVVMTTYAIVTNEVPQN

PMLNRYDSMRGRESLDGSSLIQPHVGALGRVRWLRVVLDEAHTIKNHRTLIAKACFSLRA

KRRWCLTGTPIKNKVDDLYSYFRFLRYHPYAMCNSFHQRIKAPIDKKPLHGYKKLQAILR

GIMLRRTKEWSFYRKLELNSRWKFEEYAADGTLHEHMAYLLVMLLRLRQACNHPQLVNGY

SHSDTTRKMSDGVRVAPRENLIMFLDLLKLSSTTCSVCSDPPKDPVVTLCGHVFCYECVS

VNINGDNNTCPALNCHSQLKHDVVFTESAVRSCINDYDDPEDKNALVASRRVYFIENPSC
DRDSSVACRARQSRHSTNKDNSISGLNLIFTFLKDKCNDYETGAMLMSLKAGNLGLNMVA
ASHVILLDLWWNPTTEDQAIDRAHRIGQTRAVTVTRIAIKNTVEERILTLHERKRNIVAS
ALGEKNWQKFCDSTNTRRSRISVFWCVEYPRVFIDKRNKTFSYLISHKCECNE* 
>G1586 (1..807) 

ATGAATCAAGAAGGTGCTTCACATAGCCCATCCTCCACTTCCACCGAACCAGTCCGGGCA 

CGTTGGTCACCTAAACCGGAGCAAATCTTGATACTCGAATCCATCTTCAACAGTGGTACT 

GTTAACCCACCAAAAGATGAAACGGTGAGGATAAGAAAGATGCTTGAGAAATTCGGTGCT 

GTGGGAGACGCAAACGTCTTCTACTGGTTTCAAAACCGACGGTCAAGATCTCGCCGGAGA 

CACCGGCAGCTTTTAGCAGCCACCACCGCAGCCGCCACCTCCATAGGAGCTGAAGACCAC 

CAGCACATGACGGCCATGAGCATGCATCAATATCCTTGCAGCAACAACGAGATTGATTT^ 

GGGTTTGGAAGTTGTAGCAACTTATCAGCTAATTACTTCCTTAATGGATCGTCGTCATCT 

CAAATCCCTTCCTTTTTCCTCGGCCTCTCTTCTTCAAGTGGTGGGTGTGAGAACAACAAT 

GGTATGGAGAATCTCTTCAAAATGTATGGCCATGAATCTGATCATAATCATCAGCAGCAG 

CATCATAGCTCAAATGCTGCATCAGTTTTAAACCCATCTGATCAAAACTCCAACTCCCAA

TACGAACAAGAAGGGTTTATGACGGTGTTTATAAACGGAGTTCCTATGGAAGTAACAAAA 

GGAGCAATAGACATGAAAACAATGTTCGGTGATGATTCGGTGTTACTTCATTCCTCTGGT 

CTTCCTCTTCCCACTGATGAGTTTGGTTTCTTGA^ 

TATTTCCTGGTACCGAGACAGACATGA 

>G1586 Amino Acid Sequence (domain in AA coordinates: 21-81) 

MNQEGASHSPSSTSTEPVRARWSPKPEQILILESIFNSGTVNPPKDETVRIRKMLEKFGA 

VGDANVFYWFQNRRSRSRRRHRQLLAATTAAATSIGAEDHQHMTAMSMHQYPCSNNEIDL

GFGSCSNLSANYFLNGSSSSQIPSFFLGLSSSSGGCENNNGMENLFKMYGHESDHNHQQQ

HHSSNAASVLNPSDQNSNSQYEQEGFMTVFINGVPMEVTKGAIDMKTMFGDDSVLLHSSG

LPLPTDEFGFLMHSLQHGQTYFLVPRQT* 

>G1786 (1..1170) 
ATGATCGTGTACGGTGGGGGAGCATCCGAGGACGGTGAAGGTGGAGGGGTGGTTCTGAAG
AAAGGGCCATGGACGGTGGCCGAGGACGAGACACTGGCGGCTTACGTACGGGAATACGGT 
GAAGGGAACTGGAATTCTGTTCAGAAGAAGACATGGCTGGCTAGGTGTGGCAAGAGCTGC
CGCCTCCGCTGGGCTAACCACTTACGACCTAATCTCAGGAAAGGCTCCTTCACCCCCGAG 
GAAGAACGTCTCATCATACAACTCCACTCTCAGCTAGGCAACAAATGGGCTCGCATGGCT 
GCTCAGTTACCAGGCAGAACAGATAACGAGATCAAGAACTACTGGAACACGAGGTTGAAA 
CGCTTCCAACGCCAAGGCCTCCCTCTCTACCCTCCAGAATATTCCCAAAACAATCATCAA 
CAACAAATGTATCCTCAACAGCCCTCCTCACCTCTCCCGTCCCAAACACCTGCTTCTTCC 
TTTACCTTTCCTCTCCTCCAACCGCCTTCTCTGTGTCCCAAACGTTGTTATAACACTGCC 
TTCTCTCCCAAGGCCTCATATATTTCTTCTCCAACCAATTTCCTTGTCTCGTCTCCGACC 
TTTCTTCACACCCATTCCTCTCTTTCCTCCTATCAGTCTACCAATCCGGTTTACTCCATG 
AAACATGAGCTCTCTTCAAACCAAATTCCATACTCTGCCTCTTTAGGAGTCTATCAAGTA 
AGCAAGTTCTCAGACAATGGGGATTGTAACCAAAACCTGAACACCGGTTTGCATACAAAT 
ACCTGTCAGCTGTTAGAGGATCTTATGGAGGAGGCCGAGGCTCTAGCTGATAGCTTTCGT 
GCTCCTAAGCGGAGACAAATCATGGCTGCGCTTGAGGACAACAACAACAACAACAACTTT 
TTCTCGGGAGGTTTCGGACATCGTGTTTCTTCCAACAGTCTATGTTCCTTGCAAGGTTTA 
ACACCAAAGGAAGATGAGTCTCTCCAGATGAACACAATGCAAGATGAGGACATAACAAAG 
CTTCTTGACTGGGGAAGTGAAAGTGAAGAAATCTCAAACGGGCAATCCTCTGTGATAACA 
ACAGAGAACAACCTTGTCCTTGACGATCACCAGTTCGCTTTTCTGTTTCCAGTTGATGAT 
GACACCAACAACTTGCCAGGGATCTGCTAG 

>G1786 Amino Acid Sequence (domain in AA coordinates: TBD) 

MIVYGGGASEDGEGGGVVLKKGPWTVAEDETLAAYVREYGEGNWNSVQKKTWLARCGKSC

RLRWANHLRPNLRKGSFTPEEERLIIQLHSQLGNKWARMAAQLPGRTDNEIKNYWNTRLK 

RFQRQGLPLYPPEYSQNNHQQQMYPQQPSSPLPSQTPASSFTFPLLQPPSLCPKRCYNTA 

FSPKASYISSPTNFLVSSPTFLHTHSSLSSYQSTNPVYSMKHELSSNQIPYSASLGVYQV 

SKFSDNGDCNQNLNTGLHTNTCQLLEDLMEEAEALADSFRAPKRRQIMAALEDNNNNNNF

FSGGFGHRVSSNSLCSLQGLTPKEDESLQMNTMQDEDITKLLDWGSESEEISNGQSSVIT
TENNLVLDDHQFAFLFPVDDDTNNLPGIC*
>G1792 (77..496)

AATCCATAGATCTCTTATTAAATAACAGTGCTGACCAAGCTCTTACAAAGCAAACCAATC 
TAGAACACCAAAGTTAATGGAGAGCTCAAACAGGAGCAGCAACAACCAATCACAAGATGA
CAAGCAAGCTCGTTTCCGGGGAGTTCGAAGAAGGCCTTGGGGAAAGTTTGCAGCAGAGAT 
TCGAGACCCGTCGAGAAACGGTGCCCGTCTTTGGCTCGGGACATTTGAGACCGCTGAGGA 
GGCAGCAAGGGCTTATGACCGAGCAGCCTTTAACCTTAGGGGTCATCTCGCTATACTCAA 
CTTCCCTAATGAGTATTATCCACGTATGGACGACTACTCGCTTCGCCCTCCTTATGCTTC 
TTCTTCTTCGTCGTCGTCATCGGGTTCAACTTCTACTAATGTGAGTCGACAAAACCAAAG 
AGAAGTTTTCGAGTTTGAGTATTTGGACGATAAGGTTCTTGAAGAACTTCTTGATTCAGA 
AGAAAGGAAGAGATAATCACGATTAGTTTTGTTTTGATATTTTATGTGGCACTGTTGTGG 
CTACCTACGTGCATTATGTGCATGTATAGGTCGCTTGATTAGTACTTTATAACATGCATG 
CCACGACCATAAATTGTAAGAGAAGACGTACTTTGCGTTTTCATGAAATATGAATGTTAG 
ATGGTTTGAGTACAAAAAAAAAAAAAAAAAAAAAAA 

>G1792 Amino Acid Sequence (domain in AA coordinates: 17-85)
MESSNRSSNNQSQDDKQARFRGVRRRPWGKFAAEIRDPSRNGARLWLGTFETAEEAARAY 
DRAAFNLRGHLAILNFPNEYYPRMDDYSLRPPYASSSSSSSSGSTSTNVSRQNQREVFEF
EYLDDKVLEELLDSEERKR*
>G1865 (48..899)

AAGAAGAGGACATGAAGCACAGAGATTCTGCAGACTGCAGGTGACGAATGGACACTTTAT 

CAATAAAAACATACCTACTACTCTCTTACACTTTC^ 

TTAATCTCTCTTTCTTCTTCATCTCTCTTTCTCTTTCTCTCTTCATGG 

CATTCACAGAATCACAATGGGAAGAACTTGAAAACCAAGCTCTTGTGTTCAAGTACTTAG 

CTGCAAATATGCCTGTTCCACCTCATCTTCTCTTCCTCATCAAAAGACCCTTTCTCTTCT 

TTGGGTGGAATGTGTATGAGATGGGAATGGGAAGAAAGATAGATGCAGAGCCAGGAAGAT 
GTAGAAGAACTGATGGCAAGAAATGGAGATGCTCTAAAGAAGCTTACCCTGACTCTAAGT 
ACTGTGAGAGACATATGCATAGAGGCAAGAACCGTTCTTCCTCAAGAAAGCCTCCTCCTA 
CTCAATTCACTCCAAATCTCTTTCTCGACTCTTCTTCCAGAAGAAGAAGAAGTGGATACA 
TGGATGATTTCTTCTCCATAGAACCTTCCGGGTCAATCAAAAGCTGCTCTGGCTCAGCAA 
TGGAAGATAATGATGATGGCTCATGTAGAGGCATCAACAACGAGGAGAAGCAGCCGGATC

GACATTGCTTCATCCTTGGTACTGACTTGAGGACACGTGAGAGGCCATTGATGTTAGAGG 

AGAAGCTGAAACAAAGAGATCATGATAATGAAGAAGAGCAAGGAAGCAAGAGGTTTTATA 

GGTTTCTTGATGAATGGCCTTCTTCTAAATCTTCTGTTTCTACTTCACTCTTCATTTGAT 

CATCTTTTGTTCTTATAACCTTGTATTTCTTGTTAAGATGGTAATGCAAATT 

>G1865 Amino Acid Sequence (domain in AA coordinates: 124-149) 

MDTLSIKTYLLLSYTFNFPIQIPIFNLSFFFISLSLSLFMATRIPFTESQWEELENQALV 

FKYLAANMPVPPHLLFLIKRPFLFSSSSSSSSSSSFFSPTLSPHFGWNVYEMGMGRKIDA

EPGRCRRTDGKKWRCSKEAYPDSKYCERHMHRGKNRSSSRKPPPTQFTPNLFLDSSSRRR 

RSGYMDDFFSIEPSGSIKSCSGSAMEDNDDGSCRGINNEEKQPDRHCFILGTDLRTRERP 

LMLEEKLKQRDHDNEEEQGSKRFYRFLDEWPSSKSSVSTSLFI* 

>G1886 (43..909)

AGGAAACATAAGTAATCGTTGCTTCGATCCTTTGTTACATGGATGGATCCTGAACAGGAA 
ATCTCAAACGAGACTTTGGAAACTATATTGGTAAGTTCAACAAAAGGAAGCAATAATAAC 
AATAAGAAAATGGAAGAAGAAATGAAGAAGAAAGTATCAAGAGGAGAATTAGGAGGTGAA 
GCTCAAAATTGTCCAAGATGTGAATCTCCAAACACAAAGTTTTGTTACTACAACAACTAT 
AGTCTCTCACAACCTCGTTACTTCTGCAAATCTTGTCGGAGATATTGGACTAAAGGCGGT 
ACTCTTCGTAACGTTCCCGTCGGTGGTGGTTGCCGTCGAAACAAACGATCCTCTTCCTCA 
GCTTTCTCCAAGAACAACAACAATAAGTCTATTiVATTTCCATACTGATCCACTTCAGAAC 
CCTTTAATTACGGGAATGCCACCATCATCTTTTGGTTATGATCACTCCATTGATCTCAAC 
CTCGCTTTCGCTACTCTCCAAAAGCATCATTTATCCTCTCAAGCTACTACGCCTTCTTTT 
GGGTTTGGAGGTGATCTTTCTATTTATGGAAACTCAACGAATGATGTAGGGATCTTCGGA 
GGGCAAAACGGTACTTATAACAATAGTTTGTGTTATGGGTTTATGTCCGGAAATGGTAAT 
AATAATCAAAATGAAATCAAGATGGCTTCTACATTGGGGATGTCTTTGGAAGGAAACGAG 
AGAAAGCAAGAGAATGTGAACAATAACAATAATAACTCAGAGAATCCTAGCAAGGTGTTC 
TGGGGGTTTCCATGGCAGATGACCGGAGATTCCGCCGGAGTTGTACCGGAGATTGATCCC 
GGAAGGGAAAGCTGGAATGGGATGGTTTCATCTTGGAATAATGGTTTACTCAACACTCCT 
TTGGTCTAGCAGATCATTAA 

>G1886 Amino Acid Sequence (domain in aa coordinates: 17-59) 

MDPEQEISNETLETILVSSTKGSNNNNKKMEEEMKKKVSRGELGGEAQNCPRCESPNTKF

CYYNNYSLSQPRYFCKSCRRYWTKGGTLRNVPVGGGCRRNKRSSSSAFSKNNNN 

TDPLQNPLITGMPPSSFGYDHSIDLNLAFATLQKHHLSSQATTPSFGFGGDLSIYGNSTN

DVGIFGGQNGTYNNSLCYGFMSGNGNNNQNEIKMASTLGMSLEGNERKQENVNNNNNNSE

NPSKVFWGFPWQMTGDSAGVVPEIDPGRESWNGMVSSWNNGLLNTPLV*

>G1933 (33..1418)

AATTGAGATTAAAGTAATTTATCTTTCAGAAAATGGCGGTTGAAGACGATGTATCTTTGA 
TAAGAACGACGACGTTAGTGGCACCAACAAGACCCACGATTACAGTTCCTCATAGACCTC 
CGGCGATCGAAACGGCGGCGTATTTCTTTGGCGGTGGAGATGGGCTTAGTCTAAGCCCAG 
GGCCACTTTCTTTTGTCTCTTCTTTGTTTGTTGATAACTTCCCTGACGTCTTGACGCCGG 
ATAACCAACGGACGACGTCGTTTACTCAGCTTCTTAACGGAACTATGTCGGTGTCTCCTG 
GTGGCGGAGGACGTTCAACGGCGGGGATGTTCGCCGGAGGAGGTCCGATGTTTACAATCC 
CTTCTGGTTTCAGCCCTTCTAGTCTTCTCACCTCGCCCATGTTCTTTCCCCCGCAGTCGT 
CAGCTCATACCGGCTTTATTCAACCACGGCAGCAGTCACAACCGCAACCACAACGACCAG 
ACACGTTTCCTCACCATATGCCACCATCGACATCCGTCGCCGTCCATGGTCGTCAATCTT 
TAGACGTTTCACAAGTAGATCAAAGAGCTCGAAACCATTATAATAATCCGGGGAATAACA
ATAATAACCGGTCGTATAACGTTGTGAACGTTGATA^CCGGCGGATGACGGTTATAACT 
GGAGGAAGTACGGACAAAAGCCTATCAAAGGGTGTGAATATCCAAGGAGTTATTACAAAT 
GTACACATGTTAACTCTCCGGTGAAGAAGAAAGTCGAACGGTCATCGGATGGACAGATCA 
CTCAGATCATTTACAAAGGTCAACATGATCACGAGAGGCCTCAGAATCGCCGTGGCGGTG 
GAGGCAGAGATTCCAGTGAGGTTGGTGGTGCAGGGCAAATGATGGAATCTAGTGATGATA 
GTGGTTATCGTAAGGATCATGATGATGATGATGATGATGATGAAGATGATGAAGATCTTC 
CGGCTTCAAAGATAAGAAGAATAGACGGTGTGTCGACGACTCACCGGACGGTGACCGAGC 
CTAAGATTATCGTTCAGACAAAAAGTGAAGTCGATCTTCTCGACGATGGCTATAGGTGGC 
GTAAGTACGGACAAAAAGTTGTCAAAGGAAATCCCCATCCAAGGAGCTATTATAAATGTA 
CAACGCCAAATTGTACGGTCCGTAAACATGTAGAGAGAGCTTCCACGGATGCTAAGGCTG 
TGATTACAACTTACGAAGGTAAACACAATCACGATGTCCCTGCCGCTAGAAACGGTACCG 
CGGCAGCAACCGCAGCTGCGGTGGGGCCGTCTGACCACCATCGTATGAGATCAATGTCGG 
GGAACAATATGCAACAACATATGAGTTTCGGTAACAATAATAACACAGGCCAATCTCCGG

TTCTTTTGAGGTTGAAAGAAGAGAAAATCACAATTTGACTTTTAAGAACCAAAGATTTCG 
AGATTGATATT 

>G1933 Amino Acid Sequence (conserved domain in AA coordinates: 205-263, 344-404)

MAVEDDVSLIRTTTLVAPTRPTITVPHRPPAIETAAYFFGGGDGLSLSPGPLSFVSSLFV

DNFPDVLTPDNQRTTSFTQLLNGTMSVSPGGGGRSTAGMFAGGGPMFTIPSGFSPSSLLT 

SPMFFPPQSSAHTGFIQPRQQSQPQPQRPDTFPHHMPPSTSVAVHGRQSLDVSQVDQRAR 
NHYlWPGNNNNNRSYim/im^ 

VERSSDGQITQIIYKGQHDHERPQNRRGGGGRDSTEVGGAGQMMESSDDSGYRKDHDDDD

DDDEDDEDLPASKIRRIDGVSTTHRTVTEPKIIVQTKSEVDLLDDGYRWRKYGQKVVKGN

PHPRSYYKCTTPNCTVRKHVERASTDAKAVITTYEGKHNHDVPAARNGTAAATAAAVGPS

DHHRMRSMSGNNMQQHMSFGNNNNTGQSPVLLRLKEEKITI*

>G2059 (58..1089)

TTAAGAACAGGCTTCATTCTCTGGACAAACACTCAAAAAACAAACAAAAAAAGGAACATG 

GAAGATCAGTTTCCTAAAATAGAAACTAGCTTCATGCACGACAAGCTCTTGTCTTCTGGA 

ATCTACGGGTTCTTGAGTTCTTCGACGCCGCCACAACTTCTCGGTGTTCCAATATTTTTG 

GAAGGTATGAAATCTCCTCTTCTTCCTGCTTCTTCGACTCCGAGCTACTTTGTGTCGCCT 

CATGATCATGAGCTCACATCTTCTATTCATCCATCTCCGGTAGCTTCTGTTCCTTGGAAC 

TTTCTAGAATCTTTTCCTCAGTCTCAACATCCTGATCATCATCCTTCTAAACCTCCAAAC 

CTTACTTTGTTCCTTAAAGAACCAAAGCTACTAGAACTTTCTCAATCCGAAAGCAACATG 

AGCCCTTACCATAAATACATCCCAAACTCCTTTTATCAATCAGACCAAAACAGAAACGAA 

TGGGTAGAGATCAATAAAACTCTAACCAACTATCCCTCGAAAGGTTTTGGAAACTATTGG 

CTAAGTACCACCAAGACTCAACCCATGAAGTCAAAAACAAGAAAGGTTGTTCAGACGACG 

ACCCCAACAAAACTGTATAGAGGAGTGAGACAAAGACACTGGGGCAAATGGGTCGCAGAG

ATTAGGCTTCCAAGGAACAGAACCCGTGTTTGGCTCGGCACTTTTGAAACCGCTGAGCAA 

GCAGCAATGGCTTACGATACAGCAGCTTATATCCTTCGTGGCGAATTCGCACACCTCAAC 

TTTCCTGATCTTAAACACCAGCTCAAGTCCGGTTCTTTGCGATGCATGATCGCCTCACTT

TTGGAGTCCAAGATTCAACAGATCTCATCTTCCCAAGTAAGTAACTCTCCTTCTCCTCCT 

CCTCCAAAAGTGGGAACACCGGAGCAAAAGAATCATCACATGAAGATGGAGTCAGGAGAA 

GACGTGATGATGAAGAAACAGAAAAGCCATAAGGAAGTGATGGAAGGAGATGGTGTACAA 

TTGAGTAGGATGCCTTCTTTGGATATGGATCTCATTTGGGATGCTCTCTCATTTCCTCAT 

TCTTCTTGACTTCAAATTAATATTTGTCAMCTTATTTTACTTACTTCTACCCTTTTTTA 

TATCAAAAGTTTCCACCAAAGAAAGAAATTCATATTATGATGCCAAGATTGGTTTGCATT 

TGGGGTTGAACACATTGTAATTCTTCTTACGACCACATAATCAAGTGGTTCTCCTTTTTT 
TGTCTGCTAA 

>G2059 Amino Acid Sequence (conserved domain in AA coordinates: 184-254)
MEDQFPKIETSFMHDKLLSSGIYGFLSSSTPPQLLGVPIFLEGMKSPLLPASSTPSYFVS

PHDHELTSSIHPSPVASVPWNFLESFPQSQHPDHHPSKPPNLTLFLKEPKLLELSQSESN 
MSPYHKYIPNSFYQSDQNRNEWVEINKTLTNYPSKGFGNYWLSTTKTQPMKSKTRKVVQT

TTPTKLYRGVRQRHWGKWVAEIRLPRNRTRVWLGTFETAEQAAMAYDTAAYILRGEFAHL
NFPDLKHQLKSGSLRCMIASLLESKIQQISSSQVSNSPSPPPPKVGTPEQKNHHMKMESG
EDVMMKKQKSHKEVMEGDGVQLSRMPSLDMDLIWDALSFPHSS*
>G2105 (42..1487)

CTCTCTGACTTGAACTCTTCT 

ATCCACAGTACGGTATAGAACAACCATCTTCTCAATTCTCCTCTGATCTC 

ACCTCGTTTCAGCGCCGGACCAGCACCATCGTCTTCATTTCACCGACCATGAGATAAGTT

TATTGCCACGTGGAATACAAGGGCTTACGGTGGCTGGAAACAACAGTAACACTATTACAA 

CGATCCAGAGTGGTGGCTGTGTTGGTGGGTTTAGTGGCTTTACGGACGGCGGAGGAACAG 

GGAGGTGGCCGAGGCAAGAGACGTTGATGTTGTTGGAGGTCAGATCTCGTCTTGATCACA 

AGTTCAAAGAAGCTAATCAAAAGGGTCCTCTCTGGGATGAAGTTTCTAGGATTATGTCGG

AGGAACATGGATACACTAGGAGTGGCAAGAAGTGTAGAGAGAAGTTCGAGAATCTCTACA 

AGTACTATAAAAAAACAAAAGAAGGCAAATCCGGTCGGCGACAAGATGGTAAAAACTATA 

GATTTTTCCGGCAGCTTGAAGCGATATACGGCGAATCCAAAGACTCGGTTTCTTC 

ACAACACGCAGTTCATAATGACCAATGCTCTTCATAGTAATTTCCGCGCTTCTAACATTC 

ATAACATCGTCCCTCATCATCAGAATCCCTTGATGACCAATACCAATACTCAAAGTCAAA 

GCCTTAGCATTTCTAACAATTTCAACTCCTCCTCCGATTTGGATCTAACTTCTTCCTCTG 

AAGGAAACGAAACTACTAAAAGAGAGGGGATGCATTGGAAGGAAAAGATCAAGGAATTCA 
TTGGTGTTCATATGGAGAGGTTGATAGAGAAGCAAGATTTTTGGCTTGAGAAGTTGATGA
AGATTGTGGAAGACAAAGAACATCAAAGGATGCTGAGAGAAGAGGAATGGAGAAGGATTG 
AAGCGGAAAGGATCGATAAGGAACGTTCGTTTTGGACAAAAGAGAGGGAGAGGATTGAAG 
CTCGGGATGTTGCGGTGATTAATGCCTTGCAGTACTTGACGGGAAGGGCATTGATAAGGC 
CGGATTCTTCGTCTCCTACAGAGAGGATTAATGGGAATGGAAGCGATAAAATGATGGCTG 
ATAATGAATTTGCTGATGAAGGAAATAAGGGCAAGATGGATAAAAAACAAATGAATAAGA 
AAAGGAAGGAGAAATGGTCAAGCCACGGAGGGAATCATCCAAGAACCAAAGAGAATATGA 
TGATATACAACAATCAAGAAACTAAGATTAATGATTTTTGTCGAGATGATGACCAATGCC 
ATCATGAAGGTTACTCACCTTCAAACTCCAAGAACGCAGGAACTCCGAGCTGCAGCAATG
CCATGGCAGCTAGTACAAAGTGCTTTCCATTGCTTGAAGGAGAAGGAGATCAGAACTTGT 
GGGAGGGTTATGGTTTGAAGCAAAGGAAAGAAAATAATCATCAGTAAGCTACATTTTTCA 
TTCTCAAAATGAAGAATAAGAGAACTTAGAAACGAT 

>G2105 Amino Acid Sequence (domain in AA coordinates: 100-153) 

MEDHQNHPQYGIEQPSSQFSSDLFGFNLVSAPDQHHRLHFTDHEISLLPRGIQGLTVAGN 

NSNTITTIQSGGCVGGFSGFTDGGGTGRWPRQETLMLLEVRSRLDHKFKEANQKGPLWDE 

VSRIMSEEHGYTRSGKKCREKFENLYKYYKKTKEGKSGRRQDGKNYRFFRQLEAIYGESK
DSVSCYNNTQFIMTNALHSNFRASNIHNIVPHHQNPLMTNTNTQSQSLSISNNFNSSSDL
DLTSSSEGNETTKREGMHWKEKIKEFIGVHMERLIEKQDFWLEKLMKIVEDKEHQRMLRE
EEWRRIEAERIDKERSFWTKERERIEARDVAVINALQYLTGRALIRPDSSSPTERINGNG
SDKMMADNEFADEGNKGKMDKKQMNKKRKEKWSSHGGNHPRTKENMMIYNNQETKINDFC

RDDDQCHHEGYSPSNSKNAGTPSCSNAMAASTKCFPLLEGEGDQNLWEGYGLKQRKENNH

Q*

>G2117 (49..465)

GTCTATAACCTTCCAAGTCAAAACCCTAATCCACAGTCTTTATTCCAAATCTTTGTTGAT
CGAGTACCACTTTCAAACTTGCCTGCCACGTCAGACGACTCTAGCCGGACTGCAGAAGAT 
AATGAGAGGAAGCGGAGAAGGAAGGTATCGAACCGCGAGTCAGCTCGGAGATCGCGTATG 
CGGAAACAGCGTCACATGGAAGAACTOTGGTCCATG(^GTTCAACTCATCAATAAGAAC 
AAATCTCTAGTCGATGAGCTAAGCCAAGCCAGGGAATGTTACGAGAAGGTTATAGAAGAG 
AACATGAAACTTCGAGAGGAAAACTCCAAGTCGAGGAAGATGATTGGTGAGATCGGGCTT 
AATAGGTTTCTTAGCGTAGAGGCCGATCAGATCTGGACCTTCTAATCGTCTCGTAAGCTT 
GTTGGTTTTTTGTTGTTTATTTAAAG 

>G2117 Amino Acid Sequence (conserved domain in AA coordinates: 46-106)

MAGSVYNLPSQNPNPQSLFQIFVDRVPLSNLPATSDDSSRTAEDNERKRRRKVSNRESAR 

RSRMRKQRHMEELWSMLVQLINKNKSLVDELSQARECYEKVIEENMKLREENSKSRKMIG

EIGLNRFLSVEADQIWTF* 

>G2124 (87..923)

GAACAGCAAAACCCTAGATTTCCTGTTCAAGCTCAAGACCGTACAAAACTTTGGAACTCA 
TATATAAAGATCTCGAGAATAGCATTATGAATATCGTCTCTTGGAAAGATGCAAACGACG 
AAGTTGCAGGCGGCGCTACGACAAGACGTGAAAGAGAAGTAAAAGAGGATCAAGAAGAAA 
CCGAAGTCAGAGC(^CCAGTGGCAAAACCGTAATTAAAAAGCAGCCTACATCGATCTCTT 
CTTCTTCTTCTTCGTGGATGAAATCCAAGGATCCGAGGATTGTTAGGGTTTCACGCGCCT 
TTGGAGGCAAAGACCGTCACAGCAAAGTGTGTACGTTACGTGGACTACGTGACAGACGCG 
TGAGATTATCAGTCCCAACGGCTATTCAGCTCTACGATCTTCAAGAACGGCTCGGTGTTG 
ACCAGCCTAGCATU^GCCGTTGACTGGTTGCTTGATGCAGCTAAAGAGGAGATCGACGAGC 
TACCTCCGTTACCTATCTCGCCGGAAAATTTC^ 

TGAATCTTGGTCAACGGCCCGGTCAAGATCCGACCCAACTCGGGTTTAAAATCAATGGAT 
GTGTACAAAAGTCTACTACTACTAGCCGCGAAGAAAACGATAGAGAGAAAGGAGAAAACG 
ATGTCGTTTACACAAACAATCATCATGTTGGGTCTTATGGAACTTATCACAACCTGGAAC 
ATC^TCATCATCATCACCAACATTTGAGTTTACAGGC^GATTATCATAGTmTCAACTAC 
ATAGTCTTGTCCC^TTTCCATCACAAATTTTGGTATGTCCAATGACGACATC^CCAAC^ 
CTACAAOTATACAATCTTTGTTTCCATC^TCATCGTCAGCTGGTTCAGGGACTATGGAGA 
CATTAGATCCGAGGCAAATGTAGCAACAATGGTGGTAGAGACATTGATAATCGGATGTCG 
TCGGTCCAATTCAACCGAACTAATAGCACTACAACGGCTAACATGTCGAGGCATCTAGGC 
TCGGAGCGTTGTACAAGTAGAGGAAGTGATCACCATATGTGAAGTTAGATTATTGAAACG 
ATATAATTGTTGTTTGATGTGTTCAGAAATAAGGGGACAC 

>G2124 Amino Acid Sequence (domain in AA coordinates: 75-132) 






MNIVSWKDANDEVAGGATTRREREVKEDQEETEVRATSGKTVIKKQPTSISSSSSSWMKS

KDPRIVRVSRAFGGKDRHSKVCTLRGLRDRRVRLSVPTAIQLYDLQERLGVDQPSKAVDW

LLDAAKEEIDELPPLPISPENFSIFNHHQSFLNLGQRPGQDPTQLGFKINGCVQKSTTTS

REENDREKGENDVVYTNNHHVGSYGTYHNLEHHHHHHQHLSLQADYHSHQLHSLVPFPSQ

ILVCPMTTSPTTTTIQSLFPSSSSAGSGTMETLDPRQM* 

>G2140 (148..1254)

ACTCTCTTAACTTTCGTTTCTTCTCCTACCTTCTTTTACCAACCTTTCCTTTCTCTTACA 

CACATATATATATACATATATAGAGAGAGAGAAGAGGACAAAGAGTTGAAAGATGAAGAC 

TCTCATGTCTTCATAGAAACAAGTGATATGTGCGCTAAGAAAGAAGAAGAAGAAGAAGAA 

GAAGAAGACAGTTCTGAAGCCATGAACAACATACAAAATTACCAAAATGACCTCTTCTTT 

CACCAACTCATCTCTCATCATCACCATCATCATCATGATCCTTCTCAATCTGAAACTTTG 

GGAGCATCCGGTAACGTTGGATCTGGTTTCACTATCTTCTCTCAAGATTCCGTCTCTCCA 

ATATGGTCTCTACCTCCACCTACCTCGATCCAACCACCATTTGATCAGTTTCCTCCTCCT 

TCTTCTTCTCCAGCATCTTTCTACGGAAGTTTCTTCAACAGAAGTCGAGCTCATCATCAG 

GGATTACAGTTTGGGTACGAGGGTTTTGGTGGAGCCACGTCAGCAGCACATCATCATCAT 

GAACAACTTCGGATCTTGTCGGAAGCTTTAGGTCCGGTAGTACAAGCCGGGTCCGGTCCT 

TTTGGGTTACAAGCTGAGTTAGGGAAGATGACAGCACAAGAGATCATGGACGCTAAAGCT 

TTGGCTGCTTCAAAGAGTCATAGTGAAGCTGAGAGAAGAAGAAGAGAGAGAATCAATAAT 

CATCTCGCTAAGCTCCGTAGCATATTACCCAACACCACCAAAACGGATAAAGCGTCGTTA 

CTAGCTGAAGTGATCCAACATGTGAAAGAGTTGAAGAGAGAGACTTCAGTGATCTCAGAG 

ACAAATCTTGTCCCAACGGAAAGCGATGAGTTAACGGTAGCTTTCACGGAGGAGGAAGAA 

ACCGGAGATGGCAGATTTGTAATTAAAGCGTCGCTTTGCTGTGAAGACAGGTCGGATCTC 

TTGCCTGACATGATTAAAACATTGAAAGCTATGCGTCTCAAAACGCTCAAGGCGGAGATA

ACCACCGTTGGGGGACGAGTCAAGAACGTTTTGTTTGTTACCGGAGAAGAGAGCTCCGGT 

GAGGAAGTGGAGGAAGAGTACTGTATAGGGACGATTGAGGAAGCTTTGAAAGCGGTGATG 

GAGAAGAGCAATGTAGAGGAATCATCTTCTTCTGGAAATGCTAAGAGACAGAGAATGAGT 

AGTCACAACACTATCACTATCGTCGAACAACAACAACAATATAATCAGAGGTAATCAATT

TTTTACTTAAATCGCTTTTTTTTCTTACTTTCGGTGTATCTACTACGTGTGTTGTTTGCT 

GGTTATGGAAATGAATGTTGTACGTCACGTTATACTATAGATATATGTGTGTTTGTGTGT 

ATGTATAACGGAAGTATTTGTATCCGTTGTGGTCTTGGACTTTTGGTTTGGTTCTAAGAT 

ACTTATTTTTAAAAACTTGTATCGTTGAGTTGGTTTTCTAGATATGCTTAATGGGAGTAT 
GTGACGAAAAAAAA 

>G2140 Amino Acid Sequence (domain in AA coordinates: 167-242)

MCAKKEEEEEEEEDSSEAMNNIQNYQNDLFFHQLISHHHHHHHDPSQSETLGASGNVGSG 

FTIFSQDSVSPIWSLPPPTSIQPPFDQFPPPSSSPASFYGSFFNRSRAHHQGLQFGYEGF 

GGATSAAHHHHEQLRILSEALGPVVQAGSGPFGLQAELGKMTAQEIMDAKALAASKSHSE

AERRRRERINNHLAKLRSILPNTTKTDKASLLAEVIQHVKELKRETSVISETNLVPTESD 

ELTVAFTEEEETGDGRFVIKASLCCEDRSDLLPDMIKTLKAMRLKTLKAEITTVGGRVKN 

VLFVTGEESSGEEVEEEYCIGTIEEALKAVMEKSNVEESSSSGNAKRQRMSSHNTITIVE 
QQQQYNQR* 

>G2144 (102..1241)

ATTAGGGTTTTGTTGTCGTGAGATTTGATTACAC7UUITTGCTGAATTTGGTTTCGATTAT 
TGGTGTTATTGTTTTCGAAGATTTCCAGTGAGTTTCCGTTTATGGATCTGACTGGAGGAT 
TTGGAGCTAGATCCGGCGGTGTTGGACCGTGCCGGGAACCAATAGGCCTTGAATCGCTAC 
ATCTCGGTGACGAATTTCGGCAACTAGTGACGACTTTACCTCCCGAGAACCCCGGCGGTT 
CGTTCACGGCTTTGCTTGAGCTTCCACCTACACAAGCAGTGGAGCTTCTCCATTTCACTG 
ATTCTTCGTCTTCTCAACAAGCGGCAGTGACAGGGATCGGTGGAGAGATTCCTCCGCCGC 
TTCACTCTTTCGGTGGGACATTGGCTTTTCCTTCTAACTCAGTTCTCATGGAGCGAGCAG 
CTCGTTTCTCGGTGATTGCCACTGAGCAACAAAACGGAAATATCTCCGGGGAGACTCCGA 
CGAGCTCTGTACCTTCCAATTCAAGTGCTAATCTCGACAGAGTCAAGACGGAGCCTGCTG 
AGACCGATTCATCTCAGCGGTTGATTTCTGATTCAGCGATTGAGAATCAAATCCCTTGCC 
CTAACCAGAACAATCGAAATGGGAAGAGGAAAGATTTCGAAAAGAAGGGTAAAAGCTCGA
CGAAGAAGAACAAAAGCTCTGAAGAGAACGAGAAGCTGCCATATGTTCACGTTAGAGCTC
GTCGTGGTCAAGCAACCGATAGCCATAGCTTAGCAGAACGAGCAAGAAGAGAGAAGATAA 
ATGCACGAATGAAGCTGTTACAGGAACTGGTCCCAGGCTGTGATAAGATTCAAGGTACCG 
CGCTGGTGCTGGATGAAATCATTAACCATGTCCAGTCATTACAACGTCAAGTGGAGATGC 
TATCTATGAGACTTGCTGCGGTAAACCCCAGAATCGACTTCAATCTCGACACCATATTGG
CTTCAGAAAACGGTTCTTTAATGGATGGGAGCTTCAATGCCGCACCAATGCAGCTTGCTT
GGCCTCAGCAAGCCATTGAGACCGAACAGTCCTTTCATCACCGGCAACTGCAACAACCAC 
CAACACAACAATGGCCTTTTGACGGCTTGAACCAGCCGGTATGGGGAAGAGAAGAGGATC 
AAGCTCATGGCAATGATAACAGCAATTTGATGGCAGTTTCTGAAAATGTAATGGTGGCTT 
CTGCTAATTTGCACCCAAATCAGGTCAAAATGGAGCTGTAAGTTGGGAAAACGGTAGAGA
TCATGAATGTGTATATACATCGTATAAGCTCGTTTCTCTCTATATAAATATAATCATAAA 
TATAGATATCTGTTAAGAAGGTATCAGTCATTTGATTCAGAGAGACAACACTGGTATGAT 
TGTTTCTTATTCTTGTACCAGATTTCGACAATGTAGAATTTAGTAGGATATGATCATTTT 
GATCTCGTTATATATA 

>G2144 Amino Acid Sequence (domain in AA coordinates: 203-283)

MDLTGGFGARSGGVGPCREPIGLESLHLGDEFRQLVTTLPPENPGGSFTALLELPPTQAV 

ELLHFTDSSSSQQAAVTGIGGEIPPPLHSFGGTLAFPSNSVLMERAARFSVIATEQQNGN 

ISGETPTSSVPSNSSANLDRVKTEPAETDSSQRLISDSAIENQIPCPNQNNRNGKRKDFE 

KKGKSSTKKNKSSEENEKLPYVHVRARRGQATDSHSLAERARREKINARMKLLQELVPGC 

DKIQGTALVLDEIINHVQSLQRQVEMLSMRLAAVNPRIDFNLDTILASENGSLMDGSFNA 

APMQLAWPQQAIETEQSFHHRQLQQPPTQQWPFDGLNQPVWGREEDQAHGNDNSNLMAVS 

ENVMVASANLHPNQVKMEL*

>G2431 (47..1057)

CCCTTTCGTTTTTATTTAAATTTCTTGGGTCGTTTCTTAAATTTGTATGTGTTTATTAAT 
GGAGATCAACAATAATGCCAACAATACTAATACTACTATTGATAATCACAAGGCAAAGAT
GAGCCTTGTGTTGTCAACGGATGCTAAGCCAAGGTTGAAATGGACTTGTGATCTTCATCA 
CAAATTCATCGAAGCCGTTAATCAACTTGGAGGACCTAACAAAGCAACACCTAAGGGTTT 
GATGAAGGTTATGGAGATTCCTGGGCTTACCTTATACCATCTCAAGAGCCATTTACAGAA 
ATATCGGTTAGGGAAGAGCATGAAGTTCGATGATAACAAGCTAGAAGTTTCCTCTGCATC 
AGAGAATCAAGAAGTTGAGAGTAAAAACGATTCAAGAGATCTCCGAGGCTGCAGTGTCAC 
CGAAGAAAACAGCAATCCAGCTAAAGAAGGGCTACAAATCACAGAGGCTTTACAAATGCA 
GATGGAAGTTCAGAAGAAACTTCATGAACAAATCGAAGTTCAGAGGCATTTGCAGGTGAA 
GATTGAGGCACAAGGAAAGTATCTACAGTCCGTTTTAATGAAAGCTCAACAAACTCTCGC 
TGGCTACTCATCTTCAAATCTCGGCATGGATTTTGCGAGGACCGAGCTCTCTAGATTAGC 
TTCAATGGTGAACAGAGGCTGTCCAAGCACTTCGTTCTCAGAGCTAACGCAAGTAGAAGA 
AGAAGAAGAAGGTTTCTTGTGGTACAAGAAACCAGAAAACAGAGGAATTAGTCAGCTGAG 
ATGTTCAGTAGAGAGCTCGTTGACATCTTCAGAGACCTCAGAGACAAAACTGGATACTGA 
CAATAACCTTAATAAATCGATTGAACTTCCGTTGATGGAGATCAACTCGGAAGTGATGAA 
GGGGAAGAAGAGAAGCATAAACGACGTCGTTTGCGTGGAGCAGCCTCTAATGAAGAGAGC 
TTTTGGAGTTGATGATGATGAGCATTTGAAGTTGAGTTTGAATACTTACAAGAAAGACAT 
GGAGGCGTGTACGAACATAGGACTAGGGTTTAATTAAAAAAAAAACATTTTACTAAAGTT 
ATATAAAAATGTTTTAAAAGAATCCA 

>G2431 Amino Acid Sequence (conserved domain in AA coordinates: 38-88)

MCLLMEINNNANNTNTTIDNHKAKMSLVLSTDAKPRLKWTCDLHHKFIEAVNQLGGPNKA

TPKGLMKVMEIPGLTLYHLKSHLQKYRLGKSMKFDDNKLEVSSASENQEVESKNDSRDLR

GCSVTEENSNPAKEGLQITEALQMQMEVQKKLHEQIEVQRHLQVKIEAQGKYLQSVLMKA

QQTLAGYSSSNLGMDFARTELSRLASMVNRGCPSTSFSELTQVEEEEEGFLWYKKPENRG

ISQLRCSVESSLTSSETSETKLDTDNNLNKSIELPLMEINSEVMKGKKRSINDVVCVEQP

LMKRAFGVDDDEHLKLSLNTYKKDMEACTNIGLGFN* 

>G2465 (86..1150)

CAATATTCTTTCTCCATTGAGATTAAGCTTCTTTCTCGCTGTCGTCTCTCTATAGATCTT 
GGTTCTTAGTCCCTTTTGAATAATAATGATGGTGGAGATGGATTACGCTAAGAAAATGCA 
GAAATGTCATGAATACGTTGAAGCACTTGAAGAAGAACAGAAGAAAATCCAAGTCTTTCA 
ACGCGAGCTTCCTTTATGTTTAGAGCTTGTCACTCAAGCGATCGAAGCTTGTCGGAAGGA 
GTTATCTGGTACGACGACAACTA(^TCAGAACAGTGTTCAGAACAGACCACAAGTGTTTG 
TGGTGGTCCTGTCTTTGAAGAGTTTATTCCTATCAAGAAAATTAGTTCCTTGTGTGAAG^ 
AGTACAAGAAGAAGAAGAAGAAGATGGTGAACATGAATCTTCTCCAGAACTTGTGAATAA 
TAAGAAATCAGATTGGCTTAGATCTGTTCAGCTATGGAATCATTCACCGGATCTAAATCC 
AAAAGAGGAGCGTGTAGCTAAGAAAGCGAAAGTGGTGGAGGTGAAACCAAAAAGCGGTGC 
GTTTCAGCCGTTTCAAAAGCGCGTTTTGGAGACTGATTTGCAACCGGCGGTGAAAGTAGC 
TAGTTCGATGCCAGCGACGACGACGAGTTCTACGACGGAAACTTGTGGTGGTAAAAGTGA 
TTTGATTAAAGCTGGAGATGAGGAAAGACGGATAGAGCAGCAGCAATCGCAGTCGCATAC 
GCATAGAAAACAAAGGCGGTGCTGGTCGCCGGAATTACACCGTCGATTCCTAAACGCGCT
TCAGCAGCTTGGAGGATCTCATGTTGCTACACCAAAGCAAATCAGGGATCACATGAAGGT 
TGATGGATTAACAAACGACGAAGTTAAAAGCCATTTACAGAAATATAGACTTCACACAAG 
AAGGCCAGCAGCAACATCCGTGGCGGCACAAAGTACCGGGAATCAGCAACAACCACAATT 
TGTGGTGGTTGGAGGCATATGGGTACCATCGTCACAAGATTTTCCACCACCGTCCGATGT 
AGCCAACAAGGGTGGTGTATATGCTCCGGTTGCGGTGGCGCAATCTCCAAAACGTTCGTT 
GGAGAGAAGTTGCAACTCGCCGGCGGCATCTTCCTCTACAAATACAAATACTTCTACTCC 
TGTGTCATAATCTGATAGTCATACTATAATCATCTCCTGATGTTGATTTTGGTGTAGGTT 
TGAAAATGTTTATGTGAATGTAA 

>G2465 Amino Acid Sequence (conserved domain in AA coordinates: 219-269)
MMVEMDYAKKMQKCHEYVEALEEEQKKIQVFQRELPLCLELVTQAIEACRKELSGTTTTT
SEQCSEQTTSVCGGPVFEEFIPIKKISSLCEEVQEEEEEDGEHESSPELVNNKKSDWLRS
VQLWNHSPDLNPKEERVAKKAKVVEVKPKSGAFQPFQKRVLETDLQPAVKVASSMPATTT

SSTTETCGGKSDLIKAGDEERRIEQQQSQSHTHRKQRRCWSPELHRRFLNALQQLGGSHV 
ATPKQIRDHMKVDGLTNDEVKSHLQKYRLHTRRPAATSVAAQSTGNQQQPQFVVVGGIWV 
PSSQDFPPPSDVANKGGVYAPVAVAQSPKRSLERSCNSPAASSSTNTNTSTPVS* 
>G2583 (38..607)

CAAATCAGAAAATATAGAGTTTGAAGGAAACTAAAAGATGGTACATTCGAGGAAGTTCCG 

AGGTGTCCGCCAGCGACAATGGGGTTCTTGGGTCTCTGAGATTCGCCATCCTCTATTGAA 

GAGAAGAGTGTGGCTTGGAACTTTCGAAACGGCAGAAGCGGCTGCAAGAGCATACGACCA 

AGCGGCTCTTCTAATGAACGGCCAAAACGCTAAGACCAATTTCCCTGTCGTAAAATCAGA

GGAAGGCTCCGATCACGTTAAAGATGTTAACTCTCCGTTGATGTCACCAAAGTCATTATC 

TGAGCTTTTGAACGCTAAGCTAAGGAAGAGCTGCAAAGACCTAACGCCTTCTTTGACGTG 

TCTCCGTCTTGATACTGACAGTTCCCACATTGGAGTTTGGCAGAAACGGGCCGGGTCGAA 

AACAAGTCCGACTTGGGTCATGCGCCTCGAACTTGGGAACGTAGTCAACGAAAGTGCGGT

TGACTTAGGGTTGACTACGATGAACAAACAAAACGTTGAGAAAGAAGAAGAAGAAGAAGA 

AGCTATTATTAGTGATGAGGATCAGTTAGCTATGGAGATGATCGAGGAGTTGCTGAATTG 

GAGTTGACTTTTGACTTTAACTTGTTGCAAGTCCACAAGGGGTAAGGGTTTTC 

>G2583 Amino Acid Sequence (domain in AA coordinates: 4-71)

MVHSRKFRGVRQRQWGSWVSEIRHPLLKRRVWLGTFETAEAAARAYDQAALLMNGQNAKT

NFPVVKSEEGSDHVKDVNSPLMSPKSLSELLNAKLRKSCKDLTPSLTCLRLDTDSSHIGV

WQKRAGSKTSPTWVMRLELGNVVNESAVDLGLTTMNKQNVEKEEEEEEAIISDEDQLAME

MIEELLNWS* 

>G2724 (1..651) 

ATGGAAATAGAAATAAGGAGAGGTCCATGGACTGTGGAAGAAGACATGAAGCTCGTCAGT 
TACATTTCTCTTCACGGTGAAGGAAGATGGAACTCCCTCTCTCGTTCTGCTGGACTGAAT
AGAACGGGGAAAAGTTGCAGATTGCGGTGGCTAAATTATCTCCGGCCGGATATCCGCCGT 
GGAGACATATCCCTTCAAGAACAATTTATCATCCTTGAACTCCATTCTCGTTGGGGAAAT 
CGGTGGTCAAAGATTGCTCAACATTTACCGGGAAGAACAGATAACGAGATAAAGAATTAT 
TGGAGAACACGTGTTCAAAAGCATGCAAAACTTCTAAAATGTGACGTGAACAGCAAGCAA 
TTCAAAGACACCATCAAACATCTCTGGATGCCTCGTCTCATCGAGAGAATCGCCGCCACT 
CAAAGTGTCCAATTTACCTCTAACCACTACTCGCCTGAGAACTCCAGCGTCGCCACCGCC 
ACGTCATCAACGTCGTCGTCTGAGGCTGTGAGATCGAGTTTCTACGGTGGTGATCAGGTG 
GAATTTGGAACGTTGGATCATATGACAAATGGTGGTTATTGGTTCAACGGCGGAGATACG 
TTTGAAACTTTGTGTAGTTTTGACGAGCTCAACAAGTGGCTCATACAGTAG

>G2724 Amino Acid Sequence (conserved domain in AA coordinates: 7-113)
MEIEIRRGPWTVEEDMKLVSYISLHGEGRWNSLSRSAGLNRTGKSCRLRWLNYLRPDIRR

GDISLQEQFIILELHSRWGNRWSKIAQHLPGRTDNEIKNYWRTRVQKHAKLLKCDVNSKQ
FKDTIKHLWMPRLIERIAATQSVQFTSNHYSPENSSVATATSSTSSSEAVRSSFYGGDQV 
EFGTLDHMTNGGYWFNGGDTFETLCSFDELNKWLIQ* 
>G377 (1..396) 

atgggtctctcgcattttccaacagcgtcagaaggagtactaccacttctggtgatgaac 
acggttgtttcaatcactctgttgaagaacatggtgaggtctgtttttcaaattgttgca 
tccgagactgaatcttccatggagatagacgacgagcctgaagatgattttgttactaga 
agaatctcgataacacagttcaagtctctatgtgagaacatagaagaggaagaagaagag 
aaaggtgtggagtgttgtgtgtgcctttgtgggtttaaagaggaagaggaagtgagtgag 
ttggtttcttgcaagcatttcttccacagagcttgtctagacaactggtttggtaataac 
cacaccacatgccctctttgcaggtccattctctag

>G377 Amino Acid Sequence (domain in AA coordinates: 85-128)

MGLSHFPTASEGVLPLLVMNTVVSITLLKNMVRSVFQIVASETESSMEIDDEPEDDFVTR

RISITQFKSLCENIEEEEEEKGVECCVCLCGFKEEEEVSELVSCKHFFHRACLDNWFGNN 

HTTCPLCRSIL*
>G428 (97..1032)

TTACTTTTGTGTTTCTTCATATTCTTCAGAAGCAAGCACAAGGCTAGGGATCGAAGAAGC 

GGCGATCACTGATCGTATCTCACTACGATCACATTAATGGATAGAATGTGTGGTTTCCGC 

TCGACGGAAGACTATTCGGAGAAAGCGACGTTGATGATGCCGTCCGATTATCAGTCTTTG 

ATTTGTTCAACCACCGGAGACAATCAAAGACTGTTTGGATCCGACGAACTCGCTACCGCT 

TTGTCCTCGGAGTTGCTTCCGCGTATTCGAAAAGCTGAGGATAATTTCTCTCTTAGTGTC 

ATCAAATCCAAAATCGCTTCTCATCCTTTGTATCCTCGCTTACTCCAAACCTACATCGAT 

TGCCAAAAGGTGGGAGCGCCTATGGAAATAGCGTGTATATTGGAAGAGATTCAGCGAGAG 

AACCATGTGTACAAGAGAGATGTTGCTCCATTATCTTGCTTTGGAGCTGATCCTGAGCTT 

GATGAATTCATGGAAACCTACTGTGATATATTGGTTAAATACAAAACCGATCTTGCGAGG 

CCGTTCGACGAGGCTACAACTTTCATAAACAAGATTGAAATGCAGCTTCAGAACTTGTGC 

ACTGGTCCAGCGTCTGCTACAGCTCTTTCAGATGATGGTGCGGTTTCATCTGACGAGGAA 

CTGAGAGAAGATGATGACATAGCAGCGGATGACAGCCAACAAAGAAGCAATGACCGCGAT 

CTGAAGGACCAGCTACTACGCAAATTTGGTAGCCATATCAGTTCATTGAAACTCGAGTTC 

TCTAAAAAGAAGAAGAAAGGGAAGCTACCAAGAGAAGCAAGACAAGCGTTGCTCGATTGG 

TGGAATGTTCATAATAAATGGCCTTACCCTACTGAAGGCGACAAAATAGCTCTGGCTGAA 

GAAACAGGTTTGGATCAAAAACAAATCAACAATTGGTTTATAAACCAAAGGAAACGCCAT

TGGAAGCCTTCGGAGAACATGCCGTTTGATATGATGGACGATTCTAATGAAACATTCTTT 

ACCGAGGAATGAAAAGAGAGACATGGGATTGTGCATTGTATAATTTTTACACTGTTTTCC 

CAAGAAAAGAAAACAGTAAAAAGCTTTTGGTAAATGGGACATCATCGCGAATGAATGGAA 

CCAGTTAGCCAAAACGGTCAAGGGCGTGGCGTAACGAGACATTGTATTGGAAATAGTGGC 

AATATTATGTCACTAATCTTCCAATGGTCCAAAATGATAGATTTCTTATTTGTATTGAAC 

CTTACTTAGATAGCTGATGTGTCAACTAAATAATTTATTTTCATCCTTATACTACTTGTA 

TCAATGTCTCTAATTGATO^TTGTTGCTTGCTATTCAAAAAAAAAAAAAAAAAAAAAA 

>G428 Amino Acid Sequence (domain in AA coordinates: 229-292) 

MDRMCGFRSTEDYSEKATLMMPSDYQSLICSTTGDNQRLFGSDELATALSSELLPRIRKA 

EDNFSLSVIKSKIASHPLYPRLLQTYIDCQKVGAPMEIACILEEIQRENHVYKRDVAPLS 

CFGADPELDEFMETYCDILVKYKTDLARPFDEATTFINKIEMQLQNLCTGPASATALSDD 

GAVSSDEELREDDDIAADDSQQRSNDRDLKDQLLRKFGSHISSLKLEFSKKKKKGKLPRE 

ARQALLDWWNVHNKWPYPTEGDKIALAEETGLDQKQINNWFINQRKRHWKPSENMPFDMM 

DDSNETFFTEE* 

>G447 (241..3501) 

CTTTTTAAGAGCTTAAAAATTTGCTTTGT^AGCTTCAAATATTCTTATGAACTAAAAAGAA 
GAAAAAAGCTTTTGTTTCTTTTTCCTTAGCAGCAGAATGATTTT^ 

ACTATTTAGTTTCTCTCGTGCTCTTCTCTTGAGCAAATACAGATTCGTTAATTTTGCTGA 
AGAAGAAGAACTCTGTTTCTTCCCTGCACCAAACCAATTTTTTCGTTCTTTCTATAAACC 
ATGAAAGCTCCATCAAATGGATTTCTTCCAAGTTCCAACGAAGGAGAGAAGAAGCCAATC 
AATTCTCAACTATGGCACGCTTGTGCAGGGCCTTTAGTTTCATTACCTCCTGTGGGAAGT 
CTTGTGGTTTACTTCCCTCAAGGACACAGCGAGCAAGTTGCAGCATCGATGCAGAAGCAA 
ACAGATTTTATACCAAATTACCCAAATCTTCCTTCTAAGCTGATTTGCTTGCTTCACAGT 
GTTACATTACATGCTGATACCGAAACAGATGAAGTCTATGCACAAATGACTCTTCAACCT 
GTGAATAAGTATGATAGAGAAGCATTGCTAGCTTCTGATATGGGCTTGAAGCTAAACAGA 
CAACCTACTGAGTTTTTTTGCAAGACTCTT^ 

TTCTCTGTACCGCGTCGTGCAGCTGAGAAAATATTCCCTCCTCTTGATTTCTCGATGCAA 
CCGCCTGCGCAAGAGATTGTAGCTAAAGATTTACATGATACTACATGGACTTTCAGACAT 
ATCTATCGAGGCCAACCAAAAAGACACTTGCTTACCACAGGTTGGAGCGTTTTTGTTAGC 
ACAAAGAGACTATTTGCGGGTGATTCAGTTTTGTTTGTAAGAGATGAGAAATCACAGCTG 
ATGTTGGGTATAAGACGTGCAAATAGACAAACTCCGACTCTTTCCTCATCGGTCATATCC 
AGCGACAGTATGCACATTGGGATACTTGCAGCTGCAGCTCATGCTAATGCCAATAGTAGC 
CCTTTTACCATCTTCTTCAATCCAAGGGCAAGTCCTTCAGAGTTTGTAGTTCCTTTAGCC 
AAATACAACAAAGCCTTATACGCTCAAGTATCTCTAGGAATGAGATTCCGGATGATGTTT 
GAGACTGAGGATTGTGGGGTTCGTAGATATATGGGTACAGTCACAGGTATTAGTGATCTT 
GACCCTGTAAGATGGAAAGGCTCACAATGGCGTAATCTTCAGGTAGGATGGGATGAATCA 
ACAGCTGGAGATAGGCCAAGCCGAGTATCCATATGGGAAATCGAACCCGTCATAACTCCT 
TTTTACATATGTCCTCCTCCATTTTTCAGACCTAAGTACCCGAGGCAACCCGGGATGCCA 
GATGATGAGTTAGACATGGAAAATGCTTTCAAAAGAGCAATGCCTTGGATGGGAGAAGAC 
TTTGGGATGAAGGACGCACAGAGTTCGATGTTCCCTGGTTTAAGTCTAGTTCAATGGATG 
AGTATGCAGCAAAACAATCCATTGTCAGGTTCTGCTACTCCTCAGCTCCCGTCCGCGCTC 
TCATCTTTTAACCTACCAAACAATTTTGCTTCCAACGACCCTTCCAAGCTGTTGAACTTC 
CAATCCCCAAACCTCTCTTCCGCAAATTCCCAATTCAACAAACCGAACACGGTTAACCAT 
ATCAGCCAACAGATGCAAGCACAACCAGCCATGGTGAAATCTCAACAACAACAACAACAA 
CAACAACAACAACACCAACACCAACAACAACAACTGCAACAACAACAACAACTACAGATG 
TCACAGCAACAGGTGCAGCAACAAGGGATTTATAACAATGGTACGATTGCTGTTGCTAAC 
CAAGTCTCTTGTCAAAGTCCAAACCAACCTACTGGATTCTCTCAGTCTCAGCTTCAGCAG 
CAGTCAATGCTCCCTACTGGTGCTAAAATGACACACCAGAACATAAATTCTATGGGGAAT 
AAAGGCTTGTCTCAJUVTGACATCGTTTGCGCAAGAAATGCAGTTTCAGCAGCAACTGGAA 
ATGCATAACAGTAGCCAGTTATTAAGAAACCAGCAAGAACAGTCCTCTCTCCATTCATTA 
CAACAAAATCTGTCCCAAAATCCTCAGCAACTCCAAATGCAACAACAATCATCAAAACCA 
AGTCCTTCACAACAGCTTCAGTTGCAGCTACTGCAGAAGCTACAGCAGCAGCAACAGCAG 
CAGTCGATTCCTCCAGTAAGCTCATCCTTACAGCCACAATTATCAGCGTTGCAGCAGACA 
CAAAGCCATCAATTGCAACAACTTCTGTCGTCTCAAAATCAACAGCCCTTGGCACATGGT 
AATAACAGCTTCCCAGCTTCAACTTTCATGCAGCCTCCACAGATTCAGGTGAGTCCTCAG 
CAGCAAGGACAGATGAGTAACAAAAATCTTGTAGCCGCTGGAAGATCACATTCTGGCCAC 
ACAGATGGAGAAGCTCCTTCTTGTTCAACCTCACCTTCCGCCAATAACACGGGACATGAT 
AATGTTTCACCGACAAATTTCCTGAGCAGAAATCAACAGCAAGGACAAGCTGCATCTGTA 
TCTGCATCTGATTCAGTCTTTGAGCGCGCAAGCAATCCGGTCCAAGAGCTTTATACAAAA 
ACTGAGAGCCGGATCAGTCAAGGCATGATGAATATGAAGAGTGCTGGTGAACATTTCAGA 
TTTAAAAGCGCGGTAACAGATCAAATCGATGTATCCACAGCGGGAACGACGTACTGTCCT 
GATGTTGTTGGCCCTGTACAGCAGCAACAAACTTTCCCACTACCATCATTTGGTTTTGAT 
GGAGACTGCCAATCTCATCATCCAAGAAACAACTTAGCTTTCCCTGGTAATCTCGAAGCC 
GTAACTTCTGATCCACTCTATTCTCAAAAGGACTTTCAAAACTTGGTTCCCAACTATGGC 
AACAC^CCAAGAGACATTGAGACGGAGCTGTCCAGTGCTGCAATCAGTTCTCAGTCATTT 
GGTATTCCCAGCATTCCCTTTAAGCCCGGATGTTCAAATGAGGTTGGCGGCATCAATGAT 
TCAGGAATCATGAATGGTGGAGGACTGTGGCCCAATCAGACTCAACGAATGCGAACATAT 
ACAAAGGTTCAAAAACGAGGGTCAGTAGGTAGATCAATAGATGTTACCCGTTATAGCGGC 
TATGATGAACTTAGGCATGACTTAGCGAGAATGTTTGGCATCGAAGGACAGCTCGAAGAT 
CCGCTAACCTCTGATTGGAAACTCGTCTACACCGATCACGAAAACGATATTTTACTAGTT 
GGTGATGATCCTTGGGAAGAGTTTGTGAACTGCGTGCAGAACATAAAGATACTATCATCA 
GTAGAAGTTCAGCAAATGAGCTTAGACGGAGATCTTGCAGCTATCCCAACCACAAACCAA 
GCCTGCAGCGAAACAGACAGCGGAAATGCTTGGAAAGTACACTATGAAGACACTTCTGCT 
GCAGCTTCTTTCT^CAGATAGAAATAAAAAGATGCAAATATACCAAGTCAACTTACATTA 
TCATTCGAGGCCATCGCAAAGTACATGTTTTTT^ 

ACTGAGAAGAAGAAGATACTGCACGGTATATAAACATTTTTATAGGACAGTGATTTGATT 
TTTCATTCTAACTTGATGTTGTTGTACTTT^ 

TGCTTGACAAGTCTATGAGGAGCATATCTTATACAGAGATACTAAGATGTAATGTTAATG 
TAACTAAACAATTACCTTC^TTAATCATGAATCCTTTGGTCGTTTAAAA 

>G447 Amino Acid Sequence (conserved domain in AA coordinates: 22-356) 

MKAPSNGFLPSSNEGEKKPINSQLWHACAGPLVSLPPVGSLVVYFPQGHSEQVAASMQKQ 

TDFIPNYPNLPSKLICLLHSVTLHADTETDEVYAQMTLQPVNKYDREALLASDMGLKLNR 

QPTEFFCKTLTASDTSTHGGFSVPRRAAEKIFPPLDFSMQPPAQEIVAKDLHDTTWTFRH 

IYRGQPKRHLLTTGWSVFVSTKRLFAGDSVLFVRDEKSQLMLGIRRANRQTPTLSSSVIS 

SDSMHIGILAAAAHANANSSPFTIFFNPRASPSEFVVPLAKYNKALYAQVSLGMRFRMMF 

ETEDCGVRRYMGTVTGISDLDPVRWKGSQWRNLQVGWDESTAGDRPSRVSIWEIEPVITP 

FYICPPPFFRPKYPRQPGMPDDELDMENAFKRAMPWMGEDFGMKDAQSSMFPGLSLVQWM 

SMQQNNPLSGSATPQLPSALSSFNLPNNFASNDPSKLLNFQSPNLSSANSQFNKPNTVNH 

ISQQMQAQPAMVKSQQQQQQQQQQHQHQQQQLQQQQQLQMSQQQVQQQGIYNNGTIAVAN 
QVSCQSPNQPTGFSQSQLQQQSMLPTGAKMTHQNINSMGNKGLSQMTSFAQEMQFQQQLE 
MHNSSQLLRNQQEQSSLHSLQQNLSQNPQQLQMQQQSSKPSPSQQLQLQLLQKLQQQQQQ 
QSIPPVSSSLQPQLSALQQTQSHQLQQLLSSQNQQPLAHGNNSFPASTFMQPPQIQVSPQ 
QQGQMSNKNLVAAGRSHSGHTDGEAPSCSTSPSANNTGHDNVSPTNFLSRNQQQGQAASV 
SASDSVFERASNPVQELYTKTESRISQGMMNMKSAGEHFRFKSAVTDQIDVSTAGTTYCP 
DVVGPVQQQQTFPLPSFGFDGDCQSHHPRNNLAFPGNLEAVTSDPLYSQKDFQNLVPNYG 
NTPRDIETELSSAAISSQSFGIPSIPFKPGCSNEVGGINDSGIMNGGGLWPNQTQRMRTY 
TKVQKRGSVGRSIDVTRYSGYDELRHDLARMFGIEGQLEDPLTSDWKLVYTDHENDILLV 
GDDPWEEFVNCVQNIKILSSVEVQQMSLDGDLAAIPTTNQACSETDSGNAWKVHYEDTSA 
AASFNR* 

>G464 (41..760) 

CTCTGCTGGTATCATTGGAGTCTAGGGTTTTGTTATTGACATGCGTGGTGTGTCAGAATT 
GGAGGTGGGGAAGAGTAATCTTCCGGCGGAGAGTGAGCTGGAATTGGGATTAGGGCTCAG 
CCTCGGTGGTGGCGCGTGGAAAGAGCGTGGGAGGATTCTTACTGCTAAGGATTTTCCTTC 
CGTTGGGTCTAAACGCTCTGCTGAATCTTCCTCTCACCAAGGAGCTTCTCCTCCTCGTTC 
AAGTCAAGTGGTAGGATGGCCACCAATTGGGTTACACAGGATGAACAGTTTGGTTAATAA 
CCAAGCTATGAAGGCAGCAAGAGCGGAAGAAGGAGACGGGGAGAAGAAAGTTGTGAAGAA 
TGATGAGCTCAAAGATGTGTCAATGAAGGTGAATCCGAAAGTTCAGGGCTTAGGGTTTGT 
TAAGGTGAATATGGATGGAGTTGGTATAGGCAGAAAAGTGGATATGAGAGCTCATTCGTC 
TTACGAAAACTTGGCTCAGACGCTTGAGGAAATGTTCTTTGGAATGACAGGTACTACTTG 
TCGAGAAAAGGTTAAACCTTTAAGGCTTTTAGATGGATCATCAGACTTTGTACTCACTTA 
TGAAGATAAGGAAGGGGATTGGATGCTTGTTGGAGATGTTCCATGGAGAATGTTTATCAA 
CTCGGTGAAAAGGCTTCGGATCATGGGAACCTCAGAAGCTAGTGGACTAGCTCCAAGACG 
TCAAGAGCAGAAGGATAGACAAAGAAACAACCCTGTTTAGCTTCCCTTCCAAAGCTGGCA 
TTGTTTATGTATTGTTTGAGGTTTGCAATTTACTCGATACTTTTTGAAGAAAGTATTTTG 
GAGAATATGGATT^AAAGCATGCAGAAGCTTAGATATGATTTGAATCCGGTTTTCGGATAT 
GGTTTTGCTTAGGTCATTCAATTCGTAGTTTTCCAGTTTGTTTCTTCTTTGGCTGTGTAC 
CAATTATCTATGTTCTGTGAGAGAAAGCTCTT 

>G464 Amino Acid Sequence (domain in AA coordinates: 20-28, 71-82, 126-142, 187-224) 

MRGVSELEVGKSNLPAESELELGLGLSLGGGAWKERGRILTAKDFPSVGSKRSAESSSHQ 

GASPPRSSQVVGWPPIGLHRMNSLVNNQAMKAARAEEGDGEKKVVKNDELKDVSMKVNPK 

VQGLGFVKVNMDGVGIGRKVDMRAHSSYENLAQTLEEMFFGMTGTTCREKVKPLRLLDGS 

SDFVLTYEDKEGDWMLVGDVPWRMFINSVKRLRIMGTSEASGLAPRRQEQKDRQRNNPV* 
>G557 (192..698) 

CAGAGATCTGACGGCGGTAGCAGAGTAATCTATTCCTTCCCAAAATGTCTCGCAATTAGA 

TTCTTTCCAAGTTCTTCTGTAAATCCCAAGTCCCGCTCTTTTCCTCTTTATCCTTTTCAC 

CAGCTTCGCTACTAAGACAACAAATCTTTCCCTCTCTCTCTCGCCTGATCGATCTTCAAA 

GAGTAAGAAAAATGCAGGAACAAGCGACTAGCTCTTTAGCTGCAAGCTCTTTACCATCAA 

GCAGCGAGAGGTCATCAAGCTCTGCTCCACATTTGGAGATCAAAGAAGGAATTGAAAGCG 

ATGAGGAGATACGGCGAGTGCCGGAGTTTGGAGGAGAAGCTGTCGGAAAAGAAACTTCCG 

GTAGAGAATCTGGATCGGCX3ACCGGTCAGGAGCGGACACAGGCGACTGTCGGAGAAAGTC 

AAAGGAAGCGAGGGAGGACACCGGCGGAGAAAGAGAACAAGCGGCTGAAGAGGTTGTTGA 

GGAACAGAGTTTCAGCTCAGCAAGCAAGAGAGAGGAAAAAGGCTTACTTGAGCGAGTTGG 

AAAACAGAGTGAAAGACTTGGAGAACAAAAACTCTGAACTTGAAGAGCGACTCTCTAOT 

TTCAGAACGAGAACCAGATGCTTAGACA-TATTCTGAAGAACACAACAGGAA^ 

GAGGTGGTGGTGGTTCTAATGCTGATGCAAGCCTTTGATCTCCTTCTTCTTCTTGTGTTA 

TATTTTTGTGGATAAAATTTAGAGAGAATTGTATC^ 

GGGATGTGAGAGCTAATATTGCAATTGTAGACCAAGTTCTCTTAAAAAAAAAAAAAAA^ 
AA 

>G557 Amino Acid Sequence (domain in AA coordinates: 90-150) 

MQEQATSSLAASSLPSSSERSSSSAPHLEIKEGIESDEEIRRVPEFGGEAVGKETSGRES 

GSATGQERTQATVGESQRKRGRTPAEKENKRLKRLLRNRVSAQQARERKKAYLSELENRV 

KDLENKNSELEERLSTLQNENQMLRHILKNTTGNKRGGGGGSNADASL* 

>G577 (44..2155) 

AAAAACAGACTGAGAGAGAGAGAGAGAGAGTGTGTTGTTGGCCATGGGATGCACGGCCTC 
CAAGCTCGACAGTGAGGATGCTGTCCGTCGCTGCAAGGAGCGGCGCCGTCTTATGAAGGA 
CGCCGTCTACGCTCGTCACCATCTCGCCGCCGCTCACTCTGACTACTGCCGCTCCCTTCG 
TCTCACTGGCTCTGCCCTCTCCTCCTTCGCCGCCGGCGAGCCCCTCTCCGTCTCCGAGAA 
TACTCCCGCTGTTTTTCTCCGCCCTTCCTCCAGTCAGGACGCGCCACGTGTCCCTTCTTC 
CCATTCCCCAGAACCCCCTCCTCCGCCCATCCGCAGCAAGCCTAAGCCTACTAGGCCTAG 
GAGGCTTCCACACATTCTCTCCGACTCCTCTCCTTCTTCCTCTCCTGCCACCAGTTTCTA 
TCCCACTGCTCACCAGAACTCTACTTACTCTCGCTCTCCATCTCAAGCTTCCTCTGTCTG 
GAACTGGGAGAATTTCTACCCTCCCTCTCCCCCCGACTCCGAGTACTTCGAACGCAAAGC 
TCGCCAGAACCAC^GCACCGTCCTCCTTCCGACTACGACGCCGAAACTGAAAGATCCGA 
CCACGATTACTGCCACTCACGGAGAGATGCCGCCGAGGAAGTTCACTGCAGCGAGTGGGG 
CGACGACCACGACCGTTTCACTGCCACCTCTTCGTCCGACGGAGATGGGGAGGTCGAAAC 
TCACGTTTCCAGATCCGGTATTGAAGAAGAGCCTGTGAAACAACCACATCAAGACCCAAA 
TGGCAAAGAGCACTCTGACCATGTTACCACTTCTTCCGACTGCTACAAGACCAAATTGGT 
GGT7VAGGCACAAGAATTTGAAGGAGATCCTTGACGCCGTTCAAGACTACTTCGACAAGGC 
TGCCTCCGCTGGGGACCAGGTCTCCGCCATGCTTGAGATCGGCCGGGCTGAGCTCGACCG 
CAGCTTCAGCAAGCTGAGGAAGACGGTGTATCATTCAAGCAGTGTGTTCAGCAACTTGAG 
CGCAAGCTGGACCTCAAAACCCCCATTGGCAGTCAAATACAAGCTCGATGCATCTACCCT 
GAATGATGAACAAGGCGGCCTCAAGAGCCTCTGCTCCACTCTAGACCGACTCCTCGCTTG 
GGAAAAGAAGCTTTATGAGGATGTCAAGGCAAGAGAAGGAGTTAAGATTGAGCACGAGAA 
GAAGCTGTCTGCGCTGCAGAGTCAGGAGTATAAGGGAGGTGATGAATCCAAGCTAGACAA 
GACTAAAACTTCCATAACCAGACTGCAATCACTCATCATTGTTTCTTCAGAAGCTGTTTT 
AACCACGTCTAATGCCATTCTCCGCCTCCGGGACACTGACCTTGTCCCTCAGCTTGTTGA 
ACTCTGCCACGGATTAATGTACATGTGGAAGTCAATGCACGAGTATCACGAAATCCAGAA 
CAACATCGTGCAACAAGTCCGTGGCCTGATCAACCAAACAGAGAGAGGTGAGTCAACATC 
AGAGGTACACCGGCAGGTGACGCGGGACCTAGAGTCAGCTGTGTCCTTGTGGCATTCGAG 
CTTCTGTCGCATCATTAAATTCCAGAGGGAGTTCATATGCTCTCTCCACGCATGGTTCAA 
GCTGAGCCTGGTTCCCCTGAGCAACGGAGACCCAAAGAAACAGCGGCCAGACTCATTTGC 
CTTGTGCGAGGAGTGGAAGCAGAGCCTGGAACGGGTGCCTGACACAGTGGCGTCAGAAGC 
CATAAAGAGCTTTGTAAACGTGGTACATGTGATATCAATAAAGCAGGCGGAAGAGGTGAA 
GATGAAGAAACGCACGGAGAGTGCAGGAAAGGAGCTGGAGAAGAAAGCATCCTCACTGAG 
GAGCATAGAGAGGAAGTACTACCAGGCATACTCGACGGTTGGGATAGGCCCTGGACCGGA 
GGTGTTGGACTCACGGGACCCGCTATCTGAGAAGAAATGTGAGCTGGCGGCATGTCAGAG 
GCAGGTGGAGGATGAGGTAATGAGGCACGTGAAGGCTGTGGAGGTGACACGAGCTATGAC 
TCTCAACAATCTACAAACCGGCCTGCCCAATGTATTCCAGGCCTTGACCAGCTTCTCATC 
TCTCTTCACTGAATCTCTCCAGACTGTCTGTTCTCGTTCCTACTCCATCAACTGATTATG 
TCCAAGTTTCTCATTTATTTTTAAGCTCTCATTACGTTGGTATCATGTAAATTTGAGGAT 
TGATTAAATTGAGTCTTGTGGTTTTGTGAGGACTCACAATCTTTCTCATTTAAAAAAAAA 
AAAAAAAAAA 

>G577 Amino Acid Sequence (domain in AA coordinates: TBD) 
MGCTASKLDSEDAVRRCKERRRLMKDAVYARHHLAAAHSDYCRSLRLTGSALSSFAAGEP 

LSVSENTPAVFLRPSSSQDAPRVPSSHSPEPPPPPIRSKPKPTRPRRLPHILSDSSPSSS 

PATSFYPTAHQNSTYSRSPSQASSVWNWENFYPPSPPDSEYFERKARQNHKHRPPSDYDA 

ETERSDHDYCHSRRDAAEEVHCSEWGDDHDRFTATSSSDGDGEVETHVSRSGIEEEPVKQ 

PHQDPNGKEHSDHVTTSSDCYKTKLVVRHKNLKEILDAVQDYFDKAASAGDQVSAMLEIG 

RAELDRSFSKLRKTVYHSSSVFSNLSASWTSKPPLAVKYKLDASTLNDEQGGLKSLCSTL 

DRLLAWEKKLYEDVKAREGVKIEHEKKLSALQSQEYKGGDESKLDKTKTSITRLQSLIIV 

SSEAVLTTSNAILRLRDTDLVPQLVELCHGLMYMWKSMHEYHEIQNNIVQQVRGLINQTE 

RGESTSEVHRQVTRDLESAVSLWHSSFCRIIKFQREFICSLHAWFKLSLVPLSNGDPKKQ 

RPDSFALCEEWKQSLERVPDTVASEAIKSFVNVVHVISIKQAEEVKMKKRTESAGKELEK 

KASSLRSIERKYYQAYSTVGIGPGPEVLDSRDPLSEKKCELAACQRQVEDEVMRHVKAVE 

VTRAMTLNNLQTGLPNVFQALTSFSSLFTESLQTVCSRSYSIN* 

>G674 (1..786) 

ATGGTGTTTAAATCAGAAAAATCAAACCGGGAAATGAAATCAAAGGAGAAGCAAAGGAAG 
GGATTATGGTCACCCGAGGAAGATGAGAAGCTTAGGAGTCATGTCCTCAAATATGGCCAT 
GGATGCTGGAGTACTATTCCTCTTCAAGCTGGATTGCAGAGGAATGGGAAGAGTTGTAGA 
TTAAGGTGGGTTAATTATTTAAGACCTGGACTTAAGAAGTCTTTATTCACTAAACAAGAG 
GiWVACTATACTTCTTTCACITCATTCCATGTTGGGT 

TTCTTACCAGGAAGAACCGACAACGAGATCAAAAACTATTGGCATTCTAATCTAAAGAAG 
GGTGTAACTTTGAAACAACATGAAACCACAAAAAAACATCAAACACCTTTA^ 
TCACTTGAGGCCTTGCAGAGTTCAACTGAAAGATCTTCTTCATCTATCAATGTCGGAGAA 
ACGTCTAATGCTCAAACCTCAAGCTTTTCGCCAAATCTCGTGTTCTCGGAATGGTTAGAT 
CATAGTTTGCTTATGGATCAGTCACCTCAAAAGTCTAGCTATGTTCAAAATCTTGTTTTA 
CCGGAAGAGAGAGGATTCATTGGACCATGTGGCCCTCGTTATTTGGGAAACGACTCTTTG 
CCTGATTTCGTGCCAAATTCAGAATTTTTGTTGGATGATGAGATATCATCTGAGATCGAG 

ATGTAA 

>G674 Amino Acid Sequence (domain in AA coordinates: 20-120) 

MVFKSEKSNREMKSKEKQRKGLWSPEEDEKLRSHVLKYGHGCWSTIPLQAGLQRNGKSCR 

LRWVNYLRPGLKKSLFTKQEETILLSLHSMLGNKWSQISKFLPGRTDNEIKNYWHSNLKK 

GVTLKQHETTKKHQTPLITNSLEALQSSTERSSSSINVGETSNAQTSSFSPNLVFSEWLD 

HSLLMDQSPQKSSYVQNLVLPEERGFIGPCGPRYLGNDSLPDFVPNSEFLLDDEISSEIE 

FCTSFSDNFLFDGLINELRPM* 

>G736 (1..513) 

ATGGCGACTCAAGATTCTCAAGGGATTAAACTCTTTGGCAAAACTATTGCATTTAACACT 
CGAACAATAAAAAATGAAGAAGAGACACACCCGCCGGAGCAAGAAGCCACAATAGCCGTT 
AGATCATCATCATCATCGGATCTGACGGCCGAGAAGCGTCCGGATAAGATCATAGCATGT 
CCAAGATGCAAGAGCATGGAGACAAAGTTCTGTTACTTCAACAACTACAACGGTAATCAG 
CCTCGACACTTTTGTAAAGGCTGCCACCGTTACTGGACCGCCGGTGGTGCACTCCGGAAC 
GTTCCCGTCGGCGCCGGTCGTCGGAAGTCCAAACCACCTGGTCGTGTCGTGGTTGGTATG 
CTTGGAGATGGAAATGGTGTTCGCCAAGTCGAGCTTATAAATGGCTTGCTCGTTGAGGAG 
TGGCAGCATGCCGCAGCCGCAGCTCACGGTAGTTTCCGGCATGATTTTCCCATGAAGCGG 
CTCCGGTGTTACTCCGACGGTCAATCGTGCTGA 

>G736 Amino Acid Sequence (domain in AA coordinates: 54-111) 
MATQDSQGIKLFGKTIAFNTRTIKNEEETHPPEQEATIAVRSSSSSDLTAEKRPDKIIAC 
PRCKSMETKFCYFNNYNGNQPRHFCKGCHRYWTAGGALRNVPVGAGRRKSKPPGRVVVGM 
LGDGNGVRQVELINGLLVEEWQHAAAAAHGSFRHDFPMKRLRCYSDGQSC* 
>G903 (96..1496) 

CCCGGGTCGACCCACGCGTCCGCTCTCTCTCTCTGAACTATACAAAAACCTACTTTTAAT 

TTCTCTTCCAAGAAGTCAAGAACCCAGAAGAAGACATGACAAGTGAAGTTCTTCAAACAA 

TCTCAAGTGGATCAGGTTTTGCTCAGCCACAGAGCTCATCAACCCTGGATCATGATGAAT 

CTCTCATCAATCCTCCTCTTGTTAAGAAAAAGAGAAATCTCCCTGGAAATCCTGATCCGG 

AAGCTGAAGTGATAGCTTTATCCCCCACGACCTTGATGGCTACGAACCGGTTCCTATGTG 

AGGTATGTGGCAAAGGTTTCCAAAGAGACCAAAACTTACAGCTTCATCGGCGAGGACATA 

ATCTTCCATGGAAGTTGAAGCAGAGGACAAGCAAAGAAGTGAGAAAACGTGTCTACGTTT 

GCCCCGAGAAGACATGTGTCCACCATCACTCCTCTAGAGCTCTAGGCGATCTCACTGGAA 

TCAAAAAGCATTTTTGCCGGAAACACGGGGAGAAGAAGTGGACGTGCGAGAAATGTGCTA 

AGAGATACGCAGTCCAATCTGATTGGAAAGCTCATTCCAAGACTTGTGGTACTAGAGAGT 

ACCGTTGCGATTGTGGCACCATTTTCTCAAGGCGAGACAGCTTTATCACTCATAGAGCTT 

TCTGCGATGCCTTAGCGGAAGAAACCGCTAAGATAAACGCAGTGTCTCATCTCAACGGTT 

TAGCCGCGGCTGGAGCCCCAGGATCAGTTAATCTCAACTATCAATATCTCATGGGAACAT 

TCT^TCCCACCGCTTCAACCATTTGTACCACAACCGCAAACAAATCCAAACCAT 

AACATTTTCAGCCACCAACTTCTTCGTCGCTCTCTCTATGGATGGGACAAGATATCGCGC 

CGCCTCAACCGCAACCGGACTACGATTGGGTTTTTGGAAACGCTAAGGCAGCGTCTGCTT 

GCATTGATAATAATAATACTCACGATGAGCAGATTACGCAAAACGCAAACGC^VAGTTTGA 

CCACTACCACTACTCTCTCTGCCCCTTCTTTATTCAGCAGCGACCAACCACAAAACGCAA 

ACGCAAATTCAAACGTGAATATGTCCGCGACAGCTTTACTA 

GCGCTACTTCTACAACAACCGCAGCGACCAATGACCCATCAACGTTTCTTCAAAGTTO 

CGCTTAAATCCACGGATCAAACCACCAGTTATGACAGTGGCGAAAAGTTTTTTGCTTTGT 

TCGGGTCTAACAAGAACATTGGGTTAATGAGTCGTAGTCATGATCATCAAGAGATCGAGA 

ACGCTAGAAATGACGTTACGGTTGCGTCTGCCTTGGATGAATTACAGAATTACCCTTGGA 

AACGTAGAAGAGTTGATGGTGGAGGTGAAGTGGGTGGAGGAGGGCAAACTCGGGATTTCC 

TCGGGGTTGGTGTACAJ^CGTTGTGCCATCCATCGTCTATCAATGGATGGATTTGAAAGA 

GTTTAAAAATTTCGGGGTTAATGCATAAATTACGTAAAAGAAGAAGGAATCTTTTGTCAT 

TTCCACCATTTTCTAAGATAACATATGTATATGGTAATGGAAGTTGTTTTCTTTTATTAA 

TTCAATATTCTAAAACTTATGATATATGTATAATGAATGTGTTTATCTTCA7^A 

>G903 Amino Acid Sequence (domain in AA coordinates: 68-92) 

MTSEVLQTISSGSGFAQPQSSSTLDHDESLINPPLVKKKRNLPGNPDPEAEVIALSPTTL 

MATNRFLCEVCGKGFQRDQNLQLHRRGHNLPWKLKQRTSKEVRKRVYVCPEKTCVHHHSS 
RALGDLTGIKKHFCRKHGEKKWTCEKCAKRYAVQSDWKAHSKTCGTREYRCDCGTIFSRR 

DSFITHRAFCDALAEETAKINAVSHLNGLAAAGAPGSVNLNYQYLMGTFIPPLQPFVPQP 

QTNPNHHHQHFQPPTSSSLSLWMGQDIAPPQPQPDYDWVFGNAKAASACIDNNNTHDEQI 

TQNANASLTTTTTLSAPSLFSSDQPQNANANSNVNMSATALLQKAAEIGATSTTTAATND 

PSTFLQSFPLKSTDQTTSYDSGEKFFALFGSNNNIGLMSRSHDHQEIENARNDVTVASAL 

DELQNYPWKRRRVDGGGEVGGGGQTRDFLGVGVQTLCHPSSINGWI* 

>G917 (32..679) 

TTAGGGTTTTAGAAAGATAGATCGATTGAAGATGAGGAAAGGTAAGAGAGTGATAAAAAA 
GATAGAGGAGAAAATAAAGAGACAAGTGACATTCGCAAAGAGAAAGAAGAGTCTAATCAA 
GAAGGCATATGAACTCTCTGTTCTCTGCGATGTCCACCTTGGTCTCATCATCTTCTCTCA 
CTCCAACAGGCTCTACGATTTCTGCTCCAACTCTACCAGCATGGAGAATCTCATCATGAG 
ATACCAAAAGGAAAAAGAAGGTCAAACCACTGCAGAACACAGTTTCCACTCGGATCAGTG 
TTCAGATTGCGTGAAGACGAAGGAATCAATGATGAGAGAGATAGAGAATCTTAAGCTGAA 
TCTTCAATTGTACGACGGACATGGCTTGAATCTCTTGACCTACGACGAGCTCCTTTCTTT 
TGAGCTCCATCTCGAATCTTCTCTACAACATGCTCGAGCTCGCAAGTCTGAGTTCATGCA 
TCAGCAGCAGCAGCAACAAACAGATCAAAAGCTTAAGGGAAAAGAAAAGGGTCAAGGAAG 
CTCTTGGGAGCAGCTGATGTGGCAAGCAGAGAGACAGATGATGACGTGTCAAAGACAAAA 
AGATCCTGCGCCGGCGAATGAAGGAGGAGTTCCTTTTTTACGGTGGGGAACAACCCACCG 
ACGTTCTTCACCTCCTTAAGCTACCACAACCAGGCCCAAATACAGGCCCATAACTTCTCT 
CTATCTATAAAAAACAACTGATAGTAAAAAGTATTGACCCGGTTTGGTTCGGTTATGTTG 
ATACCAGACTATTAATTAACTTCGGTTAGACGTATTTACGACTTGATGCTATCTAGACCT 
TTTTGCCCTTCAAAAAAA 

>G917 Amino Acid Sequence (conserved domain in AA coordinates: 2-57) 

MRKGKRVIKKIEEKIKRQVTFAKRKKSLIKKAYELSVLCDVHLGLIIFSHSNRLYDFCSN 

STSMENLIMRYQKEKEGQTTAEHSFHSDQCSDCVKTKESMMREIENLKLNLQLYDGHGLN 

LLTYDELLSFELHLESSLQHARARKSEFMHQQQQQQTDQKLKGKEKGQGSSWEQLMWQAE 

RQMMTCQRQKDPAPANEGGVPFLRWGTTHRRSSPP* 

>G921 (116..1024) 

CCAAGATCGACTClTACTTCGAATCrCTCTCAACTTTCTTCCTCAGCTTACGGGAACTTC 
CACACATATACATCCACAAGAACCCATATCGAAGATTCATCCTACATATATTTACATGGA 
TCAGTACTCATCCTCTTTGGTCGATACTTCATTAGATCTCACTATTGGCGTTACTCGTAT 
GCGAGTTGAAGAAGATCCACCGACAAGTGCTTTGGTGQAAGAATTAAACCGAGTTAGTGC 
TGAGAACAAGAAGCTCTCGGAGATGCTAACTTTGATGTGTGACAACTACAACGTCTTGAG 
GAAGCAACTTATGGAATATGTT7UVCAAGAGCAACATAACCGAGAGGGATCAAATCAGCCC 
TCCCAAGAAACGCAAATCCCCGGCGAGAGAGGACGCATTCAGCTGCGCGGTTATTGGCGG 
AGTGTCGGAGAGTAGCTCAACGGATCAAGATGAGTATTTGTGTAAGAAGCAGAGAGAAGA 
GACTGTCGTGAAGGAGAAAGTCTCAAGGGTCTATTACAAGACCGAAGCTTCTGACACTAC 
CCTCGTTGTGAAAGATGGGTATCAATGGAGGAAATATGGACAGAAAGTGACTAGAGACAA 
TCC^TCTCCAAGAGCTTACTTCAAATGTGCTTGTGCTCC^GCTGTTCTGTCAAAAAGAA 
GGTTCAGAGAAGTGTGGAGGATCAGTCCGTGTTAGTTGC^CrrTATGAGGGTGAACACAA 
CCATCCAATGCCATCGCAGATCGATTCAAACAATGGCTTAAACCGCCACATCTCTCATGG 
TGGTTCAGCT^CAACACCCGTTGCAGCAAACAGAA 

TACCGTAGATATGATTGAATCGAAGAAAGTGACGAGCCCAACGTCAAGAATCGATTTTCC 
CCAAGTTCAGAAACTTTTGGTGGAGCAAATGGCTTC 

TACAGCAGCTTTAGCAGCAGCTGTTACCGGAAAATTGTATCAACAGAATCATACCGAGAA 

ATAGTTTAGCTTCAAATTCCGTTAGAGTTTTTAGATTTGAATTTGTCATGAGTAAGAGAA 

AGAGAGTAGATTATAATCCNTTGTGATACTGAAAAAAAAAAAAAAAAAAA 

>G921 Amino Acid Sequence (domain in AA coordinates: 146-203) 

MDQYSSSLVDTSLDIjTIGvTRMRvEEDPPTSAL 

LRKQLMEYVNKSNITERDQISPPKKRKSPAREDAFSCAVIGGVSESSSTDQDEYLCKKQR 
EETWKEKVSRVYYKTEASDTTLWKDGYQWROT^ 

KKVQRSVEDQSVLVATYEGEHNHPMPSQIDSNNGLNRHISHGGSASTPVAANRRSS 

VTTTOMIESKKVTSPTSRIDFPQVQKLLVEQ 

EK* 

>G922 (1..1449) 

ATGGTGGCTATGTTTCAAGAAGATAATGGAACATCTTCTGTAGCTTCATCACCACTTCAA 
GTCTTCTC^CTATGTCACTCAAC^GACCGACTCTCCrrCGCTTCITCATCTCCGTTTC^T 
TGTCTCAAAGATCTCAAACCAGAGGAGCGTGGTCTCTACTTAATCCACCTCTTGCTAACT 
TGTGCCAACCACGTGGCTTCAGGTAGCCTCCAAAACGCTAACGCAGCGCTCGAGCAGCTC 
TCTCACCTCGCTTCTCCTGACGGCGACACGATGCAGCGT^ATCGCTGCTTACTTCACCGAA 
GCGCTTGCTAACAGAATCCTTAAGTCCTGGCCTGGTCTTTACAAGGCTCTTAACGCAACT 
CAGACAAGAACTAACAATGTCTCTGAGGAGATTCATGTTAGAAGACTCTTCTTTGAGATG 
TTCCCGATACTCAAAGTCTCTTACTTGCTCACTAATCGAGCTATACTCGAGGCTATGG7VA 
GGAGAGAAGATGGTTCATGTGATTGATCTCGATGCTTCTGAGCCAGCTCAATGGCTTGCT 
TTGCTTCAAGCTTTTAACTCTAGGCCTGAAGGTCCACCTCATTTGAGAATCACTGGTGTT 
CATCACCAGAAGGAAGTGCTTGAACAAATGGCTCATAGACTCATTGAGGAAGCAGAGAAA 
CTCGATATCCCGTTTCAGTTTAATCCCGTTGTGAGTAGGTTAGACTGTTTAAATGTAGAA 
CAGTTGCGGGTTAAAACAGGAGAGGCCTTAGCCGTTAGCTCGGTTCTTCAATTGCATACC 
TTCTTGGCCTCTGATGATGATCTCATGAGAAAGAACTGCGCTTTACGGTTTCAGAACAAC 
CCTAGTGGAGTTGACTTGCAGAGAGTTCTAATGATGAGCCATGGCTCTGCAGCTGAGGCA 
CGTGAGAATGATATGAGTAACAACAATGGGTATAGCCCTAGCGGTGACTCGGCCTCATCT 
TTGCCTTTACCAAGTTCAGGAAGGACTGATAGCTTCCTCAATGCTATTTGGGGTTTGTCT 
CCAAAGGTCATGGTGGTCACTGAGCAAGACTCAGACCACAACGGCTCCACACTAATGGAG 
AGGCTATTAGAATCACTTTACACCTACGCAGCATTGTTTGATTGCTTGGAT^ACAAAAGTT 
CCAAGAACGTCTCAAGATAGGATCAAAGTGGAGAAGATGCTCTTCGGGGAGGAGATCAAG 
AACATCATATCCTGCGAGGGATTTGAGAGAAGAGAAAGACACGAGAAGCTTGAGAAATGG 
AGCCAGAGGATCGATTTGGCTGGTTTTGGGAATGTTCCTCTTAGCTATTATGCGATGTTG 
CAGGCTAGGAGATTGCTTCAAGGGTGCGGTTTTGATGGGTATAGAATCAAGGAAGAGAGC 
GGGTGCGCAGTAATTTGCTGGCAAGATCGACCTCTATACTCGGTATCAGCTTGGAGATGC 
AGGAAGTGA 

>G922 Amino Acid Sequence (conserved domain in AA coordinates: 225-242) 
MVAMFQEDNGTSSVASSPLQVFSTMSLNRPTLLASSSPFHCLKDLKPEERGLYLIHLLLT 
CANHVASGSLQNANAALEQLSHLASPDGDTMQRIAAYFTEALANRILKSWPGLYKALNAT 
QTRTNNVSEEIHVRRLFFEMFPILKVSYLLTNRAILEAMEGEKMVHVIDLDASEPAQWLA 

LLQAFNSRPEGPPHLRITGVHHQKEVLEQMAHRLIEEAEKLDIPFQFNPVVSRLDCLNVE 
QLRVKTGEALAVSSVLQLHTFLASDDDLMRKNCALRFQNNPSGVDLQRVLMMSHGSAAEA 

RENDMSNNNGYSPSGDSASSLPLPSSGRTDSFLNAIWGLSPKVMVVTEQDSDHNGSTLME 
RLLESLYTYAALFDCLETKVPRTSQDRIKVEKMLFGEEIKNIISCEGFERRERHEKLEKW 
SQRIDLAGFGNVPLSYYAMLQARRLLQGCGFDGYRIKEESGCAVICWQDRPLYSVSAWRC 
RK* 

>G932 (206..1213) 

CCACGCGTCCGACCACTTGTACCTCTTTGTCTTAAGTACTCTTTAACCCTACAATTTCCT 

AAGCTCTCAAGCCACAAAAAACCACAAACCGTTCTTCACCAATATATATATCTGATCATC 

ATCAAAGTCCTTCTCTCTGCTCATACCACAAACCGTTCCATTCTTCCCCTAATCACAAAG 

TGATATTTACATAGAGAAGATAGAGATGGGAAGACCACCATGCTGTGACAAGATTGGAGT 

GAAGAAAGGACCATGGACACCAGAGGAAGATATCATCTTGGTTTCTTACATCCAAGAACA 

TGGTCCTGGAAACTGGAGATCTGTGCCTACTCACACAGGTTTGAGGAGATGTAGCAAAAG 

CTGTAGATTGAGGTGGACTAATTATCTTCGACCTGGGATCAAGCGTGGAAATTTCACCGA 

GCATGAAGAGAAGATGATTCTCCATCTTCAAGCTCTTTTGGGAAACAGGTGGGCAGCTAT 

AGCATC7^TATCTTCC^G7^GGACAGACAATGATATAAAGAACTATTGGAACACTCATTT 

GAAGAAAAAGCTCAAGAAGATGAATGATTCTTGTGATAGTACTATCAACAATGGCCTTGA 

TAATAAAGACTTCTCCATATCAAACAAAAACACTACCTCACATCAAAGCAGCAACTCCAG 

TAAAGGTCAATGGGAGAGAAGGCTTCAGACAGATATCAACATGGCTAT^CAAGCTCTTTG 

TGATGCCTTGTCTATTGACAAACCACAAAACCCAACTAATTTTTCTATTCCCGATCTTGG 

TTATGGTCCATCAT6TTCTTCGTCCTCTACCACCACCACCACCACCACCACCACCACGAG 

AAACACTAATCCATACCCATCTGGGGTCTATGCTTCAAGTGCTGAGAACATTGCTCGTTT 

GCTTCAGAATTTTATGAAAGACACACCAAAGACCTCGGTGCCCTTGCCGGTTGCAGCCAC 

CGAGATGGCTATCACCACGGCAGCTTCGAGCCCTAGCACT^ACCGAAGGAGACGGAGAAGG 

GATTGACC^TTCTTTGTTCAGCTTCAACTCC^TAGATGAAGCTGAAGAGAAGCCTAAACT 

AATAGACCATGACATT^TGGTCTAATTACACAAGGCTCTCTTTCTTTGTTCGAGAAATG 

GCTCTTTGATGAGCAAAGCCACGATATGATCATCAATAACATGTCACTAGAGGGTCAGGA 

AGTGTTGTTCTAGAAAGCATTAAAGTTTGACGATTTGCTTGAGGAACCACGAGGCTTAGT 

TATAAACAATTTGTATAATTAAGTACTCTTTAGT^ 

TATTGCAGTAATTAGGGATTTTAGTCTTTAGTAGTAACTC 
CTCTATCTTTTTAGTAGTAACTCTTTATTTTTTCCTTAAATCTTTGTCGACGTGGAGATG 
ATATCTTCTATGTAGTAGAAACTCAAAAGTGTACATCATCTTTATTAATGTAACGTCTTT 
TTAAAAAAAAAAAAAAAAA 

>G932 Amino Acid Sequence (domain in AA coordinates: 12-118) 
MGRPPCCDKIGVKKGPWTPEEDIILVSYIQEHGPGNWRSVPTHTGLRRCSKSCRLRWTNY 
LRPGIKRGNFTEHEEKMILHLQALLGNRWAAIASYLPERTDNDIKNYWNTHLKKKLKKMN 
DSCDSTINNGLDNKDFSISNKNTTSHQSSNSSKGQWERRLQTDINMAKQALCDALSIDKP 
QNPTNFSIPDLGYGPSSSSSSTTTTTTTTTTRNTNPYPSGVYASSAENIARLLQNFMKDT 
PKTSVPLPVAATEMAITTAASSPSTTEGDGEGIDHSLFSFNSIDEAEEKPKLIDHDINGL 
ITQGSLSLFEKWLFDEQSHDMIINNMSLEGQEVLF* 
>G599 (152..1579) 

TCGACAGAACAGCTTCGTTGTCACTTGTCATTCTATAAATCGCATCCCCATTGACAACCT 
TTCACTTCCATCAAAACTCTCTCTCTATATCTCTCTCTCTATATATCTCTCTCTATATCT 
CTCTCTCTCTTCACTCTCTCTTTCTTTCAAAATGGAAAAACTCATGGTTCCGACATGGAG 
ACCCGACCCGGTTTACCGTCCACCGGAAACACCACTCGAACCGATGGAGTTTTTAGCTCG 
TTCATGGAGCGTCTCTGCTCTCGAAGTCTCCAAGGCTCTAACACCACCCAACCCTCAGAT 
TCTCCTCTCCAAAACCGAAGAAGAAGAAGAAGAAGAACCCATCTCCTCTGTCGTAGACGG 
CGACGGCGACACGGAAGACACCGGACTTGTCACCGGAAACCCATTCTCCTTCGCTTGTTC 
AGAAACTTCTCAAATGGTCATGGATCGTATCTTGTCTCACTCTCAAGAAGTATCACCAAG 
AACATCTGGTCGGCTATCTCACAGTAGTGGTCCACTTAATGGTTCTTTGACCGACAGTCC 
TCCTGTGTCTCCTCCCGAATCCGACGACATTAAGCAATTTTGCAGAGCGAACAAAAATTC 
ATTGAACAGTGTAAATTCTCAGTTCCGTTCAACGGCGGCAACTCCGGGACCTATAACCGC 
TACAGCTACACAGTCCAAGACGGTGGGACGGTGGCTTAAGGACCGGAGAGAGAAAAAGAA 
AGAGGAGACTCGGGCTCATAACGCTCAGATTCACGCTGCTGTCTCTGTCGCCGGCGTTGC 
TGCAGCTGTTGCTGCTATTGCAGCAGCCACCGCTGCGTCTTCTAGCTGTGGTAAGGATGA 
GCAGATGGCTAAAACTGACATGGCCGTTGCTTCTGCTGCGACCCTTGTGGCTGCTCAGTG 
TGTGGAAGCTGCTGAAGTTATGGGAGCTGAGAGAGAGTATTTGGCTTCTGTTGTTAGCTC 
CGCCGTCAATGTTCGTTCTGCCGGAGATATTATGACTCTCACCGCCGGAGCAGCTACAGC 
TTTAAGAGGAGTGCAAACATTGAAGGCAAGGGCAATGAAGGAAGTGTGGAACATAGCATC 
AGTGATACCAATGGATAAAGGACTCACTTCTACAGGAGGAAGCAGCAATAATGTTAATGG 
TAGCAATGGAAGCTCAAGCAGTAGTCACAGTGGTGAACTTGTACAACAGGAGAATTTCTT 
GGGAACTTGTAGTAGAGAATGGCTCGCTAGAGGTTGTGAACTCCTCAAACGCACTCGCAA 
AGGTGATCTCCACTGGAAGATAGTATCTGTTTACATCAACAAAATGAATCAGGTTATGTT 
GAAGATGAAGAGCAGGCATGTTGGAGGAACCTTCACCAAGAAGAAAAAGAACATTGTGCT 
TGATGTGATCAAGAATGTCCCGGCCTGGCCTGGACGACATTTGCTAGAGGGAGGAGATGA 
TCTAAGATACTTCGGTTTGAAGACGGTTATGCGAGGTGATGTTGAATTCGAGGTCAAGAG 
CCAAAGGGAATATGAAATGTGGACACAAGGTGTCTCAAGGCTTCTTGTTCTTGCTGCTGA 
GAGGAAGTTTAGGATGTGAATAAACGTTCAATGGCTGCTTGGTTTAAGTGTGAGTTTTTT 
TTTAACTTATGTGGTCAAATTTCATTAGTAGGGGra 

TTGGGTATAGGATAAAATGGACOTACCAGTCAAGGTGAGGAAGCATTTGGGTAAACAAAA 
CTTAGTGGGGGTGATCTGTAATATCTATGTTCTTAGTTTTTTTTTGGTTGTTGGTGGTCT 
TTTTGTATAAAAAAACAAAGTTGAAGTAATAGATATATAGT^ 

>G599 Amino Acid Sequence (domain in AA coordinates: 187-219, 264-300) 

MEKLMVPTWRPDPVYRPPETPLEPMEFLARSWSVSALEVSKALTPPNPQILLSKTEEEEE 

EEPISSVVDGDGDTEDTGLVTGNPFSFACSETSQMVMDRILSHSQEVSPRTSGRLSHSSG 

PLNGSLTDSPPVSPPESDDIKQFCRANKNSLNSVNSQFRSTAATPGPITATATQSKTVGR 

WLKDRREKKKEETRAHNAQIHAAVSVAGVAAAVAAIAAATAASSSCGKDEQMAKTDMAVA 

SAATLVAAQCVEAAEVMGAEREYLASVVSSAVNVRSAGDIMTLTAGAATALRGVQTLKAR 

AMKEVWNIASVIPMDKGLTSTGGSSNNVNGSNGSSSSSHSGELVQQENFLGTCSREWLAR 

GCELLKRTRKGDLHWKIVSVYINKMNQVMLKMKSRHVGGTFTKKKKNIVLDVIKNVPAWP 

GRHLLEGGDDLRYFGLKTVMRGDVEFEVKSQREYEMWTQGVSRLLVLAAERKFRM* 
>G804 (114..1139) 

ATACTCCAAGAATTTATAGGTTATAAGTAAAAATTCAGTA 
TTCCATTTTOTTGTGTGTTTTTTTCCCCATM 

CCCACAACAACAACCAGAGCAACAACAACACCACTGGTTCGGCCCATCTGGTCCCATCCA 
TGGGACCAATCTCCGGTTCAGTCT(^TTAACCACCACTGCTCGAAACTCCACTACCACCA 
CCGTCACCGCCGCTAAAACACCCGCAAAACGACCGTCCAAGGACCGTCACATCAAAGTAG 
ACGGACGTGGCCGGAGGATACGTATGCCGGCTATCTGCGCAGCACGTGTCTTCCAACTAA 

CACGTGAGTTACAACACAAATCGGACGGCGAGACTATAGAGTGGCTGCTCCAACAAGCGG 

AGCCAGCTATCATCGCAGCCACCGGAACTGGAACCATACCGGCGAATATCTCTACTTTGA 

ACATCTCTCTTCGAAGCAGTGGCTCTACTCTTTCAGCTCCACTGTCTAAATCTTTCCACA 

TGGGAAGAGCGGCTCAAAACGCTGGCGTTTTTGGGTTCCAGCAACAGCTTTATCATCCTC 

ATCATATCACGACAGATTCTTCTTCTTCTTCTCTTCCCAAAACATTCCGTGAAGAAGATC 

TTTTTAAAGATCCTAATTTTCTAGATCAAGAACCCGGTTCAAGATCACCTAAACCGGGAT 

CCGAAGCTCCTGATCAAGATCCGGGTTCGACCCGGTCAAGAACACAAAATATGATACCGC 

CGATGTGGGCACTAGCGCCAACGCCAGCCTCCACAAACGGAGGTAGTGCTTTTTGGATGT 

TACCAGTCGGAGGAGGAGGAGGTCCGGCTAACGTTCAGGATCCATCACAGCACATGTGGG 

CGTTTAATCCGGGTCATTACCCGGGTCGAATCGGGTCGGTTCAGCTAGGGTCTATGTTAG 

TGGGAGGTCAACAGTTAGGGTTAGGTGTTGCAGAAAATAACAATTTGGGGCTATTTTCCG 

GCGGAGGAGGAGACGGTGGTCGGGTTGGTCTCGGAATGAGTCTTGAGCAAAAGCCTCAAC 

ATCAAGTGAGTGATCATGCTACTAGAGACCAAAATCCTACTATAGATGGTTCTCCTTGAA 

AGACTTCATGATTTCTTTGGTTTTTAAAAAGTGTGAATGTGTGATTTATTGCAACTTTTG 

TTGAGGACTCCAATGTTAATATGGGTTTTAGGGTTGGCTTTTCGGGATTGCCAAATTGTT 

ATT 

>G804 Amino Acid Sequence (domain in AA coordinates: 54-117) 
MESHNNNQSlT^rNTTGSAHLVPSMGPIS 

KVDGRGRRIRMPAICAARVFQLTRELQHKSDGETIEWLLQQAEPAIIAATGTGTIPANIS 

TLNISLRSSGSTLSAPLSKSFHMGRAAQNAAVFGFQQQLYHPHHITTDSSSSSLPKTFRE 

EDLFKDPNFLDQEPGSRSPKPGSEAPDQDPGSTRSRTQNMIPPMWALAPTPASTNGGSAF 

WMLPVGGGGGPANVQDPSQHMWAFNPGHYPGRIGSVQLGSMLVGGQQLGLGVAENNNLGL 

FSGGGGDGGRVGLGMSLEQKPQHQVSDHATRDQNPTIDGSP* 

>G1062 (297..1781) 

CAAAAAAAAAGTTTCAATTTTTGAAAGCTCTGAGAAATGAAATCTA^ 
TATCTCTATCTTCCTTTTCAGATTTCGCTTCTTCAATTCATGAAATCCtCGTGATTCTAC 
TTTAATGCTTCTCTTTTTTTACTTTTCCAAGTCTCTGAATATTCAAAGTATATATCTTTT 
' GTTTTCAAACTTTTGCAGAATTGTCTTC^GCTTCCAAATTTCAGTTAAAGGTCTCAACT 
TTGCAGAATTTTCCTCTAAAGGTTCAGACTTTGGGGTAAAGGTGTCAACTTTGGCGATGG 
GTCTTGACGGAAACAATGGTGGAGGGGTTTGGTTAAACGGTGGTGGTGGAGAAAGGGAAG 
AGAACGAGGAAGGTTCATGGGGAAGGAATCAAGAAGATGGTTCTTCTCAGTTTAAGCCTA 
TGCTTGAAGGTGATTGGTTTAGTAGTAACCAACCACATCCACAAGATCTTCAGATGTTAC 
AGAATCAGCCAGATTTCAGATACTTTGGTGGTTTTCCTTTTAACCCTAATGATAATCTTC 
TTCTTCAACACTCTATTGATTCTTCTTCTTCTTGTTCTCCTTCTCAAGCTTTTAGTCTTG 
ACCCTTCTCAGCAAAATCAGTTCTTGTCAACTAACAACAACAAGGGTTGTCTTCTCAATG 
TTCCTTCTTCTGCAAACCCTTTTGATAATGCTTTTGAGTTTGGCTCTGAATCTGGTTTTC 
TTAACCAAATC(^TGCTCCTATTTCGATGGGGTTTGGTTCTTTGACACAATTGGGGAACA 
GGGATTTGAGTTCTGTTCCTGATTTCITC^ 

ACAACAACAACACAATGTTGTGTGGTGGTTTCACAGCTCCGTTGGAGTTGGAAGGTTTTG 

GTAGTCCTGCTAATGGTGGTTTTGTTGGGAACAGAGCGAAAGTTCTGAAGCCTTTAGAGG 

TGTTAGCATCGTCTGGTGCACAGCCTACTCTGTTCCAGAAACGTGCAGCTATGCGTCAGA 

GCTCTGGAAGCAAAATGGGAAATTCGGAGAGTTCGGGAATGAGGAGGTTTAGTGATGATG 

GAGATATGGATGAGACTGGGATTGAGGTTTCTGGGTTGAACTATGAGTCTGATGAGATAA 

ATGAGAGCGGTAAAGCGGCTGAGAGTGTTCAGATTGGAGGAGGAGGAAAGGGTAAGAAGA 

AAGGTATGCCTGCTAAGAATCTGATGGCTGAGAGGAGAAGGAGGAAGAAGCTTAATGATA 

GGCTTTATATGCTTAGATCAGTTGTCCCCAAGATCAGCAAAATGGATAGAGCATCAATA 

TTGGAGATGCAATTCATTATCTGAAGGAACTTCTACAAAGGATCAATGATCTTCACAATG 

AACTTGAGTCAACTCCTCCTGGATCTTTGCCTCC^CTTCATCAAGCTTCCATCCGTTGA 

CACCTACACCGCAAAGTCTTTCTTGTCGTGTCAAGGAAGAGTTGTGTCCCTCT^ 

CAAGTCCTAAAGGCCAGCAAGCTAGAGTTGAGGTTAGATTAAGGGAAGGAAGAGCAGTGA 

ACATTCATATGTTCTGTGGTCGTAGACCGGGTCTGTTGCTCGCTACCATGAAAGCTTTGG 

ATAATCTTGGATTGGATGTTCAGCAAGCTGTGATCAGCTGTTTTAATGGGTTTGCCTTGG 

ATGTTTTCCGCGCTGAGCAATGCCAAGAAGGACAAGAGATACTGCCTGATCAAATCAAAG 

CAGTGCTTTTCGATACAGCAGGGTATGCTGGTATGATCTGATCTGATCCTGACTTCGAGT 

CCATTAAGCATCTGTTGAAGCAGAGCTAGAAGAACTAAGTCCCTTTAAATCTGCAATTTT 

CTTCTCAACTTTTTTTCTTATGTCATAACTTCAATCTAAGCATGTAATGCAATTGCAAAT 
GAGAGTTGTTTTTAAATTAAGCTTTTGAGAACTTGAGGTTGTTGTTGTTGGATACATAAC 

TTCAACCTTTTATTAGCAATGTTAACTTCCATTTATGTTTCATCTT 

>G1062 Amino Acid Sequence (domain in AA coordinates: 308-359) 

MGLDGNNGGGVWLNGGGGEREENEEGSWGRNQEDGSSQFKPMLEGDWFSSNQPHPQDLQM 

LQNQPDFRYFGGFPFNPNDNLLLQHSIDSSSSCSPSQAFSLDPSQQNQFLSTNNNKGCLL 

NVPSSANPFDNAFEFGSESGFLNQIHAPISMGFGSLTQLGNRDLSSVPDFLSARSLLAPE 

SNttJIWTMLCGGFTAPLELEGFGSPANGGFTC 

QSSGSKMGNSESSGMRRFSDDGDMDETGIEVSGLNYESDEINESGKAAESVQIGGGGKGK 

KKGMPAKNLMAERRRRKKLNDRLYMLRSVVPKISKMDRASILGDAIDYLKELLQRINDLH 

NELESTPPGSLPPTSSSFHPLTPTPQTLSCRVKEELCPSSLPSPKGQQARVEVRLREGRA 

VNIHMFCGRRPGLLLATMKALDNLGLDVQQAVISCFNGFALDVFRAEQCQEGQEILPDQI 
KAVLFDTAGYAGMI* 

>G1322 (213..833) 

AAAGTTATTGATAGTTTCTGTTACTTATTAATTTTTAAGGTTATGTGTATTATTACCAAT 

TGGAGGACTATATAGTCGCAAGTCTCAACCCTATAAAAGAAAACATTCGTCGATCATCTT 

CCCGCCTCGAGTATCTCTCTCTCTCTCTCTCTTCTCTGTTTTCTTTATTGATTGCATAGA 

CAAAAATACACACATACACAACAGAAAGAAAGATGGAGACGACGATGAAGAAGAAAGGGA 

GAGTGAAAGCGACAATAACGTCACAGAAAGAAGAAGAAGGAACAGTGAGAAAAGGACCTT 

GGACTATGGAAGAAGATTTCATCCTCTTTAATTACATCCTTAATCATGGTGAAGGTCTTT 

GGAACTCTGTCGCCAAAGCCTCTGGTCTAAAACGTACTGGAAAAAGTTGTCGGCTCCGGT 

GGCTGAACTATCTCCGACCAGATGTGCGGCGAGGGAACATAACCGAAGAAGAACAGCTTT 

TGATCATTCAGCTTCATGCTAAGCTTGGAAACAGGTGGTCGAAGATTGCGAAGCATCTTC 

CGGGAAGAACGGACAACGAGATAAAGAACTTCTGGAGGACAAAGATTCAGAGACACATGA 

AAGTGTCATCGGAAAATATGATGAATCATCAACATCATTGTTCGGGAAACTCACAGAGCT 

CGGGGATGACGACGCAAGGCAGCTCCGGCAAAGCCATAGACACGGCTGAGAGCTTCTCTC 

AGGCGAAGACGACGACGTTTAATGTGGTGGAACAACAGTCAAACGAGAATTACTGGAACG 

TTGAAGATCTGTGGCCCGTCCACTTGCTTAATGGTGACCACCATGTGATTTAAGATATAT 

ATATAGACCTCCTATACATTTATATGCCCCAGCTGGGTTTTTTTGTATGGTACGTTATTT 

GGTTTTTCTATTGCTGAAATGTCGTTGCATTTAATTTACATACGAAAAGTGC^TTAAATC 

ATTAAATCTTCAATACATATGGAGGTGGTGTTTGAGTAAAAAAAAAAAAAA 

>G1322 Amino Acid Sequence (domain in AA coordinates: 26-130) 

METTMKKKGRVKATITSQKSEEGTVRKGPWTMEEDFILFNYILNHGEGLWNSVAKASGLK 

RTGKSCRLRWLNYLRPDVRRGNITEEEQLLIIQLHAKLGNRWSKIAKHLPGRTDNEIKNF 

WRTKIQRHMKVSSENMMNHQHHCSGNSQSSGMTTQGSSGKAIDTAESFSQAKTTTFNVVE 
QQSNENYWNVEDLWPVHLLNGDHHVI* 
>G1331 (1..786) 

ATGGTGGAAGAAGTTTGGAGAAAGGGTCCATGGACCGCCGAAGAAGACCGTCTTTTGATC 

GAATACGTCCGTGTTCACGGTGAAGGTCGTTGGAACTCTGTCTCTAAACTCGCAGGATTG 

AAAAGGAATGGCAAAAGCTGCAGACTAAGATGGGTGAATTACCTTAGACCAGACCTCAAG 

AGAGGACAGATCACTCCACATGAAGAAAGTATAATACTTGAGCTACACGCTAAGTGGGGA 

AATAGGTGGTCAACAATTGCACGTAGTTTACCAGGAAGAACAGACAATGAGATCAAGAAC 

TATTGGAGAACCCATTTCAAGAAAAAGGCAAAGCCTACGACTAACAATGCGGAGAAGATA 

AAGAGTCGTCTCCTAAAAAGGCAACACTTCAAGGAACAGAGAGAAATAGAGTTGCAACAA 

GAACAGCAGTTGTTTCAGTTCGACCAACTCGGTATGAAAAAGATCATCTCTTTACTCG^ 

GAAAACAATAGCAGTAGCAGTAGCGATGGCGGTGGTGATGTGTTCTATTATCCTGATCAA 

ATAACA(^TTCATCAAAACCCTTTGGCTATAACTCTAATTCATTAGAGGAGCAGTTAC^ 

GGTAGATTTTCTCCTGTAAACATACCTCATGCTAATACTATGAACGAAGACAATGCCATA 

TGGGACGGGTTTTGGAACATGGATGTTGTAAATGGACATGGTGGGAACTTGGGTGTTGTG 

GCTGCTACTGCTGCTTGTGGCCCAAGGAAGCCCTATTTCCATAACTTGGTGATTCCATTT 
TGTTAA 

>G1331 Amino Acid Sequence (conserved domain in AA coordinates: 8-109) 

MVEEVWRKGPWTAEEDRLLIEYVRVHGEGRWNSVSKLAGLKRNGKSCRLRWVNYLRPDLK 

RGQITPHEESIILELHAKWGNRWSTIARSLPGRTDNEIKNYWRTHFKKKAKPTTNNAEKI 

KSRLLKRQHFKEQREIELQQEQQLFQFDQLGMKKIISLLEENNSSSSSDGGGDVFYYPDQ 
ITHSSKPFGYNSNSLEEQLQGRFSPVNIPDANTMNEDNAIWDGFWNMDVVNGHGGNLGVV 
AATAACGPRKPYFHNLVIPFC* 
>G1521 (1..891) 
ATGCCTCCATTACCGTCCTCCACGGCGCCTTCGTCTTCGAGACATCTTCGATCGCCGGAA 

AGTATCGCGAAATTTGCAGGGAGAGCAATATTTCCTGCTTTACAGGGGAAATCGTGTCCG 

ATATGCCTCGAAAATCTAACCGAGCGAAGATCCGCCGCCGTGATCACGGTGTGCAAGCAC 

GGATACTGCCTTGCTTGTATTCGGAAGTGGAGCAGCTTCAAGAGGAATTGTCCTCTTTGT 

AACACTCGTTTTGATTCCTGGTTTATCGTTAGTGATTTTGCTTCTAGAAAATACCATAAG 

GAGCAATTACCAATTCTTCGTGATCGTGAGACTTTAACTTATCATCGGAATAATCCTTCC 

GATCGCCGGAGGATAATTCAAAGGTCGAGGGATGTTTTGGAAAACTCTAGCTCAAGATCA 

AGGCCATTGCCATGGCGGAGATCATTTGGACGACCAGGTTCAGTTCCTGATTCTGTTATC 

TTCCAGCGAAAGCTTCAGTGGCGAGCTAGCATATACACTAAGCAATTACGAGCTGTTCGA 

TTACATTCAAGGCGCTTGGAACTAAGTTTGGCGGTGAATGATTACACCAAAGCAAAGATA

ACTGAAAG7VATTGAGCCATGGATTAGAAGAGAGCTTCAGGCAGTCCTTGGAGATCCTGAT 

CCCTCAGTTATTGTTCATTTTGCGTCAGCTCTTTTCATCAAAAGGCTTGAGAGAGAGAAT 

AATCGACAAACCGGGCAGACCGGGATGTTGGTGGAAGATGAAGTCTCCTCTCTTCGAAAA 

TTCTTGTCTGATAAGGTGGATATATTTTGGCATGAACTAAGATGTTTTGCGGAGAGTATA

CTCACGATGGAGACTTATGATGCAGTGGTTGAATACAATGAGGTGGAGTAA 

>G1521 Amino Acid Sequence (domain in AA coordinates: 39-80) 

MPPLPSSTAPSSSRHLRSPESIAKFAGRAIFPALQGKSCPICLENLTERRSAAVITVCKH

GYCLACIRKWSSFKRNCPLCNTRFDSWFIVSDFASRKYHKEQLPILRDRETLTYHRNNPS 

DRRRIIQRSRDVLENSSSRSRPLPWRRSFGRPGSVPDSVIFQRKLQWRASIYTKQLRAVR 

LHSRRLELSLAVNDYTKAKITERIEPWIRRELQAVLGDPDPSVIVHFASALFIKRLEREN 

NRQTGQTGMLVEDEVSSLRKFLSDKVDIFWHELRCFAESILTMETYDAVVEYNEVE*

>G183 (1..1458) 

ATGAGTGATTTTGATGAAAACTTCATCGAAATGACGTCGTATTGGGCTCCACCATCCAGT 
CCTAGCCCAAGAACGATATTGGCAATGCTGGAGCAAACCGACAATGGTCTGAATCCAATC 
AGTGAGATCTTCCCTCAAGAAAGCTTGCCAAGAGATCATACTGATCAATCTGGACAAAGA 
TCTGGTCTTCGTGAGAGACTGGCTGCAAGAGTAGGATTCAATCTTCCAACACTCAATACA 
GAAGAAAACATGAGTCCTTTGGATGCATTTTTCAGGAGCTCGAATGTTCCTAATTCTCCT 
GTCGTTGCAATCTCTCCAGGATTCAGTCCATCAGCACTATTGCATACTCCCAATATGGTC 
AGTGATTCTTCCCAGATTATCCCTCCGTCTTCAGCCACCAATTACGGACCTCTAGAGATG 
GTGGAAACTTCCGGTGAAGACAATGCAGCGATGATGATGTTCAACAACGATCTTCCTTAT 
CAGCCGTACAATGTTGATCTGCCTTCTCTAGAAGTCTTTGATGATATTGCAACGGAAGAG 
TCCTTTTATATCCCATCTTATGAACCTCATGTTGACCCAATTGGAACTCCTTTAGTCACA 
TCCTTTGAATCTGAACTCGTTGACGATGCCCATACCGACATCATCTCCATTGAGGACAGT 
GAGAGCGAGGATGGAAACAAAGATGATGACGACGAGGACTTCCAATACGAAGACGAAGAC 
GAAGACCAATACGACCAAGATCAAGATGTAGATGAAGATGAAGAGGAAGAAAAAGATGAA 
GACAATGTTGCATTAGATGATCCTCAACCTCCACCTCCAAAGAGAAGGAGATATGAGGTA 
TCAAACATGATTGGAGCCACAAGAACAAGCAAGACACAAAGGATCATACTTCAGATGGAA 
AGCGACGAAGACAATCCTAACGATGGTTATCGCTGGAGAAAATACGGTCAGAAAGTCGTC 
AAAGGAAATCCTAATCCGAGGAGTTACTTCAAGTGCACAAACATCGAGTGCAGAGTGAAA 
AAACATGTGGAGAGAGGAGCAGACAATATCAAGTTGGTTGTGACTACATACGATGGGATA 
CACAACCATCCTTCACCACCTGCACGTAGAAGCAATTCCAGTTCAAGGAACCGGTCTGCA 
GGGGCAACAATACCTCAAAATCAGAATGATCGAACCAGTCGGTTAGGTAGGGCTCCTCCT 
ACTCCTACTCCTCCTACTCCTCCTCCTTCGTCTTACACACCTGAGGAGATGAGGCCTTTC 
TCTTCGTTGGCTACAGAAATTGATCTGACAGAGGTTTATATGACCGGAATCTCTATGCTG 
CCGAATATACCGGTTTACGAGAATTCGGGTTTTATGTACCAGAATGATGAACCGACGATG 
AATGCGATGCCGGATGGTTCAGATGTGTACGATGGGATCATGGAACGCCTGTATTTTAAG 
TTTGGTGTCGACATGTAG 

>G183 Amino Acid Sequence (domain in AA coordinates: TBD)

MSDFDENFIEMTSYWAPPSSPSPRTILAMLEQTDNGLNPISEIFPQESLPRDHTDQSGQR 

SGLRERLAARVGFNLPTLNTEENMSPLDAFFRSSNVPNSPVVAISPGFSPSALLHTPNMV

SDSSQIIPPSSATNYGPLEMVETSGEDNAAMMMFNNDLPYQPYNVDLPSLEVFDDIATEE

SFYIPSYEPHVDPIGTPLVTSFESELVDDAHTDIISIEDSESEDGNKDDDDEDFQYEDED

EDQYDQDQDVDEDEEEEKDEDNVALDDPQPPPPKRRRYEVSNMIGATRTSKTQRIILQME

SDEDNPNDGYRWRKYGQKVVKGNPNPRSYFKCTNIECRVKKHVERGADNIKLVVTTYDGI

HNHPSPPARRSNSSSRNRSAGATIPQNQNDRTSRLGRAPPTPTPPTPPPSSYTPEEMRPF 

SSLATEIDLTEVYMTGISMLPNIPVYENSGFMYQNDEPTMNAMPDGSDVYDGIMERLYFK 

FGVDM* 
>G2555 (177..956)

CTGTTTTTGTATCCGTGTAAATTAATCACACGGTAGTTTTTGATGAAAAGACAACAATCG 
GAGAACAATCTGGTCTGCTGCTAAAATTTAATAAATTGTTTTGTCTAATTGTCTCCACCC 
ATAAAAAAGCGCGAATTCAATTCACCGACTAAAGACATTCTCCGGTGGAGACCCCGATGC 
AATCCACTCATATAAGCGGCGGAAGTAGCGGTGGTGGTGGTGGAGGAGGAGGAGAGGTGA 
GTCGAAGTGGATTATCTCGGATCCGTTCAGCTCCAGCTACTTGGATTGAAACCCTACTCG 
AAGAAGATGAAGAAGAAGGTTTAAAACCTAACCTTTGTTTAACAGAGCTGCTTACTGGTA 
ATAATAACTCTGGAGGAGTGATAACGAGTCGTGACGACTCGTTCGAGTTCCTGAGTTCTG 
TTGAGCAAGGATTGTATAATCATCATCAAGGTGGTGGCTTTCACCGTCAGAATAGTTCTC 
CGGCTGATTTTCTTAGTGGGTCTGGTTCTGGGACTGATGGGTATTTCTCTAATTTTGGTA 
TTCCGGCGAATTATGACTATTTGTCGACCAACGTTGATATTTCTCCGACTAAACGGTCTA 
GAGATATGGAAACACAGTTTTCTTCTCAGCTGAAAGAAGAGCAAATGAGTGGTGGGATAT 
CAGGAATGATGGATATGAACATGGACAAGATTTTTGAGGATTCAGTTCCTTGTAGGGTTC 
GTGCTAAACGTGGTTGTGCTACTCATCCTCGTAGCATTGCTGAACGGGTGAGAAGAACGC 
GAATAAGTGATCGGATTAGGAGGCTGCAAGAGCTTGTTCCTAACATGGATAAGCAAACCA 
ACACTGCAGACATGTTGGAAGAAGCTGTGGAGTATGTGAAGGCTCTTCAAAGCCAGATCC 
AGGAATTGACAGAGCAGCAGAAGAGATGCAAATGCAAACCTAAAGAAGAACAATAATGTA 
TCCTTTAGGATTTGATATATCTGTATTTTATTTTTGTACTATCTAAAAATGGTGATGATC 
TGTTCGAAAATTCGAAACATGATCTTATATATTGAACTAGAAAAAATAGATATATATGAA 
TTTTAGCTGTAAAATTTTTGTACAATAAGGAGAAAAAGATTTAGAAGAGTCAATAAAAAG 
ATGATGTTTACAAGTCAAAAAAAAAAA 

>G2555 Amino Acid Sequence (domain in AA coordinates: 175-245) 
MQSTHISGGSSGGGGGGGGEVSRSGLSRIRSAPATWIETLLEEDEEEGLKPNLCLTELLT 
GNNNSGGVITSRDDSFEFLSSVEQGLYNHHQGGGFHRQNSSPADFLSGSGSGTDGYFSNF
GIPANYDYLSTNVDISPTKRSRDMETQFSSQLKEEQMSGGISGMMDMNMDKIFEDSVPCR

VRAKRGCATHPRSIAERVRRTRISDRIRRLQELVPNMDKQTNTADMLEEAVEYVKALQSQ

IQELTEQQKRCKCKPKEEQ*

>G375 (53..1171)

TCGACAAAAACTCrCAOTCTCCCTCAAACTAAACAAACATACAGAACACAAAATGGGTCT 
CACTTCTCTTCAAGTTTGCATGGATTCTGATTGGCTCCAGGAATCCGAGTCATCAGGAGG 
AAGCATGTTAGACTCTTCAACGAATTCTCCGTCAGCAGCCGACATACTAGCAGCTTGCAG 
CACTAGACCACAAGCCTCGGCCGTGGCTGTAGCCGCTGCAGCTCTGATGGACGGTGGAAG 
GAGGCTGCGTCCACCTCACGACCATCCTCAAAAGTGTCCTCGTTGCGAGTCAACACATAC 
TAAGTTCTGTTACTACAATAACTACAGCCTCTCTCAGCCTCGTTACTTCTGCAAGACTTG 
TCGCCGTTACTGGACAAAAGGCGGAACTCTAAGGAATATTCCGGTTGGTGGTGGATGCCG 
TAAAAACAAGAAACCATCTTCCTCTAATTCCTCCTCCTCCACTTCTTCCGGCAAAAAACC 
ATCCAACATCGTTACCGCCAATACCTCTGATCTTATGGCTTTAGCACATTCTCATCAAAA 
TTACCAACATTCTCCTCTAGGGTTTTCACATTTTGGTGGGATGATGGGGTCTTACTCAAC 
TCCGGAGCATGGTAACGTTGGTTTCTTGGAGAGCAAGTATGGCGGTTTGCTTTCGCAGAG 
CCCTAGACCTATTGATTTCTTGGACAGTAAGTTTGATCTCATGGGAGTGAACAATGACAA 
CCTGGTCATGGTTAATCATGGAAGTAACGGAGATCATCATCATCATCATAATCATCACAT 
GGGTCTGAATCACGGTGTAGGTCTTAACAACAACAACAACAATGGTGGATTTAATGGGAT 
TTCTACGGGAGGCAATGGAAATGGTGGTGGTCTCATGGATATATCGACATGCCAAAGACT 
TATGCTATCTAATTATGATCATCACCATTACAATCATCAAGAAGATCATCA7UVGGGTAGC 
AACAATAATGGATGTGAAGCCAAATCCGAAGTTGTTATCGCTTGATTGGCAGCAAGATCA
ATGCTACTCCAATGGTGGTGGTAGCGGAGGCGCAGGAAAATCCGACGGTGGTGGATACGG 
CAATGGTGGTTATATCAACGGTTTAGGTTCGTCGTGGAATGGTTTGATGAATGGCTATGG
AACGTCCACTAAAACAAACTCCTTGGTTTGATAAGTTAATCAGAACTTCTTTTTTCTTGT
CGTCATCAAOTAGTAGTAGTAGTAATAGTAGTTGGAGACTAGAGAAGCACTTCAAATTAT 
TTATGGGTTTGTTTGCTAAGCCAGTTTTAC 

>G375 Amino Acid Sequence (domain in AA coordinates: 75-103) 
MGLTSLQVCMDSDWLQESESSGGSMLDSSTNSPSAADILAACSTRPQASAVAVAAAALMD
GGRRLRPPHDHPQKCPRCESTHTKFCYYNNYSLSQPRYFCKTCRRYWTKGGTLRNIPVGG
GCRKNKKPSSSNSSSSTSSGKKPSNIVTANTSDLMALAHSHQNYQHSPLGFSHFGGMMGS
YSTPEHGNVGFLESKYGGLLSQSPRPIDFLDSKFDLMGVNNDNLVMVNHGSNGDHHHHHN
HHMGLNHGVGLNNNNNNGGFNGISTGGNGNGGGLMDISTCQRLMLSNYDHHHYNHQEDHQ
RVATIMDVKPNPKLLSLDWQQDQCYSNGGGSGGAGKSDGGGYGNGGYINGLGSSWNGLMN
GYGTSTKTNSLV*
>G1007 (86..763)

ATTCCTTCTTGCCTAGGAACTAATTGTTGCACACTTCGGTACACAATTTTTTGAGCACTT 
CGACATCAAAACGAGAGAGAAAAGAATGGTGGATTCTCATGGCTCCGACACGGAATGTTC 
CTCCAAGAAGAAAAAGGAGAAAACGAAAGAAAAGGGGGTATATCGTGGGGCTCGCATGAG 
GAGCTGGGGGAAATGGGTCTCGGAGATTCGGGAGCCCCGTAAGAAATCAAGAATCTGGCT 
CGGGACTTTCCCCACGGCGGAGATGGCAGCGCGTGCCCATGATGTTGCGGCATTGAGTAT 
CAAAGGAAGTTCCGCAATCCTTAACTTCCCTGAGCTCGCGGATTTTCTGCCAAGACCAGT 
CTCGCTCAGCCAACAGGATATCCAGGCCGCAGCCGCCGAAGCCGCTCTTATGGATTTCAA 
AACTGTACCATTCCATCTTCAGGATGACTCAACGCCGTTGCAAACTAGGTGTGATACTGA 
GAAGATCGAAAAGTGGTCATCCTCATCGTCCTCAGCCTCATCCTCATCCTCATCTTCGTC 
CTCGTCCTCATCATCTATGCTTTCGGGGGAGCTAGGAGATATTGTGGAGTTGCCGAGTCT 
TGAAAACAATGTAAAATACGATTGTGCGCTGTATGACTCGTTGGAGGGGCTGGTGTCGAT 
GCCCCCATGGTTAGATGCTACCGAAAATGATTTTAGGTATGGAGATGATTCGGTACTGTT 
GGACCCATGTCTCAAAGAAAGCTTTTTGTGGAATTATGAGTAAGGTTTTTTTTTGGAAAG 
AAATGTGGTTTTTTGTTTCCTCCTCTCTTTTATACTTTCGATCTTTTTTTCTT^AGCATAT 
ATATCTTCTACATATGTAATACTTTTCCATTAGTAAACAATGATTCGGTTTCGGGTACAA 
AAAAAAAAAAAAAAAAAAAAAAA 

>G1007 Amino Acid Sequence (domain in AA coordinates: TBD) 

MVDSHGSDTECSSKKKKEKTKEKGVYRGARMRSWGKWVSEIREPRKKSRIWLGTFPTAEM

AARAHDVAALSIKGSSAILNFPELADFLPRPVSLSQQDIQAAAAEAALMDFKTVPFHLQD

DSTPLQTRCDTEKIEKWSSSSSSASSSSSSSSSSSSSMLSGELGDIVELPSLENNVKYDC 

ALYDSLEGLVSMPPWLDATENDFRYGDDSVLLDPCLKESFLWNYE* 

>G1010 (344..1276)

ATTCTTCTTCTAAAAAATCTTGACAACTTTTTGTTTTTGTTTTCTTTCTCTGAATTTTTT 

AAAAGAGAGAGAGCTATGTAGCTATGAAACAGTAAGAGATATAGATATAGAGAGACAGAG 

AAAGATGATGATCAGTGAAGTTAGGCTAAACCCACTTTCTATTTATGTATAATTAGGTCA 

ATCACATCACCAATCTCCTCCTCCAATTCTCCTCCTCTCCTTCCAAATTCTAGGGTTTTG 

CTTGTATCTCACCCCCTTTCTCAATTCCCTAGGGAAACTGTGAATTTCATCAAATTCCAT

TATTTTTTGGTCACACCCTTAAAGAGATCTGAGAGTTCTAAAGATGATGACAGATTTATC 

TCTCACGAGAGATGAAGATGAAGAAGAAGCAAAGCCCTTAGCAGAAGAAGAAGGAGCGCG 

TGAAGTAGCAGACAGAGAGCACATGTTCGACAAAGTTGTGACTCCAAGTGATGTCGGAAA 

ACTAAACCGACTTGTGATCCCAAAGCAACACGCAGAGAGATTCTTCCCTTTAGATTCATC 

TTCAAACGAGAAAGGTTTGCTTTTAAACTTCGAAGATCTCACTGGCAAATCTTGGAGGTT 

CCGTTACTCTTACTGGAACAGTAGTCAAAGCTATGTCATGACTAAAGGTTGGAGCAGATT 

CGTTAAAGACAAAAAGCTTGACGCCGGAGATATTGTCTCTTTCCAAAGATGTGTCGGAGA 

TTCAGGAAGAGATAGCCGTTTGTTTATTGATTGGAGGAGAAGACCTAAAGTCCCTGACCA 

TCCTCATTTCGCCGCCGGAGCTATGTTCCCTAGGTTTTACAGCTTTCCTTCGACCAATTA 

CAGTCTTTATAATCATCAGCAGCAACGTCATCATCACAGTGGTGGTGGTTATAATTATCA 

T(^^TTCCGAGAGAATTTGGTTATGGTTACTTCGTTAGGTCAGTGGATCAGAGGAAC^A 

TCCTGCGGCTGCGGTGGCTGATCCGTTGGTGATTGAATCTGTGCCGGTGATGATGCACGG 

GAGAGCTAATCAGGAACTTGTTGGAACGGCCGGGAAGAGACTGAGGCTTTTTGGAGTTGA 

TATGGAATGCGGCGAGAGCGGAATGACCAACAGTACGGAGGAGGAATCATCATCTTCCGG 

TGGAAGTTTGCCACGTGGAGGCGGTGGTGGTGCTTCATCTTCCTCTTTCTTTCAGCTGAG 

ACTTGGAAGCAGCAGTGAAGATGATCACTTCACTAAGAAAGGAAAGTCTTCATTGTCTTT 

TGATTTGGATCAATAATAATGATGATGATGAAATTAGTTGGTATTTTAAGAAAAAAAACA 

TACATATATAATTCTATATATATGACAACATAATGCATTGATTTCCTT 

>G1010 Amino Acid Sequence (domain in AA coordinates: 33-122) 

MMTDLSLTRDEDEEEAKPLAEEEGAREVADREHMFDKVVTPSDVGKLNRLVIPKQHAERF

FPLDSSSNEKGLLLNFEDLTGKSWRFRYSYWNSSQSYVMTKGWSRFVKDKKLDAGDIVSF

QRCVGDSGRDSRLFIDWRRRPKVPDHPHFAAGAMFPRFYSFPSTNYSLYNHQQQRHHHSG

GGYNYHQIPREFGYGYFVRSVDQRNNPAAAVADPLVIESVPVMMHGRANQELVGTAGKRL

RLFGVDMECGESGMTNSTEEESSSSGGSLPRGGGGGASSSSFFQLRLGSSSEDDHFTKKG 

KSSLSFDLDQ* 

>G1014 (174..1112)

CACAAACCACAGTCTCTCTTTCTCTCTCTATCTATCTTCTCTTTCTCTCTCTATCTCTAT 
GACTGAAACCCAAAGAGATCCACCATTTGTTCT^ 
CTTCCACACTTCCTTTTTACTAGGCAGTGTTAACCAATTGAGAGAGAAAAATGATGGTTG

ATGAAAATGTGGAAACCAAGGCCTCTACTTTAGTGGCAAGTGTTGATCATGGGTTTGGAT 

CCGGGTCGGGTCATGATCATCATGGGTTATCGGCGTCTGTGCCTCTTCTTGGTGTTAACT 

GGAAGAAGAG7VAGGATGCCTAGACAGAGACGATCTTCTTCTTCCTTTAACCTTCTCTCTT 

TCCCTCCTCCTATGCCTCCTATTTCCGACGTGCCAACTCCTCTCCCCGCACGTAAAATTG 

ACCCAAGAAAGCTAAGATTCCTCTTCCAAAAGGAACTCAAGAACAGTGACGTCAGCTCTC 

TCCGACGTATGATACTCCCGAAGAAAGCCGCGGAGGCTCACTTGCCGGCACTTGAATGCA 

AGGAAGGGATTCCTATAAGAATGGAAGATTTGGACGGTTTTCACGTTTGGACCTTCAAGT 

ATAGGTACTGGCCAAACAACAATAGCAGAATGTACGTGCTAG7VAAACACAGGCGATTTTG 

TGAATGCTCATGGTCTGCAGCTAGGTGACTTCATCATGGTTTACCAAGATCTCTACTCAA 

ACAATTACGTTATACAAGCAAGAAAAGCATCGGAAGAAGAAGAAGTAGACGTAATCAATC 

TTGAAGAAGACGACGTTTACACAAACTTAACAAGGATCGAAAACACTGTGGTTAACGATC 

TTCTCCTCCAAGATTTTAATCATCACAACAACAACAACAACAACAACAGCAACAGCAACA 

GCAACAAATGTTCTTACTATTATCCAGTCATAGATGATGTCACCACAAACACAGAGTCTT 

TTGTCTACGACACGACGGCTCTTACCTCCAACGATACTCCTCTCGATTTTTTGGGTGGAC 

ATACGACGACTACTAATAATTATTACTCCAAGTTCGGAACATTCGATGGTTTGGGCTCCG 

TTGAGAATATCTCTCTCGATGACTTCTACTAGATAATCAATCGATGGGCTCATGGTATTC 

TTGATGGTGATCAGCTATTTAATATCCTTATAATATATATAAGAATTAAATGCAATTTGC 

ATATATATTATCAAGTGTTGTAATATAACATTACAGTTTAAAAAAAAAAAAAAAAAA 

>G1014 Amino Acid Sequence (domain in AA coordinates: 90-172) 

MVDENVETKASTLVASVDHGFGSGSGHDHHGLSASVPLLGVNWKKRRMPRQRRSSSSFNL

LSFPPPMPPISHVPTPLPARKIDPRKLRFLFQKELKNSDVSSLRRMILPKKAAEAHLPAL 

ECKEGIPIRMEDLDGFHVWTFKYRYWPNNNSRMYVLENTGDFVNAHGLQLGDFIMVYQDL
YSNNYVIQARKASEEEEVDVINLEEDDVYTNLTRIENTVVNDLLLQDFNHHNNNNNNNSN

SNSNKCSYYYPVIDDVTTNTESFVYDTTALTSNDTPLDFLGGHTTTTNNYYSKFGTFDGL
GSVENISLDDFY*

>G1035 (103..624)

CCATAATAATATATTAAAACTATATACTATAATCTTTTTACATAATAAACTTTGGGTCCT 

GCGTCTTAATCATAGTACTTAATTTTCTCTGTGTGTTTTAATATGAATAATAAAACTGAA 

ATGGGATCTTCCACAAGTGGAAATTGCTCGTCGGTTTCAACCACTGGTTTAGCTAACT 

GGTTCAGAATCTGATCTCCGGCAACGTGATCTAATCGACGAGCGGAAGAGAAAGAGGAAA 

CAGTCGAAGAGAGAATCTGCGAGGAGGTCGAGGATGAGGAAGCAGAAGCATTTGGATGAT 

CTCACTGCTCAGGTGACTCATCTACGTAAAGAAAACGCTCAGATCGTCGCCGGAATCGCC 

GTCACGACGCAGCACTACGT(^CTATCGAGGCGGAGAACGACATTCTCAGAGCTCAGGTT 

CTTGAACTTAACCACCGTCTCCAATCTCTTAACGAGATCGTTGATTTCGTCGAATCTTCT 

TCTTCAGGATTCGGTATGGAGACCGGTCAGGGATTATTCGACGGTGGATTATTCGACGGC 

GTGATGAATCCTATGAATCTAGGGTTTTATAATCAACCAATCATGGCTTCTGCTTCTACT 

GCTGGTGATGTTTTCAACTGTTAGAAAACTTCACATCATTATCATCGTGAGTGAGACTAA 

TCATCGCAGCAGGGGTAAAACTGTAATTTTTC^ 

CTTTATTTTATAAGATGGTTAATTAGTGTTTAAAACTGACT^ 

GAAATGTGTGATATC^TGGAGATGGTGATGTGAGTTTGGTACAAATATTTTAAG 

TCTTTCTATATATTAAAAGTGAAGAAATAATATTTTGTCATTTTCTTAAAAAAAAAAAAA 
AAA 

>G1035 Amino Acid Sequence (domain in AA coordinates: 39-91) 

MNNKTEMGSSTSGNCSSVSTTGLAN^

QKHLDDLTAQVTHLRKENAQIVAGIAVTT^

DFVESSSSGFGMETGQGLFDGGLFDGVMNPMNLGFYNQPIMASASTAGDVFNC*
>G1046 (1..567)

ATGATTAGACATCTAAAACCCTACATGGAGTCGTCTAGTGTCCATCGCTCTCATTGTTTC 
GATATTCTTGATGGAGTCCCACTAGACGACGAT 

ACTGACTTTAATGTTCATTTGCAGTCAAACGTATCGACCCGCATCAACAATCAGTCTCAC 

TTAGACCCAAATGCAGAAAACATTTTCCATAACGAAGGTCTTGCTCC^GAAGAAAGAAGA 

GCyUVGAAGAATGGTCTCTAACCGGGAATCTGCAAGGAGGTCACGTATGCGCAAAAAGAAG 

CAGATCGAAGAGCTGCAACAACAAGTTGAAC^CTCATGATGTTGAATCATCACTTGTCT 

GAGAAAGTC^TCAACTTGTTGGAAAGCAACCATCAGATCCTACAAGAGAACTCACAGCTG 

AAAGAGATUVGTCTCTTCCTTTCACTTGCTC^TGGCAGATGTGCTATTACCCATGAGA 

GCAGAGAGCAACATCAATGACCGCAATGTGAATTATCTAAGAGGAGAACCATCAAACCGT 
CCCACCAACAGTCCCTTTGGTAAGTAA

>G1046 Amino Acid Sequence (conserved domain in AA coordinates: 79-138)

MIRHLKPYMESSSVHRSHCFDILDGVPLHDDHFNSAFLPNTDFNVHLQSNVSTRINNQSH 

LDPNAENIFHNEGLAPEERRARRMVSNRESARRSRMRKKKQIEELQQQVEQLMMLNHHLS 

EKVINLLESNHQILQENSQLKEKVSSFHLLMADV^

PTNSPFGK* 

>G1049 (29..550)

CTAACTTTCTTCCCAAGTAAACTTCAAAATGCAGCCGCAAACAGACGTTTTCAGCCTCCA 
TAACTACCTAAACTCATCGATACTGCAGTCTCCGTATCCTTCTAATTTCCCGATATCTAC 
GCCATTTCCAACCAACGGTCAAAACCCGTACCTCCTCTACGGATTCCAAAGCCCTACAAA 
CAATCCACAATCCATGAGCCTAAGCAGCAACAACTCAACATCAGATGAAGCAGAAGAGCA 
GCAGACGAACAACAATATAATCAACGAGCGGAAGCAGAGAAGGATGATTTCAAACCGAGA 
ATCCGCAAGGAGATCGCGTATGAGGAAGCAAAGACACCTTGACGAGCTTTGGTCACAAGT 
GATGTGGTTAAGGATCGAGAATCATCAGTTGCTTGATAAGCTTAACAATCTCTCTGAGTC 
TCACGACAAGGTTCTTCAAGAGAATGCTCAGCTTAAAGAAGAAACATTTGAGCTTAAGCA 
AGTGATCAGCGATATGCAAATTCAAAGCCCTTTCTCTTGCTTTAGAGACGATATAATCCC 
CATTGAATAAAGCATTTTTCCCCGATTCATATTTATGAAAATTTTCTTCAAGAGTATGTT 
TCTTTGTATGTATATGTGGAGATGTATTTCAGGGTTTTGATAATATGACCCTTTACGACG 
ACGTTTTTAGATTGTAGTAAATTTATAAACTATVAGAAGATTAGTGTTAATGAAGAACAAA 
TATAA 

>G1049 Amino Acid Sequence (domain in AA coordinates: 77-132)
MQPQTDVFSLHNYLNSSILQSPYPSNFPISTPFPTNGQNPYLLYGFQSPTNNPQSMSLSS 
NNSTSDEAEEQQTNNNIINERKQRRMISNRESARRSRMRKQRHLDELWSQVMWLRIENHQ 
LLDKLNNLSESHDKVLQENAQLKEETFELKQVISDMQIQSPFSCFRDDIIPIE* 
>G1069 (89..934)

TTGGAACCCTAGAGGCCTTTCAAGCAAATCATCAGGGTAACAATTTCTTGATCTTTCTTT 

TTAGCGAATTTCCAGTTTTTGGTCAATCATGGCAAACCCTTGGTGGACGAACCAGAGTGG 

TTTAGCGGGCATGGTGGACC^TTCGGTCTCCTC^GGCCATCACCAAAACCATCACCACCA 

AAGTCTTCTTACCAAAGGAGATCTTGGAATAGCCATGAATCAGAGCCAAGACAACGACCA 

AGACGAAGAAGATGATCCTAGAGAAGGAGCCGTTGAGGTGGTCAACCGTAGACCAAGAGG 

TAGACCACCAGGATCCAAAAACAAACCCAAAGCTCCAATCTTTGTGACAAGAGACAGCCC 

CAACGCACTCCGTAGCCATGTCTTGGAGATCTCCGACGGCAGTGACGTCGCCGACACAAT 

CGCTCACTTCTCAAGACGCAGGCAACGCGGCGTTTGCGTTCTCAGCGGGACAGGCTCAGT 

CGCTAACGTCACCCTCCGCCAAGCCGCCGCACCAGGAGGTGTGGTCTCTCTCCAAGGCAG 

GTTTGAAATCTTATCTTTAACCGGTGCTTTCCTCCCTGGACCTTCCCCACCCGGGTCAAC 

CGGTTTAACGGTTTACTTAGCCGGGGTCCAGGGTCAGGTCGTTGGAGGTAGCGTTGTAGG 

CCCACTCTTAGCCATAGGGTCGGTCATGGTGATTGCTGCTACTTTCTCTAACGCTACTTA 

TGAGAGATTGCCCATGGAAGAAGAGGAAGACGGTGGCGGCTCAAGACAGATTCACGGAGG

CGGTGACTCACCGCCCAGAATCGGTAGTAACCTGCCTGATCTATCAGGGATGGCCGGGCC 

AGGCTACAATATGCCGCCGCATCTGATTCCAAATGGGGCTGGTCAGCTAGGGCACGAACC 

ATATACATGGGTCCACGCAAGACCACCTTACTGACTCAGTGAGCCATTTCTATATATAAT 

GGTCTATATAAATAAATATATAGATGAATATAAGCAAGCAATTTGAGGTAGTCTATTACA 

AAGCTTTTGCTCTGGTTGGAAAAATAAATAAGTATCAAAGCTTTGTTTGTTCTT 

AATATAGAGCTTGGGAAGGTAGAAAGAGACGACATT 

>G1069 Amino Acid Sequence (domain in AA coordinates: 67-74) 
MANPWWTNQSGLAGMVDHSVSSGHHQNHHHQSLLTKGDLGIAMNQSQDNDQDEEDDPREG

AVEVVNRRPRGRPPGSKNKPKAPIFVTRDSPNALRSHVLEISDGSDVADTIAHFSRRRQR
GVCVLSGTGSVANVTLRQAAAPGGVVSLQGRFEILSLTGAFLPGPSPPGSTGLTVYLAGV
QGQVVGGSVVGPLLAIGSVMVIAATFSNATYERLPMEEEEDGGGSRQIHGGGDSPPRIGS
NLPDLSGMAGPGYNMPPHLIPNGAGQLGHEPYTWVHARPPY*
>G1070 (170..1144)

TCGACmGCTTGGATTTCGTTGTTCATCATTACTACTCTCTTTCTTCTTCTAGCTAGCTA 
GTTTTGACAGCAAAATAAGAAGCATUUy^AAAGGTCAACTAAAAAAGATCTGTTCTTAGAT 
CACTCTCTTCTTCTTTTTTTGATCCAATTCCACCATTG7VATCATAGATCATGGATCCAGT 
ACAATCTCATGGATCACAAAGCTCTCTACCTrc^ 

ACATCTTCAACAACAGCAACAAGAGTTCTTCCTCCACCATCACCAGCAACAAAGAAACCA 
AACCGATGGTGACCAACAAGGAGGATCAGGAGGAAACCGACAAATCAAGATGGATCGTGA 
AGAGACAAGCGACAACATAGACAACATAGCTAACAACAGCGGTAGTGAAGGTAAAGACAT

AGATATACACGGTGGTTCAGGAGAAGGAGGTGGTGGCTCCGGAGGAGATCATCAGATGAC 

AAGAAGACCAAGAGGAAGACCAGCGGGATCCAAGAACAAACCAAAACCACCGATTATCAT 

CACACGGGACAGCGCAAACGCGCTTAGAACCCACGTGATGGAGATCGGAGATGGCTGCGA 

CTTAGTCGAAAGCGTTGCCACTTTTGCACGAAGACGCCAACGCGGCGTTTGCGTTATGAG 

CGGTACTGGAAATGTTACTAACGTCACTATACGTCAGCCTGGATCTCATCCTTCTCCTGG 

CTCGGTAGTTAGTCTTCACGGAAGGTTCGAGATTCTATCTCTCTCAGGATCTTTTCTCCC 

TCCTCCGGCTCCTCCTACAGCCACCGGATTGAGTGTTTACCTCGCTGGAGGACAAGGACA 

GGTGGTTGGAGGAAGCGTAGTTGGTCCGTTGTTATGTGCTGGTCCTGTCGTTGTCATGGC 

TGCGTCTTTTAGCAATGCGGCGTACGAAAGGTTGCCTTTAGAGGAAGATGAGATGCAGAC 

GCCGGTTCATGGCGGAGGAGGAGGAGGATCATTGGAGTCGCCGCCAATGATGGGACAACA 

ACTGCAACATCAGCAACAAGCTATGTCAGGTCATCAAGGGTTACCACCTAATCTTCTTGG 

TTCGGTTCAGTTGCAGCAGCAACATGATCAGTCTTATTGGTCAACGGGACGACCACCGTA 

TTGATCAAATATACACACACACTCATAATCGTTGCTAGCTAGCTAACGATGAATCATGAG 

TTTAGTGGATATATATATGATTAAAAGAGGTTAGCTTATGAACATTAATAAGAGTTTGGA

TTCTATCGAGCTTCATTATGTTTGGGTCATCGTTC 

>G1070 Amino Acid Sequence (domain in AA coordinates: 98-120) 

MDPVQSHGSQSSLPPPFHARDFQLHLQQQQQEFFLHHHQQQRNQTDGDQQGGSGGNRQIK 

MDREETSDNIDNIANNSGSEGKDIDIHGGSGEGGGGSGGDHQMTRRPRGRPAGSKNKPKP

PIIITRDSANALRTHVMEIGDGCDLVESVATFARRRQRGVCVMSGTGNVTNVTIRQPGSH

PSPGSVVSLHGRFEILSLSGSFLPPPAPPTATGLSVYLAGGQGQVVGGSVVGPLLCAGPV

VVMAASFSNAAYERLPLEEDEMQTPVHGGGGGGSLESPPMMGQQLQHQQQAMSGHQGLPP

NLLGSVQLQQQHDQSYWSTGRPPY* 

>G1076 (198..1076)

ATTTTAGTCTTCCTATAACTTCTTCTCAATCCTCTCTCATATCTTTTTTCTTAGTTTAAA 
TTTCAATAAAATAGAAAAAAAGATATACAAATCTACAGAGAAGAGAAGCTTTATTTTAAT 
CTTGTGTGTGTGTGTGTGTTTTATATAATTTTTATTTTTTTTCAAATTAAAATCTCTTCT 
TTGCTTTTGATGTGGGCATGGCTGGTCTTGATCTAGGCACAGCTTTTCGTTACGTTAATC 
ACCAGCTCCATCGTCCCGATCTCCACCTTCACCACAATTCCTCCTCCGATGAGGTCACTC 
CCGGAGCCGGGATGGGTCATTTCACCGTCGACGACGAAGACAACAACAACAACCATCAAG 
GTCTTGACTTAGCCTCTGGTGGAGGATCAGGAAGCTCTGGAGGAGGAGGAGGTCACGGCG 
GGGGAGGAGACGTCGTTGGTCGTCGTCCACGTGGCAGACCACCGGGATCCAAGAACAAAC 
CGAAACCTCCGGTAATTATCACGCGCGAGAGCGCAAACACTCTAAGAGCTCACATTCTTG 
AAGTAACAAACGGCTGCGATGTTTTCGACTGCGTTGCGACTTATGCTCGTCGGAGACAGC 
GAGGGATCTGCGTTCTGAGCGGTAGCGGAACGGTCACGAACGTCAGCATACGTCAGCCAT 
CTGCGGCTGGAGCGGTTGTGACGCTACAAGGAACGTTCGAGATTCTTTCTCTCTCCGGAT 
CGTTTCTTCCTCCTCCGGCACCTCCCGGAGCAACGAGTTTGACAATTTTCTTAGCCGGAG 
GACAAGGTCAGGTGGTTGGAGGAAGCGTTGTGGGTGAGCTTACGGCGGCTGGACCGGTGA 
TTGTGATTGCAGCTTCGTTTACTAATGTTGCTTATGAGAGACTTCCTTTAGAAGAAGATG 
AGGAGCAGCAACAGCTTGGAGGAGGATCTAACGGCGGAGGTAATTTGTTTCCGGAGGTGG 
CAGCTGGAGGAGGAGGAGGACTTCCGTTCTTTAATTTACCGATGAATATGCAACCAAATG
TGCAACTTCCGGTGGAAGGTTGGCCGGGGAATTCCGGTGGAAGAGGTCCTTTCTGATGTG 
TATATATTGATAATCATTATATATATACCGGCGGAGAAGCTTTTCCGGCGAAGAATTTGC 
GAGAGTGAAGAAAGGTTAGAAT^GCTTTTAATGGACTAATGAATTTCAAATTATCATCGT 

AATTTTATGTTTGAATCCTTTTTTTTTTCTGTGAAACTCTATTG 
AAAAAAAAATTCTCAAAAAAAA 

>G1076 Amino Acid Sequence (domain in AA coordinates: 82-89)
MAGLDLGTAFRYVNHQLHRPDLHLHHNSSSDDVTPGAGMGHFTVDDEDNNNNHQGLDLAS

GGGSGSSGGGGGHGGGGDVVGRRPRGRPPGSKNKPKPPVIITRESANTLRAHILEVTNGC
DVFDCVATYARRRQRGICVLSGSGTVTNVSIRQPSAAGAVVTLQGTFEILSLSGSFLPPP
APPGATSLTIFLAGGQGQVVGGSVVGELTAAGPVIVIAASFTNVAYERLPLEEDEQQQQL
GGGSNGGGNLFPEVAAGGGGGLPFFNLPMNMQPNVQLPVEGWPGNSGGRGPF*
>G1089 (31..2427)

AAGTAAGAGAGCTTCTTAAGGAAGAAGAAGATGGGTTGTGCTCAATCAAAGATCGAGAAC 
GAAGAAGCAGTTACTCGTTGCAAAGAACGAAAACAATTGATGAAAGACGCCGTCACTGCT 
CGTAACGCTTTCGCCGCCGCTCACTCAGCTTACGCTATGGCTCTTAAAAACACCGGAGCT 
GCTCTTTCCGATTACTCTCACGGCGAGTTTTTAGTCTCTAATCACTCGTCTTCCTCCGCA

GCTGCAGCAATCGCTTCTACTTCTTCTCTTCCCACTGCTATATCTCCTCCTCTTCCTTCT

TCCACCGCTCCGGTTTCTAATTCAACCGCTTCTTCTTCCTCCGCTGCGGTTCCTCAGCCG 

ATTCCTGATACTCTTCCTCCTCCTCCTCCTCCACCACCGCTTCCTCTTCAACGTGCTGCT 

ACTATGCCGGAGATGAACGGTAGATCCGGTGGTGGTCATGCTGGTAGTGGACTCAACGGA 

ATTGAAGAAGATGGAGCCCTAGATAACGATGATGATGACGATGATGATGATGATGACTCT 

GAAATGGAGAATCGTGATCGTTTGATTAGGA7VATCGAGMGCCGTGGAGGTAGTACTAGA 

GGAAATAGGACGACGATTGAAGATCATCATCTTCAGGAGGAGAAAGCTCCGCCACCTCCC 

CCTTTGGCGAATTCGCGGCCAATTCCGCCGCCACGTCAGCATCAGCATCAACATCAGCAA 

CAGCAACAACAACCTTTCTACGATTACTTCTTCCCTAATGTTGAGAATATGCCTGGAACT 

ACTTTAGAAGATACTCCTCCACAACCACAACCACAACCAACAAGGCCTGTGCCTCCTCAA

CCACATTCACCAGTCGTTACTGAGGATGACGAAGATGAGGAGGAGGAAGAGGAGGAAGAG 

GAGGAGGAAGAGGAGACGGTGATTGAACGGAAACCACTGGTGGAGGAAAGACCGAAGAGA 

GTAGAGGAAGTGACGATTGAATTGGAAAAAGTTACTAATTTGAGAGGGATGAAGAAGAGT 

AAAGGGATAGGGATTCCCGGAGAGAGGAGAGGAATGCGAATGCCGGTGACTGCGACGCAT 

TTGGCGAATGTATTCATTGAGCTTGATGATAATTTCTTGAAAGCTTCTGAAAGTGCTCAT 

GATGTTTCTAAGATGCTTGAAGCTACTAGGCTCCATTACCATTCTAATTTTGCAGATAAC 

CGAGGACATATTGATCACTCTGCTAGAGTGATGCGTGTAATTACATGGAATAGATCATTT 

AGAGGAATACCAAATGCTGATGATGGGAAAGATGATGTTGATTTGGAAGAGAATGAAACT 

CATGCTACTGTTCTTGACAAATTGCTAGCATGGGAAAAGAAGCTCTATGACGAAGTCAAG 

GCTGGCGAACTCATGAAAATCGAGTACCAGAAAAAGGTTGCTCATTTAAATCGGGTGAAG 

AAACGAGGTGGCCACTCGGATTCATTAGAGAGAGCTAAAGCAGCAGTAAGTCATTTGCAT 

ACAAGATATATAGTTGATATGCAATCCATGGACTCCACAGTTTCAGAAATCAATCGTCTT 

AGGGATGAACAACTATACCTAAAGCTCGTTCACCTTGTTGAGGCGATGGGGAAGATGTGG 

GAAATGATGCAAATACATCATCAAAGACAAGCTGAGATCTCAAAGGTGTTGAGATCTCTA 

GATGTTTCACAAGCGGTGAAAGAAACAAATGATCATCATCACGAACGCACCATCCAGCTC 

TTGGCAGTGGTTCAAGAATGGCACACGCAGTTTTGCAGGATGATAGATCATCAGAAAGAA 

TACATAAAAGCACTTGGCGGATGGCTAAAGCTAAATCTCATCCCTATCGAAAGCACACTC 

AAGGAGAAAGTATCTTCGCCTCCTCGAGTTCCCAATCCCGCAATCCAAAAACTCCTCCAC 

GCTTGGTATGACCGTTTAGACAAAATCCCCGACGAAATGGCTAAAAGTGCCATAATCAAT 

TTCGCAGCGGTTGTAAGCACGATAATGCAGCAGCAAGAAGACGAGATAAGTCTCAGAAAC 

AAATGCGAAGAGACAAGAAAAGAATTGGGAAGAAAAATTAGACAGTTTGAGGATTGGTAC 

CACAAATACATCCAGAAGAGAGGACCGGAGGGGATGAATCCGGATGAAGCGGATAACGAT 

CATAATGATGAGGTCGCTGTGAGGCAATTCAATGTAGAACAAATTAAGAAGAGGTTGGAA 

GAAGAAGAAGAAGCTTACCATAGACAAAGCCATCAAGTTAGAGAGAAGTCACTGGCTAGT 

CTTCGAACTCGCCTCCCCGAGCTTTTTCAGGCAATGTCCGAGGTTGCGTATTCATGTTCG 

GATATGTATAGAGCTATAACGTATGCGAGTAAGCGGCAAAGCCAAAGCGAACGGCATCAG 

AAACCTAGCCAGGGACAGAGTTCGTAAGAACTAATGTAAGATCAGAGTAATGTCTTCtTC 

TTCTTTGATCTTGAATATTTAAGCACACACATACATACAACGTATAGCTAAATCTTTATC 

ATTGCTTTCTTATATTAAGGTTTTGGCTTTTGTAAGAAGGTTTCTTACATATGAGATT^ 

TATAGTGTTTGATTCTTAAGGAACTGTTCTGTTGAGTAATAAGAAAGTTGTGTATTGAT^A 

TAGAGTTG CATTTGTTAATTTTG 

>G1089 Amino Acid Sequence (domain in AA coordinates: 425-500)
MGGAQSKIENEEAVTRCKERKQLMKDAVTAKNAFAAAHSAYAMAL 

LVSNHSSSSAAAAIASTSSLPTAISPPLPSSTAPVSNSTASSSSAAVPQPIPDTLPPPPP 

PPPLPLQRAATMPEMNGRSGGGHAGSGLNGIEEDGALDNDDDDDDDDDDSEMENRDRLIR

KSRSRGGSTRGNRTTIEDHHLQEEKAPPPPPLANSRPIPPPRQHQHQHQQQQQQPFYDYF 

FPNVENMPGTTLEDTPPQPQPQPTRPVPPQPHSPVVTEDDEDEEEEEEEEEEEEETVIER

KPLVEERPKRVEEVTIELEKVTNLRGMKKSKGIGIPGERRGMRMPVTATHLANVFIELDD 

NFLKASESAHDVSKMLEATRLHYHSNFADNRGHIDHSARVMRVITWNRSFRGIPNADDGK

DDVDLEENETHATVLDKLLAWEKKLYDEVKAGELMKIEYQKKVAHLNRVKKRGGHSDSLE

RAKAAVSHLHTRYIVDMQSMDSTVSEINRLRDEQLYLKLVHLVEAMGKMWEMMQIHHQRQ

AEISKVLRSLDVSQAVKETNDHHHERTIQLLAVVQEWHTQFCRMIDHQKEYIKALGGWLK
LNLIPIESTLKEKVSSPPRVPNPAIQKLLHAWYDRLDKIPDEMAKSAIINFAAVVSTIMQ
QQEDEISLRNKCEETRKELGRKIRQFEDWYHKYIQKRGPEGMNPDEADNDHNDEVAVRQF
NVEQIKKRLEEEEEAYHRQSHQVREKSLASLRTRLPELFQAMSEVAYSCSDMYRAITYAS
KRQSQSERHQKPSQGQSS*
>G1093 (1..531)

ATGGGTTATCCGGTGGGGTACACTGAGCTCCTCCTCCCAAGAATCTTCCTTCACTTACTC 

TCTCTCTTAGGCTTAATACGAACACTCATAGACACGGGTTTTCGGATATTGGGTCTACCC 

GACTTTCTCGAATCCGACCCGGTTTCATCGTCATCGTCATGGCTGGAACCACCGTATATG 

TCCACGGCGGCGCATCATCACCAAGAAAGCTCATTTTTCTTCCCAGTGGCGGCGAGGCTA

GCTGGAGAAATCTTGCCCGTCATCAGATTCTCGGAGCTAACTCGACCCGGATTCGGATCC 

GGATCCGATTGCTGCGCGGTGTGCCTCCACGAGTTCGAGAACGATGACGAGATCCGACGG 

CTGACGAATTGTCAACACATATTTCACCGGAGCTGTTTAGACCGTTGGATGATGGGTTAT 

AATCAGATGACGTGTCCACTTTGTAGAACGCCGTTTATTTCTGATGAGTTACAAGTTGCT 

TTTAACCAACGAGTTTGGTCTGAATCTGAACTTCTCGCAGAATCAAATTAG 

>G1093 Amino Acid Sequence (domain in AA coordinates: 105-148) 

MGYPVGYTELLLPRIFLHLLSLLGLIRTLIDTGFRILGLPDFLESDPVSSSSSWLEPPYM 

STAAHHHQESSFFFPVAARLAGEILPVIRFSELTRPGFGSGSDCCAVCLHEFENDDEIRR 

LTNCQHIFHRSCLDRWMMGYNQMTCPLCRTPFISDELQVAFNQRVWSESELLAESN*

>G1127 (191..1351)

GACAGACTCTCTCTGTATGTGTGCGAGAAGCGAGAAGCGAGAGAGAGAGAGAGAGAGTTG 
TTAGCTCACACGCTTTCTCTATTTTCTCGGAATTCACAAAACAGAAAGTTTCATCCTTTA 
CGAGAATTAAGCCGAAAGAAACAATCTTTGAGTTTGATTTCTTCTTCCTTCCTTCTCTCT 
CTCTGCTCTAATGGATTCCAGAGACATCCCACCGTCACATAACCAGCTTCAACCACCACC 
GGGAATGTTAATGTCTCATTACCGTAACCCTAACGCCGCCGCTTCACCATTAATGGTTCC 
CACTTCCACATCTCAACCGATTCAACACCCTCGTCTTCCTTTTGGCAATCAACAACAATC 
TCA7UVCGTTTCATCAGCAGCAACAACAACAAATGGATCAGAAGACTCTTGAATCTCTTGG 
ATTTGGTGATGGATCACCTTCTTCTCAACCGATGCGATTCGGGATCGATGATCAGAATCA 
GCAACTGCAAGTGAAGAAGAAGCGAGGAAGGCCGAGAAAGTATACTCCTGATGGTAGCAT 
TGCTTTAGGTTTAGCTCCTACGTCTCCTCTTCTCTCTGCAGCTTCTAATTCTTACGGTGA 
GGGTGGTGTTGGAGATAGTGGTGGAAATGGAAACTCTGTTGATCCACCTGTTAAACGTAA 
CAGAGGAAGGCCTCCTGGTTCTAGTAAGAAACAGCTTGATGCTTTAGGAGGAACTTCAGG 
AGTTGGGTTTACACCTCATGTCATTGAAGTGAACACAGGAGAGGACATAGCGTCAAAGGT 
GATGGCTTTTTCGGATCAAGGGTCAAGAAC^VATTTGTATTCTCTCTGCAAGTGGTGCAGT 
TTCTAGAGTGATGCTTCGTCAAGCTTCTCATTCTAGTGGAATCGTTACTTATGAGGGACG 
ATTTGAGATCATTACTCTCTCAGGCTCAGTCTTGAATTATGAGGTAAATGGTTCCACCAA 
CAGAAGTGGTAACTTGAGTGTGGCTTTGGCTGGACCTGATGGCGGCATCGTAGGTGGCAG 
TGTAGTTGGTAATCTAGTAGCTGCAACACAAGTCCAGGTGATAGTGGGAAGCTTTGTTGC 
AGAAGCAAAGAAACCGA7VAC7UVAGTAGTGTTAACATTGCTCGGGGGCAGAATCCTGAACC 
GGCTTCAGCGCCGGCTAACATGTTGAACTTTGGATCAGTCTCTCAAGGACCATCGAGCGA 
GTCATCAGAAGAGAATGAGAGCGGTTCTCCTGCAATGCACCGTGACAATAATAATGGGAT 
ATATGGAGCTCAACAACAACAACAACAACAACCTCTTCATCCTCATCAGATGCAAATGTA 
CCAACATCTTTGGTCTAATCATGGTCAATAAAATGAAGCGGAAATTAATTTGTTTCCGTT 
TTGGTTACGGTTATGGTTTGATTTCTT 

>G1127 Amino Acid Sequence (domain in AA coordinates: 103-110, 155-162)
MDSRDIPPSHNQLQPPPGMLMSHYRNPNAAASPLMVPTSTSQPIQHPRLPFGNQQQSQTF 
HQQQQQQMDQKTLESLGFGDGSPSSQPMRFGIDDQNQQLQVKKKRGRPRKYTPDGSIALG
LAPTSPLLSAASNSYGEGGVGDSGGNGNSVDPPVKRNRGRPPGSSKKQLDALGGTSGVGF
TPHVIEVNTGEDIASKVMAFSDQGSRTICILSASGAVSRVMLRQASHSSGIVTYEGRFEI
ITLSGSVLNYEVNGSTNRSGNLSVALAGPDGGIVGGSVVGNLVAATQVQVIVGSFVAEAK

KPKQSSVNIARGQNPEPASAPANMLNFGSVSQGPSSESSEENESGSPAMHRDNNNGIYGA

QQQQQQQPLHPHQMQMYQHLWSNHGQ* 

>G1131 (57..758)

TCGACTCCTCTCCTGATTGCTTCACCTTCTTC 

CCATGGATTGCTTAAGCTACTTCITrAACTACGATCCTCCTGTCCAGCTCC^ 

TTATTCCCGAGATGGATATGATTATCCCTGAAACCGATAGTTTCTTCTTCCAATCTCAAC 

CGCAACTGGAGTTTCATCAGCCATTGTTTC^ 

ACCCTTTCTGCGACCAGTTTCTTTCTCCGCAAGAAATCTTTCTCC 

AAATCTTCAACGAAACACACGACCTCGATTTCTTTC 

TTGTTAACTCCAGCTACAATTGTAACACTCAAAACCATTTCCAGAGCCGTAACCCGAATT 
TCTTCGACCCTTTCGGCGACACTGATTTCGTTCC^GAATCTTGTACCTTCCAGGAGTTTC 
GAGTTCCGGATTTCTCTTTAGCTTTCAAGGTAGGCCGGGGAGATCAAGATGACTCAAAGA 
AACCGACGCTTTCATCTCAGAGCATCGCGGCTAGAGGGAGGAGAAGAAGAATTGCAGAGA
AGACTCACGAGCTCGGAAAACTCATCCCCGGTGGCAATAAACTTAACACCGCCGAGATGT 
TCCAAGCCGCCGCTAAGTATGTCAAGTTTTTGCAGAGTCAAGTTGGGATTCTCCAACTGA 
TGCAGACCACAAAGAAGGTAATAACCAACCCCAAATAAGAACTTTATCATCCAATTGAAA 
CTCTAATCGTGTTTTCTCACAAGCTTCTTAATTTGTTTACGCAGGGTAGCTCTAATGTGC 
AAATGGAAACTCAGTATTTGCTTGAATCGCAAGCAATCCAGGAGAAGTTATCAACAGAGG 
AAGTGTGTTTGGTACCGTGTGAAATGGTTCAAGATCTAACAACTGAAGAAACCATTTGCA 
GAACCCCGAATATTTCTCGAGAAATCAACAAGTTACTGTCTAAACATCTGGCTAACTAGT 
TTTAGTTTCAAGCCTGAAGTTCTCTATGCCTAAATTTGTGTCTGTTATCGTTGTTTTGTC 
TTCTTAGTTAGTGTTTTGTCTTGTTGATTTAGGGGCTAATTATCCTGGTTAATCTCCTCT 
TAACTGGGAA 

>G1131 Amino Acid Sequence (domain in AA coordinates: 173-220)

MSMDCLSYFFNYDPPVQLQDCFIPEMDMIIPETDSFFFQSQPQLEFHQPLFQEEAPSQTH 

FDPFCDQFLSPQEIFLPNPKNEIFNETHDLDFFLPTPI<CRQRLVNSSYNCNTQNHFQSRNP 

NFFDPFGDTDFVPESCTFQEFRVPDFSLAFKVGRGDQDDSKKPTLSSQSIAARGRRRRIA 

EKTHELGKLIPGGNKLNTAEMFQAAAKYVKFLQSQVGILQLMQTTKKVITNPK*

>G1145 (243..1142)

GTGATTTCTCTCTGCCATTTCCTTCGATTTGATTTCTGGGTTCTCTTCTTCTCGTCTCTC 
TTCTGCATGTTTCGCCACTCTACCTTAGAAAAAAGGTTACTTTCGCCTCCGATTTAGGCT 
CGATTTGATGAATTCGTCGTCGTGTGGCTATTTATCAAATTGAGCATTAGGGTTTCTGAT 
TTGTGGGTTCAGAATTGTTTTTATCTATCTGTCTTGTTGTTTTTTGTCCGCTACAAAAGC 
CTATGGATTCTCAGAGGGGTATTGTTGAACAAGCTAAATCTCAGTCCTTGAATAGGCA7U\ 
GCTCTCTTTACAGCTTAACACTTGATGAGGTTCAAAATCACTTGGGGAGTTCTGGTAAAG 
CTCTGGGAAGCATGAACCTTGATGAGCTTTTGAAGAGTGTCTGTTCTGTTGAAGCTAATC 
AGCCATCGTCTATGGCTGTCAATGGTGGAGCAGCTGCTCAGGAGGGTCTTTCTCGCCAGG 
GGAGTTTGACTTTGCCTCGGGATCTCAGCAAAAAGACTGTTGATGAGGTTTGGAAAGACA 
TTCAGCAGAATAAGAATGGAGGTAGTGCTCATGAGAGGAGGGATAAGCAGCCTACACTTG 
GGGAAATGACGCTTGAAGACCTGTTGTTGAAAGCAGGAGTGGTCACTGAGACTATCCCTG 
GTTCGAACCATGATGGTCCTGTTGGTGGTGGTAGTGCTGGTTCAGGTGCTGGTTTAGGGC 
AAAACATTACTCAAGTTGGCCCATGGATTCAATATCATCAGCTCCCATCAATGCCACAGC 
CTCAAGCATTTATGCCCTATCCGGTTTCAGATATGCAAGCAATGGTGTCTCAGTCTTCTT 
TGATGGGTGGTTTGTCAGATACACAAACTCCTGGAAGGAAGAGGGTAGCTTCAGGAGAAG
TTGTAGAGAAGACTGTAGAGAGGAGGCAGAAGAGAATGATAAAGAACAGAGAGTCTGCTG 
CTCGTTCCCGAGCTAGGAAACAGGCTTACACTCATGAGCTAGAGATCAAAGTTTCACGGT 
TAGAAGAAGAAAACGAAAGACTCAGGAAGCAAAAGGAGGTGGAAAAATCCTCCCAAGTGT 
ACCACCGCCTGATCCCAAGCGGCAGCTCCGACGGACAAGCTCGGCTCCTTTCTGATCTCT 
AAACTCTTTTTGTCTTTTTCTTTTTTTCTCTTCTGTGTCGGTTCACTTATAAAAAAGAGA 
GGAAAACAGCTTTGTTTCTTTGTACATTCCGTAGACTTTCCT 

TAACTTTAAAATATTCTCGAGTTATTGTAGTAGC^GACTAGCAGCAGT7^TGGTTTTCAT 
GAGTCCGATTGAAATTCAGAGATTGAACAGGAAAAAA 

>G1145 Amino Acid Sequence (conserved domain in AA coordinates: 227-270)

MDSQRGIVEQAKSQSLNRQSSLYSLTLDEVQNHLGSSGKALGSMNLDELLKSVCSVEANQ 

PSSMAVNGGAAAQEGLSRQGSLTLPRDLSKKTVDEVWKDIQQNKNGGSAHERRDKQPTLG

EMTLEDLLLKAGVVTETIPGSNHDGPVGGGSAGSGAGLGQNITQVGPWIQYHQLPSMPQP
QAFMPYPVSDMQAMVSQSSLMGGLSDTQTPGRKRVASGEVVEKTVERRQKRMIKNRESAA

RSRARKQAYTHELEIKVSRLEEENERLRKQKEVEKSSQVYHRLIPSGSSDGQARLLSDL* 
>G1229 (123..1217)

TTTGGGCGGGTCTTTCTTTCCCTAAATCTTTCTTTTATTTTGCTGTTTAAAAAAAAAATC 
CAACCATAAGACAAAACAACGAACGAGGAAGAGAGAGAGAGAAGGATATATCTCTAATCA 
CGATGCAGGAGATAATACCGGATTTTCTTGAAGAGTGTGAATTTGTCGACACTTCACTAG 
CCGGAGATGATCTATTTGCCATCTTAGAGAGTCTTGAAGGTGCCGGAGAGATATCTCCGA 
CAGCTGCATCTACACCTAAAGATGGAACCACAAGTTCCAAGGAGTTAGTTAAGGATCAAG 
ATTATGAAAACTCATCTCCTAAGAGGAAAAAGCAAAGACTAGAAACCAGGAAAGAAGAGG 
ACGAAGAAGAAGAAGACGGAGACGGAGAAGCAGAAGAAGATAATAAGCAAGATGGGCAAC 
AAAAGATGTCTCATGTAACCGTGGAACGTAACCGGAGAAAGCAAATGAACGAGCACTTAA
CCGTTTTGCGTTCTCTTATGCCTTGTTTCTACGTCAAACGGGGGGACCAAGCATCGATCA 
TAGGAGGAGTTGTGGAGTACATAAGCGAGTTACAACAAGTTCTCCAATCTTTGGAAGCCA
AGAAACAACGTAAAACCTACGCCGAAGTCCTAAGCCCGAGAGTTGTCCCGAGCCCTCGTC
CTTCACCGCCTGTTCTAAGCCCAAGAAAACCGCCTCTTAGCCCGCGCATCAACCACCACC 
AGATTCACCACCACCTACTTCTCCCTCCCATAAGTCCTCGAACACCTCAGCCAACAAGCC 
CATACCGGGCCATTCCACCGCAACTACCACTCATCCCACAGCCTCCGCTTCGCTCTTACA 
GCTCATTGGCCAGTTGCAGCAGCTTAGGAGATCCACCTCCATACTCTCCTGCTTCATCTT 
CTTCATCTCCTTCAGTTAGTAGTAACCATGAGAGTAGTGTGATCAATGAGCTTGTTGCTA 
ACTCAAAATCGGCTTTGGCTGATGTGGAAGTGAAGTTTTCAGGAGCTAACGTGCTGCTCA 
AAACGGTGTCGCATAAGATCCCGGGACAAGTTATGAAGATAATTGCTGCTCTTGAAGATT 
TGGCTCTTGAGATTCTTCAGGTTAATATTAACACCGTCGACGAAACCATGCTTAATTCTT 
TCACCATCAAGATTGGAATTGAGTGCCAACTAAGTGCAGAAGAACTGGCTCAACAAATTC 
AGCAAACATTCTGCTAGTAAAGAAGGATTTAATATAGCTTCGTATAAACCTTAACGAGAG 
AGCAGTACGTACTCACTTTCTCTCCTTAGTATCCCTTTAATTATCTTTTCAGTTTTCTGC 
AAAGATATGGAGTTTAAAAAAATAAAATTGTTATCTAAAGTTTTAATCAAATATTGATTA 
ATTATAACTAATATAGGTATAAGTGAGTTTTAAAGATTATCAGCTTCATAACAGCCATCG 
TCATGTTTACTTTCTTTTAAATTTTAGAATTTAGACGTACTCCTACCATGTAATTTTATT 
TCTGTCATTACATCAAGCATTGTAGCTGTAATTGCATATGAATGAACAATAGTGTATGAG 
TGATCTCATGAATAATATTCTTCTTGCAACACAAAAAAAAAAAA 

>G1229 Amino Acid Sequence (domain in AA coordinates: 102-160) 
MQEIIPDFLEECEFVDTSLAGDDLFAILESLEGAGEISPTAASTPKDGTTSSKELVKDQD 
YENSSPKRKKQRLETRKEEDEEEEDGDGEAEEDNKQDGQQKMSHVTVERNRRKQMNEHLT

VLRSLMPCFYVKRGDQASIIGGVVEYISELQQVLQSLEAKKQRKTYAEVLSPRVVPSPRP
SPPVLSPRKPPLSPRINHHQIHHHLLLPPISPRTPQPTSPYRAIPPQLPLIPQPPLRSYS
SLASCSSLGDPPPYSPASSSSSPSVSSNHESSVINELVANSKSALADVEVKFSGANVLLK
TVSHKIPGQVMKIIAALEDLALEILQVNINTVDETMLNSFTIKIGIECQLSAEELAQQIQ
QTFC* 

>G1246 (1..1746) 

ATGATCATGTACGGAGGAGGAGGAGCAGGGAAGGACGGTGGATCCACCAATCACTTATCA 

GACGGAGGAGTGATATTGAAGAAAGGTCCATGGACGGCGGCGGAAGATGAGATACTTGCT 

GCGTACGTTAGAGAGAACGGTGAAGGGAATTGGAACGCCGTTCAGAAAAACACAGGTTTG 

GCTCGTTGCGGCAAAAGCTGCCGTCTTCGATGGGCCAATCACCTCCGACCAAATCTGAAA 

AAAGGCTCTTTCACCGGTGACGAAGAACGTCTCATCATTCAGCTTCATGCTCAGCTTGGT 

AACAAATGGGCTCGCATGGCTGCTCAGTTACCGGGAAGAACAGACAACGAGATTAAGAAC 

TATTGGAACACGAGATTGAAACGACTTCTTCGCCAAGGACTTCCTCTTTATCCTCCAGAT 

ATTATCCCTAACCATCAACTCCATCCACATCCACATCATCAACAACAACAGCAACATAAC 

CATCATCATCATCATCATCAACAACAACAACAACATCAACAAATGTATTTTCAACCACAA 

TCTTCACAACGAAACACACCATCATCTTCCCCTCTTCCATCTCCAACACCAGCAAACGCA 

AAGTCCTCATCATCCTTCACTTTTCATACCACGACTGCTAACCTCCTCCATCCACTTAGC

CCTC^CACTCOVAACACACCATCTCAACTCTCTTCCACACCGCCTCCACCACCACTTO 

TCTCCTTTATGTTCCCCTCGCAACAACCAATACCCGACCCTTCCCCTCTTTGCCCTCCCG 

CGTTCCCAAATCAACAACAACAACAACGGAAATTTCACTTTCCCTAGACCTCCACCTCTC

CTTCAACCGCCTTCATCACTCTTCGCAAAACGTTACAAC^TGCTAACACTCCTCTTAAT 

TGCATCAACCGCGTCTCAACCGCACCATTTTCCCCTGTTT 

TTTCTTACATTGCCTTACCCTTCCCCAACCGCTCAAACCGCTACTTACCACAATACTAAT
AACCCTTACTCmCCTCTCCTTCCTTOTCTTTAAACCCTTCTTCITC 
TCAACTTCTTCCCCAAGCTTTCTTCACTCCCATTACACTCCTTCTTCCACCTCATTTCAT 
ACCAACCCAGTTTACTCCATGAAACAAGAGCAGCTC 

GATGGCTTGAATAACGTCAACAACTTCACAGACAACGAGAGACAGAATCATAACCTTAAC 

AGTTCCGGTGCTCATAGAAGAAGTAGTAGCTGCAGCCTCTTAGAGGATGTCTTCGAAGAG

GCCGAAGCTTTAGCCTCTGGAGGCAGAGGCCGACCTCCAAAACGAAGACAACTCACAGCT 

TCTCTTCCGAACCACAACAACAACACCAACAACAACGACAACTTCTTCTCGGTTAGTTTC 

GGACATTATGATTCTTCrrGACAACTTATGTTCCTTGCAAGATTTGAAATCA^ 

GAGTCTCTTCAAATGAACACAATGCAGGAGGACATAGCTAAGCTTCTTGATTGGGGAAGT 

GATAGTGGAGAGATCTCTAATGGACAATCATCTGTTGTCACTGACGACAATCTTGTTCTT 

GATGTTCATCAATTAGCTTCACTATTCCCGGCTGATTCTACAGCCGTCGTAGCCGCAACA 

AACGACCAACACAACAAGAATAATAACAATAATTGTTCCTGGGATGACATGCAGGGAATA 

AGGTAG 

>G1246 Amino Acid Sequence (domain in AA coordinates: 27-139)
MIMYGGGGAGKDGGSTNHLSDGGVILKKGPWTAAEDEILAAYVRENGEGNWNAVQKNTGL
ARCGKSCRLRWANHLRPNLKKGSFTGDEERLIIQLHAQLGNKWARMAAQLPGRTDNEIKN
YWNTRLKRLLRQGLPLYPPDIIPNHQLHPHPHHQQQQQHNHHHHHHQQQQQHQQMYFQPQ
SSQRNTPSSSPLPSPTPANAKSSSSFTFHTTTANLLHPLSPHTPNTPSQLSSTPPPPPLS
SPLCSPRNNQYPTLPLFALPRSQINNNNNGN^^

CINRVSTAPFSPVSRDSYTSFLTLPYPSPTAQTATYHNTNNPYSSSPSFSLNPSSSSYPT
STSSPSFLHSHYTPSSTSFHTNPVYSMKQEQLPSNQIPQIDGFNNVNNFTDNERQNHNLN
SSGAHRRSSSCSLLEDVFEEAEALASGGRGRPPKRRQLTASLPNHNNNTNNNDNFFSVSF
GHYDSSDNLCSLQDLKSKEEESLQMNTMQEDIAKLLDWGSDSGEISNGQSSVVTDDNLVL
DVHQLASLFPADSTAVVAATNDQHNKNNNNNCSWDDMQGIR*
>G1255 (138..1388)

CAGCTCAAACTCTCTAGGACTACACTAAATCTAACTTTTTGCAGAGAGCAAAAGATTCAA 
TAATTGAGATTGATCTCAAAACCAAAGCTCTCGTGCTCTTGTCGTTGATGTTGGTTGTGT 
AGACTTTGTATACAATGATGAAAAGTTTGGCGAATGCTGTTGGAGCGAAGACGGCGAGGG 
CTTGCGACAGCTGCGTGAAGAGACGTGCACGGTGGTACTGCGCGGCCGACGATGCTTTTC 
TTTGCCAGTCTTGCGACAGTTTGGTCCATTCAGCAAACCCTCTTGCTCGCCGCCACGAGA 
GAGTCCGTTTGAAGACGGCTAGCCCGGCGGTCGTAAAGCATAGCAACCACTCATCAGCTT 
CTCCTCCACATGAGGTCGCCACGTGGCATCACGGGTTTACTCGTAAAGCTCGAACGCCAC 
GTGGCTCTGGTAAGAAAAACAATTCGTCGATATTTCATGACTTGGTTCCTGATATTAGTA 
TTGAGGATCAGACAGACAACTATGAGCTTGAAGAGCAGCTGATCTGTCAAGTGCCGGTTC 
TAGATCCGTTGGTGTCTGAGCAGTTCTTGAACGATGTCGTTGAGCCCAAGATCGAGTTTC 
CTATGATCAGAAGTGGTTTGATGATCGAGGAGGAGGAAGACAACGCTGAAAGTTGTCTTA 
ATGGATTTTTCCCGACCGACATGGAGCTTGAGGAGTTTGCTGCTGACGTGGAGACTCTGC 
TCGGTCGCGGGTTAGACACGGAGTCGTATGCCATGGAGGAGCTAGGGTTATCTAATTCAG 
AGATGTTCAAAATCGAAAAAGATGAGATTGAAGAAGAAGTAGAAGAGATAAAAGCCATGA 
GCATGGATATATTTGATGATGATCGAAAAGACGTGGATGGAACAGTACCGTTTGAGCTAA 
GCTTTGATTACGAGTCGTCACACAAGACGTCCGAAGAAGAGGTAATGAAGAACGTTGAAA 
GTAGTGGTGAATGTGTTGTTAAGGTGAAAGAGGAAGAACATAAGAATGTTCTGATGCTAA 
GATTAAACTATGACTCGGTGATATCCACTTGGGGAGGTCAAGGTCCACCGTGGAGTTCAG 
GAGAGCCACCGGAACGAGACATGGACATCAGCGGTTGGCCAGCCTTTTCCATGGTGGAGA
ATGGAGGAGAAAGTACTCATCAGAAGCAATACGTTGGTGGATGTTTACCATCAAGTGGGT 
TTGGAGATGGAGGTAGAGAAGCTAGAGTTTCGAGATACAGAGAGAAGAGGAGGACAAGGT 
TGTTTTCTAAGAAGATACGGTACGAGGTACGTAAATTGAATGCAGAGAAAAGACCACGAA
TGAAAGGAAGATTCGTGAAGAGAGCCTCGCTCGCTGCTGCTGCTTCACCATTAGGTGTTA 
ATTACTGAATAGTTAATATCTATTCATGTTATATCTCACTTTACAAATTTCGGTGAATCT 
TTTTTCTTCTGAAACAACAGAAGTTATTTTGGCACTTAATTGTGCTTTGAGGACTTGTAT 
GTACATAGAAGTAACCAATAATAATGTGACTTTTACTA 

>G1255 Amino Acid Sequence (domain in aa coordinates: 18-56) 
MKSLANAVGAKTARACDSCVKRRARWYCAADDAFLCQSCDSLVHSANPLARRHERVRLKT
ASPAVVKHSNHSSASPPHEVATWHHGFTRKARTPRGSGKKNNSSIFHDLVPDISIEDQTD
NYELEEQLICQVPVLDPLVSEQFLNDVVEPKIEFPMIRSGLMIEEEEDNAESCLNGFFPT
DMELEEFAADVETLLGRGLDTESYAMEELGLSNSEMFKIEKDEIEEEVEEIKAMSMDIFD
DDRKDVDGTVPFELSFDYESSHKTSEEEVMKNVESSGECVVKVKEEEHKNVLMLRLNYDS
VISTWGGQGPPWSSGEPPERDMDISGWPAFSMVENGGESTHQKQYVGGCLPSSGFGDGGR
EARVSRYREKRRTRLFSKKIRYEVRKLNAEKRPRMKGRFVKRASLAAAASPLGVNY*
>G1304 (1..978) 

ATGGGGCGATCACCATGTTGCGATGAGAATGGTCTAAAGAAAGGGCCATGGACACAAGAG 
GAGGATGATAAACTGATAGATCACATTCAAAAACATGGCCATGGCAGCTGGAGAGCTCTT
CCAAAGCAAGCCGGTTTAAACCGATGCGGAAAGAGTTGTAGATTAAGATGGACCAACTAC
TTGAGACCTGACATCAAGAGAGGAAATTTCACTGAAGAGGAAGAACAAACTATTATCAAC
CTCCATTCCCTTCTTGGAAACAAGTGGTCGTCGATAGCCGGTAATCTTCCTGGAAGAACG 
GACAATGAAATAAAAAACTATTGGAACACACATTTGAGAAAGA 

ATTGATCCGGTGACCCATAGGCCAAGAACCGACCATCTAAACGTTTTAGCAGCTCTCCCG 
CAGCTTATAGCCGCCGCAAATTTCAACAGCCTCTTC 

GATGCAACAACTCTTGCTAAAGCTCAACTGCTACACACTATGATTCAAGTCCTTAGCACC 
AATAACAAGACCACCAATCCTTCTTTTTCT 

CTCTTTGGCCAAGCTTCTTACTTAGAGAACCAAAATCTTTTTGGTCAGTCTCAAAACTTC
TCTCACATTCTTGAGGATGAGAATTTGATGGTCAAAACCCAAATTATTGATAACCCTTTG
GACTCTTTTTCTTCCCCCATACAACCCGGTTTTCAAGATGATCATAATTCACTCCCTCTA 
TTGGTTCCGGCGTCTCCTGAAGAATCTAAAGAAACTCAAAGGATGATCAAGAACAAAGAC 
ATCGTCGATTACCATCATCATGATGCTTCAAACCCTTCATCATCAAACTCAACGTTTACA 
CAAGATCATCATCACCCATGGTGTGACACTATTGATGATGGAGCAAGTGATTCTTTTTGG 
AAAGAGATAATAGAGTAA 

>G1304 Amino Acid Sequence (conserved domain in AA coordinates: 13-118)

MGRSPCCDENGLKKGPWTQEEDDKLIDHIQKHGHGSWRALPKQAGLNRCGKSCRLRWTNY 

LRPDIKRGNFTEEEEQTIINLHSLLGNKWSSIAGNLPGRTDNEIKNYWNTHLRKKLLQMG

IDPVTHRPRTDHLNVLAALPQLIAAANFNSLLNLNQNVQLDATTLAKAQLLHTMIQVLST

NNNTTNPSFSSSTMQNSNTNLFGQASYLENQNLFGQSQNFSHILEDENLMVKTQIIDNPL 

DSFSSPIQPGFQDDHNSLPLLVPASPEESKETQRMIKNKDIVDYHHHDASNPSSSNSTFT 

QDHHHPWCDTIDDGASDSFWKEIIE* 

>G1318 (7..849)

AAAAATATGAGGAAGCCAGAGGTAGCCATTGCAGCTAGTACTCACCAAGTAAAGAAGATG 
AAGAAGGGACTTTGGTCTCCTGAGGAAGACTCAAAGCTGATGCAATACATGTTAAGCAAT
GGACAAGGATGTTGGAGTGATGTTGCGAAAAACGCAGGACTTCAAAGATGTGGCAAAAGC 
TGCCGTCTTCGTTGGATCAACTATCTTCGTCCTGACCTCAAGCGTGGCGCTTTCTCTCCT 
CAAGAAGAGGATCTCATCATTCGCTTTCATTCCATCCTCGGCAACAGGTGGTCTCAGATT 
GCAGCACGATTGCCTGGTCGGACCGATAACGAGATCAAGAATTTCTGGAACTCAACAATA 
AAGAAAAGGCTAAAGAAGATGTCCGATACCTCCAACTTAATCAACAACTCATCCTCATCA 
CCCAACACAGCAAGCGATTCCTCTTCTAATTCCGCATCTTCTTTGGATATTAAAGACATT 
ATAGGAAGCTTCATGTCCTTACAAGAACAAGGCTTCGTCAACCCTTCCTTGACCCACATA 
CAAACCAACAATCCATTTCCAACGGGAAACATGATCAGCCACCCGTGCAATGACGATTTT 
ACCCCTTATGTAGATGGTATCTATGGAGTAAACGCAGGGGTACAAGGGGAACTCTACTTC 
CCACCTTTGGAATGTGAAGAAGGTGATTGGTACAATGCAAATATAAACAACCACTTAGAC 
GAGTTGAACACTAATGGATCCGGAAACGCACCTGAGGGTATGAGACCAGTGGAAGAATTT 
TGGGACCTTGACCAGTTGATGAACACTGAGGTTCCTTCGTTTTACTTCAACTTCAAACAA 
AGCATATGAATATTTTTACGTCATCTTATTCTTTTTTCTATTGCGGTTTATACTCAAGAT 
TCTTAGCCACACACACATAAATGCAAATATATATACATTGTTAGAGAGTATTTTGTATTT 
CGTATAATCTTTTCGTACTAGGGCTTGAGCCTTGAGGTCCCATGTAACGATTAGTCAATG 
TAAAACATATATCCTATAATAAATAAATAAAAGAAATAATAAGCACATAAAAAAAAAAAA
A 

>G1318 Amino Acid Sequence (domain in AA coordinates: 20-123) 

MRKPEVAIAASTHQVKKMKKGLWSPEEDSKLMQYMLSNGQGCWSDVAKNAGLQRCGKSCR 

LRWINYLRPDLKRGAFSPQEEDLIIRFHSILGNRWSQIAARLPGRTDNEIKNFWNSTIKK

RLKKMSDTSNLINNSSSSPNTASDSSSNSASSLDIKDIIGSFMSLQEQGFVNPSLTHIQT

NNPFPTGNMISHPCNDDFTPYVDGIYGVNAGVQGELYFPPLECEEGDWYNANINNHLDEL

NTNGSGNAPEGMRPVEEFWDLDQLMNTEVPSFYFNFKQSI* 

>G1320 (39..788)

GAAGATCATAMGATCAAAAGGAGAGAGGTATTAAAAAATGATGTGTAGTCGAGGCCATT 
GGAGACCTGCAGAAGACGAGAAGCTAAGAGAACTCGTCGAGCAATTTGGTCCTCATAATT 
GGAACGCCATAGCTCAGAAGCTCTCTGGTCGATCTGGTAAGAGTTGTAGATTGAGATGGT
TTAATCAATTGGATCCTAGGATTAACCGAAACCCTTTCACGGAGGAAGAAGAAGAAAGGC 
TTTTAGCGCCTCATCGGATCCATGGGAACAGATGGTCTGTGATCGCTAGATTTTTTCCCG 
GTCGAACTGATAACGCTGTTAAAAACCATTGGCACGTCATCATGGCTCGTCGTGGCCGAG 
AACGGTCCAAGCTCCGTCCACGAGGCCTTGGCCATGATGGCACGGTGGCTGCGACTGGGA 
TGATTGGTAATTATAAAGACTGCGATAAGGAGAGAAGATTGGCAACCACAACCGCTATCA
ATTTTCCTTATCAATTCTCTCATATTAATCA^ 

GAAAGATCGGGTTCAGAAATAGTACTACTCCAATACAAGAAGGAGC^TAGACaVAACTA 

AACGACCGATGGAGTTCTACAATTTTCTCCAAGTAAACACGGATTCGAAGATACACGAAT 

TGATAGATAATTCAAGATUyVGACGAAGAAGAAGATGTCGATCAAAACAACCGAATTCGTA 

ACGAGAATTGTGTTCCATTTTTCGACTTTTTGTCTGTTGGAAACTCTGCCTCT 

TATGTTAATTTGTCCGTACCAC^TGTACrATAAGGTGGACCATATCTTAACTAMGATAA 

TGTAGAAAGTACTAATCAATTAGAGCTCCTGTTTGAGCCAAATGTGAAAATTAGTTAAGA 

CATCCCAAACATTTTCTTGTATAACACATATAAGGTTGTACTTTTATCAGGTCT 

CTATTTTTATTTTAAGGATGTTTAATCAGACCCATAACCATTCGATAAAAAAAAAAAAAA
>G1320 Amino Acid Sequence (domain in AA coordinates: 5-108)

MMCSRGHWRPAEDEKLRELVEQFGPHNWNAIAQKLSGRSGKSCRLRWFNQLDPRINRNPF 

TEEEEERLLAPHRIHGNRWSVIARFFPGRTDNAVKNHWHVIMARRGRERSKLRPRGLGHD

GTVAATGMIGNYKDCDKERRLATTTAINFPYQFSHINHFQVLKESLTGKIGFRNSTTPIQ 

EGAIDQTKRPMEFYNFLQVNTDSKIHELIDNSRKDEEEDVDQNNRIRNENCVPFFDFLSV 

GNSASQGLC* 

>G1330 (36..959)

GTACCGGCGACCTCTTTGTGGGTCACTCTTCATCAATGGGTGACAAAGGAAGGAGCTTAA 
AGATCAACAAGAACATGGAGGAATTCACGAAAGTGGAAGAAGAAATGGACGTAAGGAGAG
GTCCATGGACAGTTGAGGAAGATTTAGAGCTCATCAATTACATTGCTAGTCATGGTGAAG 
GTCGATGGAACTCTCTCGCTCGTTGCGCCGAACTCAAAAGGACCGGAAAAAGCTGCAGAC 
TTCGGTGGCTGAACTATCTCCGACCAGATGTGCGCCGTGGAAACATAACCCTCGAAGAAC 
AACTCTTGATTCTTGAACTTCACACACGTTGGGGCAATAGATGGTCTAAGATTGCACAAT 
ATTTACCAGGAAGAACGGATAACGAGATCAAAAACTATTGGAGAACACGTGTTCAAAAGC 
ATGCAAAACAGCTTAAATGCGACGTGAACAGTCAACAATTTAAAGACACCATGAAGTATC 
TTTGGATGCCTCGGCTCGTAGAAAGGATCCAAGCCGCGTCCATCGGGTCTGTTTCCATGT 
CATCTTGCGTCACCACCTCCTCAGATCAGTTCGTGATCAACAACAACAACACCAACAACG 
TGGATAATTTGGCTTTAATGAGTAACCCTAATGGTTACATCACGCCGGATAATTCCAGCG
TGGCAGTATCTCCTGTATCAGATTTGACGGAGTGTCAAGTGAGTAGTGAAGTGTGGAAGA 
TTGGTCAGGATGAGAATTTGGTGGATCCAAAAATGACATCGCCGAATTATATGGATAATA 
GCAGTGGACTATTAAACGGAGATTTTACGAAGATGCAAGATCAAAGTGACCTTAATTGGT 
TTGAAAATATTAATGGGATGGTACCAAATTATTCGGACAGTTTTTGGAACATTGGAAATG 
ATGAAGACTTCTGGCTCTTACAACAACATCAACAAGTCCACGACAATGGAAGCTTCTGAA 
TAGACAAGAAGCTATGCGGCC 

>G1330 Amino Acid Sequence (domain in AA coordinates: 28-134)
MGDKGRSLKINKNMEEFTKVEEEMDVRRGPWTVEEDLELINYIASHGEGRWNSLARCAEL
KRTGKSCRLRWLNYLRPDVRRGNITLEEQLLILELHTRWGNRWSKIAQYLPGRTDNEIKN
YWRTRVQKHAKQLKCDVNSQQFKDTMKYLWMPRLVERIQAASIGSVSMSSCVTTSSDQFV
INNNNTNNVDNLALMSNPNGYITPDNSSVAVSPVSDLTECQVSSEVWKIGQDENLVDPKM

TSPNYMDNSSGLLNGDFTKMQDQSDLNWFENINGMVPNYSDSFWNIGNDEDFWLLQQHQQ 

VHDNGSF* 

>G1352 (79..900)

GCGCGATTAAAAACTCTCAACTTTTCTCTCAAATTTCTGATCCTTTGATCCAACAGTTAG 

AAGT^AGATTCATCTGATCATGGCCCTCGAAGCGATGAACACTCCAACTTCTTCTTTCACC 

AGAATCGAAACGAAAGAAGATTTGATGAACGACGCCGTTTTCATTGAGCCGTGGCTTAAA 

CGCAAACGCTCCAAACGTCAGCGTTCTCACAGCCCTTCTTCGTCTTCTTCCTCACCGCCT 

CGATCTCGACCCAAATCCCAGAATCAAGATCTTACGGAAGAAGAGTATCTCGCTCTTTGT 

CTCCTCATGCTCGCTAAAGATCAACCGTCGCAAACGCGATTTCATCAACAGTCGCAATCG 

TTAACGCCGCCGCCAGAATCAAAGAACCTTCCGTACAAGTGTAACGTCTGTGAAAAAGCG 

TTTCCTTCCTATCAGGCTTTAGGCGGTCACAAAGCAAGTCACCGAATCAAACCACCAACC 

GTAATCTCAACAACCGCCGATGATTCAACAGCTCCGACCATCTCCATCGTCGCCGGAGAA

AAACATCCGATTGCTGCCTCCGGAAAGATCCACGAGTGTTC^TCrrGTCATT^AAGTGTTT 

CCGACGGGTCAAGCTTTAGGCGGTCACAAACGTTGTCACTACGAAGGCAACCTCGGCGGC 

GGAGGAGGAGGAGGAAGCAAATCAATCAGTCACAGTGGAAGCGTGTCGAGCACGGTATCG 

GAAGAAAGGAGCCACCGTGGATTCATCGATCTAAACCTACCGGCGTTACCTGAACTCAGC 

CTTCATCACAATCCAATCGTCGACGAAGAGATCTTGAGTCCGTTGACCGGTAAAAAACCG 

CTTTTGTTGACCGATCACGACCAAGTCATCZAAGAAAGT^GATTTATCTTTAAAAATCTAA 

TACTCGACTATTAATTCTTGTGTGATTTTTTTC^ 

TTTTAGTTACAAATTTTTAATTGTTCTGATTTGGATTGAAA 

>G1352 Amino Acid Sequence (domain in AA coordinates: 108-129,167-188) 

MALEAMNTPTSSFTRIETKEDLMNDAVFIEPWLKRKRSKRQRSHSPSSSSSSPPRSRPKS 

QNQDLTEEEYLALCLLMLAKDQPSQTRFHQQSQSLTPPPESKNLPYKCNVCEKAFPSYQA 

LGGHKASHRIKPPTVISTTADDSTAPTISIVAGEKHPIAASGKIHECSICHKVFPTGQAL 

GGHKRCHYEGNLGGGGGGGSKSISHSGSVSSTVSEERSHRGFIDLNLPALPELSLHHNPI 

VDEEILSPLTGKKPLLLTDHDQVIKKEDLSLKI*

>G1354 (1..1047) 

ATGGAAAGTCTCGCACACATTCCTCCCGGTTATCGATTCCATCCGACCGATGAAGAACTC
GTTGACTATTATCTCAAGAACAAAGTTGCATTCCCGGGAATGCAAGTTGATGTTATCAAA
GATGTTGATCTCTACAAAATCGAGCCATGGGACATCCAAGAGTTATGTGGAAGAGGGACA 
GGAGAAGAGAGGGAATGGTATTTCTTTAGCCACAAGGACAAGAAATATCCAACTGGGACA 
CGAACCAATAGAGCAACGGGCTCCGGATTTTGGAAAGCAACGGGTCGAGACAAGGCCATT 
TACTCAAAGCAAGAGCTTGTTGGGATGAGGAAGACTCTTGTCTTTTACAAAGGTAGGGCC
CCAAATGGTCAGAAATCTGATTGGATAATGCACGAATACCGTCTTGAGACCGATGAAAAT 
GGACCGCCTCATGAGGAAGGATGGGTGGTTTGTCGCGCTTTCAAGAAGAAGCTAACCACG 
ATGAACTACAACAATCCAAGAACAATGATGGGATCATCATCAGGCCAAGAATCTAACTGG 
TTCACGCAGCAAATGGATGTGGGGAATGGTAATTACTATCATCTTCCTGATCTAGAGAGT 
CCGAGAATGTTTCAAGGCTCATCATCATCATCACTATCATCATTACATCAGAATGATCAA 
GACCCTTATGGTGTCGTACTCAGCACTATTAACGCAACCCCAACTACAATAATGCAACGA 
GATGATGGTCATGTGATTACCAATGATGATGATCATATGATCATGATGAACACAAGTACT 
GGTGATCATCATCAATCAGGATTACTAGTCAATGATGATCATAATGATCAAGTAATGGAT 
TGGCAAACGCTTGACAAGTTTGTTGCTTCTCAGCTAATCATGAGCCAAGAAGAGGAAGAA 
GTTAACAAAGATCCATCAGATAATTCTTCGAATGAAACATTTCATCATCTCTCTGAAGAG 
CAAGCTGCAACAATGGTTTCGATGAATGCTTCTTCCTCTTCTTCTCCATGTTCCTTCTAC 
TCTTGGGCTCAAAATACACACACGTAA 

>G1354 Amino Acid Sequence (domain in AA coordinates: TBD) 

MESLAHIPPGYRFHPTDEELVDYYLKNKVAFPGMQVDVIKDVDLYKIEPWDIQELCGRGT 

GEEREWYFFSHKDKKYPTGTRTNRATGSGFWKATGRDKAIYSKQELVGMRKTLVFYKGRA

PNGQKSDWIMHEYRLETDENGPPHEEGWVVCRAFKKKLTTMNYNNPRTMMGSSSGQESNW

FTQQMDVGNGNYYHLPDLESPRMFQGSSSSSLSSLHQNDQDPYGVVLSTINATPTTIMQR

DDGHVITNDDDHMIMMNTSTGDHHQSGLLVNDDHNDQVMDWQTLDKFVASQLIMSQEEEE 

VNKDPSDNSSNETFHHLSEEQAATMVSMNASSSSSPCSFYSWAQNTHT* 

>G1360 (1..1257) 

ATGGGAGATAGAAACAACGACGGTGATCAGAAAATGGAGGATGTATTGTTGCCCGGATTT 

AGGTTTCATCCAACCGACGAAGAGCTCGTAAGCTTCTACCTGAAGCGGAAGGTTCAACAC 

AACCCTCTCTCCATTGAGCTCATAAGACAACTCGATATCTACAAATATGACCCCTGGGAT 

CTTCCAAAGTTTGCGATGACGGGTGAAAAAGAATGGTACTTTTATTGTCCAAGGGACAGG 

AAGTATAGGAACAGCTCGAGGCCAAACCGAGTGACCGGAGCTGGTTTTTGGAAAGCCACG 

GGAACGGACCGGCCGATATACTCGTCAGAAGGAAACAAATGCATAGGTTTAAAGAAGTCC

TTAGTGTTCTACAAAGGAAGAGCAGCGAAAGGAGTTAAGACTGATTGGATGATGCATGAG 

TTTCGTTTGCCTTCTCTCTCCGAACCATCTCCTCCTTCTAAGAGATTCTTCGACTCTCCT 

GTCTCTCCCAACGATTCATGGGCTATATGCAGAATCTTCAAAAAGACCAACACAACGACC 

CTAAGAGCTCTCTCTCACTCTTTTGTTTCCTCGTTACCACCAGAAACAAGCACCGACACA 

ATGTCTAACCAAAAGCAATCAAACACATACCATTTTTCTTCAGACAAGATCCTCAAACCT 

AGCTCTCACTTCCAGTTTCACCATGAGAATATGAACACTCCCAAAACTAGTAATAGTACA 

ACTCC^TCCGTTCCCACTATAAGTCCCTTCTCTTACTTGGATTTCACTTCATACGACAAA 

CCCACCAACGTTTTCAATCCGGTTTCATGTTTAGACCAACAATACCTCACAAATCTCTTT 

CTTGCC^CACAAGAAACACAACCTCAGTTTCCCAGGCTCCCCTCGTCAAATGAAATCCCA 

TCGTTTCTGCTAAACACGTCTTC^GATTCGACCTTCTTGGGAGAATTCACGAGCCATATC 

GACCTCAGCGCAGTGTTGGCCCAAGAGCAATGTCCCCCGCTTGTAAGCCTACCACAGGAG 

TATCAAGAGACGGGATTCGAAGGAAATGGTATAATGAAGAACATGCGTGGTTCCAATGAA 

GATCATCTTGGTGATCATTGCGACACACTTCGGTTTGATGATTTCACTTCAACAATTAAT 

GAGAACCATCGTCATCATCAAGACCTGAAACAGAACATGACATTGCTGGAGAGTTATTAT 

TCTTCTTTATCGTCCATCAATAGCGATTTGCCAGCTTGTTTCTCCAGTACAACCTC 

>G1360 Amino Acid Sequence (conserved domain in AA coordinates: 18-174)

MGDRNNDGDQKMEDVLLPGFRFHPTDEELVSFYLKRKVQHNPLSIELIRQLDIYKYDPWD

LPKFAMTGEKEWYFYCPRDRKYRNSSRPNRVTGAGFWKATGTDRPIYSSEGNKCIGLKKS
LVFYKGRAAKGVKTDWMMHEFRLPSLSEPSPPSKRFFDSPVSPNDSWAICRIFKKTNTTT
LRALSHSFVSSLPPETSTDTMSNQKQSNTYHFSSDKILKPSSHFQFHHENMNTPKTSNST
TPSVPTISPFSYLDFTSYDKPTNVFNPVSCLDQQYLT^

SFLLNTSSDSTFLGEFTSHIDLSAVLAQEQCPPLVSLPQEYQETGFEGNGIMKNMRGSNE 
DHLGDHCDTLRFDDFTSTINENHRHHQDLKQNMTLLESYYSSLSSINSDLPACFSSTT*
>G1364 (1..537) 

ATGGCGGAGTCGCAGGCCAAGAGTCCCGGAGGCTGTGGAAGCCATGAGAGTGGTGGAGAT 
CAAAGTCCCAGGTCGTTACATGTTCGTGAGCAAGATAGGTTTCTTCCGATTGCTAACATA
AGCCGTATCATGAAAAGAGGTCTTCCTGCTAATGGGAAAATCGCTAAAGATGCTAAGGAG

ATTGTGCAGGAATGTGTCTCTGAATTCATCAGTTTCGTCACCAGCGAAGCGAGTGATAAA 

TGTCAAAGAGAGAAAAGGAAGACTATTAATGGAGATGATTTGCTTTGGGCAATGGCTACT 

TTAGGATTTGAAGACTACATGGAACCTCTCAAGGTTTACCTGATGAGATATAGAGAGGGT

GACACAAAGGGATCAGCAAAAGGTGGGGATCCAAATGCAAAGAAAGATGGGCAATCAAGC 

CAAAATGGCCAGTTCTCGCAGCTTGCTCACCAAGGTCCTTATGGGAACTCTCAAGTAACT 

TTTCCTCTCTTCTCTTCACACTCAAGCAATACGCATCATTCTCTTCTAATTTGTTAA 

>G1364 Amino Acid Sequence (conserved domain in AA coordinates: 29-120)

MAESQAKSPGGCGSHESGGDQSPRSLHVREQDRFLPIANISRIMKRGLPANGKIAKDAKE 

IVQECVSEFISFVTSEASDKCQREKRKTINGDDLLWAMATLGFEDYMEPLKVYLMRYREG 

DTKGSAKGGDPNAKKDGQSSQNGQFSQLAHQGPYGNSQVTFPLFSSHSSNTHHSLLIC* 

>G1379 (68..622)

CTCTGCCTCTCTCTCTCTCTCAAAACCCATCTCGAAAGTCTTTCTCTTTCGAGGGTTTAG 

ATCCTCCATGGAAGGCGGCGGAGTTGCTGACGTGGCTGTCCCCGGTACGAGGAAGAGAGA 

CAGACCTTACAAAGGAATTAGGATGAGGAAGTGGGGAAAGTGGGTGGCGGAGATTCGTGA 

GCCTAACAAGCGCTCTAGGTTATGGCTTGGCTCTTACTCTACTCCCGAGGCGGCGGCGCG 

AGCTTACGACACGGCGGTTTTCTATCTTAGAGGACCTACGGCGAGGCTTAACTTCCCTGA 

GCTTCTTCCTGGGGAGAAATTCTCCGACGAGGATATGTCGGCTGCGACCATCAGGAAGAA 

AGCCACGGAGGTCGGTGCTCAGGTTGATGCTTTGGGCACGGCGGTGCAAAATAACCGCCA 

CCGTGTTTTTGGTCAGAATCGAGATAGTGATGTGGATAATAAGAATTTTCATCGGAATTA 

TCAAAACGGTGAACGAGAAGAAGAAGAAGAAGATGAGGATGACAAGAGATTGAGGAGTGG 

CGGCCGGTTATTGGATCGGGTTGACTTGAATAAATTACCCGACCCGGAAAGCTCCGATGA 

AGAATGGGAAAGCAAACATTAAAAATATATAGTTTGGAGCGGTGGCTGTTGCTAACGTAC 

GCCAACGGCTTGCTTCTACGAATCATTAGCGCCGTTTATGATTTTTTTTTTTTTTTTTTT 

CATTATCTGAAAATTTAGGGCTTTTTAGTTATTAATTTTTGTTTTGTTTTTTTCCTTTCT 

TGCGAGTTTTGCGGTTTATGGAATTTTAGGCTATTGCTTAACGAAAAAAAAAAAAAAAA 

>G1379 Amino Acid Sequence (domain in AA coordinates: 18-85) 

MEGGGVADVAVPGTRKRDRPYKGIRMRKWGKWVAEIREPNKRSRLWLGSYSTPEAAARAY 

DTAVFYLRGPTARLNFPELLPGEKFSDEDMSAATIRKKATEVGAQVDALGTAVQNNRHRV 

FGQNRDSDVDNKNFHRNYQNGEREEEEEDEDDKRLRSGGRLLDRVDLNKLPDPESSDEEW

ESKH* 

>G1384 (33..977)

GTACATTTTTTTTTGTATTTCAGGAAACTCCGATGGCGGATCTCTTCGGTGGTGGCCACG 

GCGGCGAGCTTATGGAAGCACTTCAACCTTTTTACAAAAGTGCTTCCACGTCTGCTTCAA 

ATCCTGCGTTTGCGTCCTCAAACGATGCGTTTGCGTCTGCCCCAAACGACCTATTTTCTT 

CTTCTTCTTACTATAATCCTCATGCATCTTTATTCCCTTCACATTCCACAACCTCTTACC 

CGGATATTTATTCTGGATCCATGACCTATCCATCTTCATTCGGGTCGGATCTTCAACAAC 

CCGAAAACTACCAATCTCAGTTCCATTACCAAAACACTATCACTTACACTCACCAAGACA 

ACAACACTTGC^TGCTTAACTTCATTGAGCCGAGCCAACCGGGTTTTATGACCCAACCGG 

GTCCGAGTTCGGGTTCGGTTTCAAAACCGGCTAAGCTCTATAGAGGAGTGAGGCAAAGAC 

ATTGGGGAAAATGGGTCGCGGAGATCCGTTTACCCAGGAACCGAACCCGACTTTGGCTCG 

GAACATTCGACACGGCTGAAGAAGCCGCGTTGGCTTATGATCGCGCCGCGTTTAAGCTTC 

GTGGTGACTCGGCTCGGCTTAACTTCCCAGCTCTCCGATACCAAACCGGCTCGTCTCCGT 

CTGATACCGGCGAATATGGTCCTATTGAAGCTGCCGTAGACGCTAAACTAGAAGCCATAT 

TAGCTGAGCCGAAGAATCAGCCGGGCAAAACGGAGAGGACGTCGAGGAAACGAGCTAAAG 

CCGCGGCTTCTTCAGCTGAGCAGCCGTCAGCGCCACAACAACATTCCGGGTCGGGTGAAA 

GTGATGGGTCGGGTTCACCGACTTCGGATGTTATGGTGCAGGAGATGTGCCAAGAGCCAG 

AGATGCCATGGAATGAAAATTTCATGCTCGGGAAGTGTCCTTCTTATGAGATAGATTGGG

CTTCAATTTTATCGTGAAAAATTAGGATTCAATTCA 

TATTTTCTTTTAT^ACTTTAGGGTTATTAGCTGTGCGTAA 

>G1384 Amino Acid Sequence (domain in AA coordinates: TBD) 

MADLFGGGHGGELMEALQPFYKSASTSASNPAFASSNDAFASAPNDLFSSSSYYNPHASL

FPSHSTTSYPDIYSGSMTYPSSFGSDLQQPENYQSQFHYQNTITYTHQDNm , <3ttiNFIEP

SQPGFMTQPGPSSGSVSKPAKLYRGVRQRHWGKWVAEIRLPRNRTRLWLGTFDTAEEAAL

AYDRAAFKLRGDSARLNFPALRYQTGSSPSDTGEYGPIQAAVDAKLEAILAEPKNQPGKT

ERTSRKRAKAAASSAEQPSAPQQHSGSGESDGSGSPTSDVMVQEMCQEPEMPWNENFMLG 

KCPSYEIDWASILS*
>G1399 (261..1475)

AGGTCGAATTTTCTGAAATTAAGATTC^TTCCTCCATGGAAGAAGCTCTGTTTTTATTCT 

CTTTAGCTTAGCTTAGCTTCTACTGATCTGTTTTTGCTACAAAATCCCATCTTTTTCTTT 

AAAACTCTTTATCTCTGAATCTTGAGTTTCTTGTAGAAGAAGAAGCAATTTTGAATCTTT 

CGTAATCATAAAGATTCGTGGAGGATCTCTACTGATTTGTCGGAATCTCTCACTACAGAA 

TCACTTGATCTTATGTCCGGATGGAGGAGAGAGAAGGAACCAACATCAACAACAACATCA 

CTAGCAGTTTCGGCTTGAAGCAGCAACATGAAGCTGCTGCTTCTGATGGTGGTTACTCAA 

TGGACCCACCACCAAGACCCGAAAACCCTAACCCGTTTTTAGTCCCACCCACTACTGTCC 

CCGCGGCCGCCACCGTAGCAGCAGCTGTTACTGAGAATGCGGCTACTCCGTTTAGCTTAA 

CAATGCCGACGGAGAACACTTCAGCTGAGCAGCTGAAAAAGAAGAGAGGTAGGCCGAGAA 

AGTATAATCCCGATGGGACTCTTGTCGTGACTTTATCGCCGATGCCAATCTCGTCCTCTG 

TTCCGTTGACGTCGGAGTTTCCTCCAAGGAAACGAGGAAGAGGACGTGGCAAGTCTAATC 

GATGGCTCAAGAAGTCTCAAATGTTCCAATTCGATAGAAGTCCTGTTGATACCAATTTGG 

CAGGTGTAGGAACTGCTGATTTTGTTGGTGCCAACTTTACACCTCATGTACTGATCGTCA 

ACGCCGGAGAGGATGTGACGATGAAGATAATGACATTCTCTCAACAAGGATCTCGTGCTA 

TCTGCATCCTTTCAGCTAATGGTCCCATCTCCAATGTTACGCTTCGTCAATCTATGACAT 

CCGGTGGTACTCTAACTTATGAGGGTCGTTTTGAGATTCTCTCTTTGACGGGTTCGTTTA 

TGCAAAATGACTCTGGAGGAACTCGAAGTAGAGCTGGTGGTATGAGTGTTTGCCTTGCAG 

GACCAGATGGTCGTGTCTTTGGTGGAGGACTCGCTGGTCTCTTTCTTGCTGCTGGTCCTG 

TCCAGGTAATGGTAGGGACTTTTATAGCTGGTCAAGAGCAGTCACAGCTGGAGCTAGCAA 

AAGAAAGACGGCTAAGATTTGGGGCTCAACCATCTTCTATCTCCTTTAACATATCCGCAG 

AAGAACGGAAGGCGAGATTCGAGAGGCTTAACAAGTCTGTTGCTATTCCTGCACCAACCA 

CTTCATACACGCATGTAAACACAACAAATGCGGTTCACAGTTACTATACAAACTCGGTTA 

ACCATGTCAAGGATCCCTTCTCGTCTATCCCAGTAGGAGGAGGAGGAGGTGGAGAGGTAG 

GAGAAGAAGAGGGTGAAGAAGATGATGATGAATTAGAAGGTGAAGACGAAGAATTCGGAG 

GCGATAGCCAATCTGACAACGAGATTCCGAGCTGATGATGATCATACGGTTTCTTTTCGC 

GGATTTGTTAGGTTTGATGGATTTCAGATTTTGGTTGATTGTTTTTATTAACACAGAATG 

TTTAGAAGCTGCTATCTTTAGGTTCCCATCCTCTTGTGATTGTTGAGTATCCTTGTTAGA 

AACAAACTTACTGTTGCAAAACTCTCTTCATUUUU^GTTTCACTTTGCTTTCCCA 

>G1399 Amino Acid Sequence (domain in AA coordinates: 86-93)

MEEREGTNINNNITSSFGLKQQHEAAASDGGYSMDPPPRPENPNPFLVPPTTVPAAATVA
AAVTENAATPFSLTMPTENTSAEQLKKKRGRPRKYNPDGTLVVTLSPMPISSSVPLTSEF
PPRKRGRGRGKSNRWLKKSQMFQFDRSPVDTNLAGVGTADFVGANFTPHVLIVNAGEDVT 
MKIMTFSQQGSRAICILSANGPISNVTLRQSMTSGGTLTYEGRFEILSLTGSFMQNDSGG 
TRSRAGGMSVCLAGPDGRVFGGGLAGLFLAAGPVQVMVGTFIAGQEQSQLELAKERRLRF 
GAQPSSISFNISAEERKARFERLNKSVAIPAPTTSYTHVNTTNAVHSYYTNSVNHVKDPF
SSIPVGGGGGGEVGEEEGEEDDDELEGEDEEFGGDSQSDNEIPS*
>G1415 (60..680)

CCTTATCACTC^CCAAAAGTCGTCACATAATATCACTTTCGAGTTATCAACATCCGTACA 
TGTCATCCATAGAGCCAAAAGTAATGATGGTTGGTGCTAATAAGAAACAACGAACCGTCC 
AAGCTAGTTCGAGGAAAGGTTGTATGAGAGGAAAAGGTGGACCCGATAACGCGTCTTGCA 
CTTACAAAGGTGTTAGACAACGCACTTGGGGCAAAT^ 

ACCGAGGAGCTCGTCTTTGGCTCGGTACCTTCGACACCTCCCGTGAAGCTGCCTTGGCTT 
ATGACTCCGCAGCTCGTAAGCTCTATGGGCCTGAGGCT 

TAAGAAGTTACCCTAAAACGGCGTCGTCTCCGGCGTCCCAGACTACACCAAGCAGCAACA 

CCGGTGGAAAAAGCAGCAGCGACTCTGAGTCGCCGTGTTCATCCAACGAGATGTCATCAT 

GTGGAAGAGTGACAGAGGAGATATCATGGGAGCATATAAACGTGGATTTGCCGGTAATGG 

ATGATTCTTCAATATCGGAAGAAGCTACAATGTCGTTAGGATTTCCATGGGTTCATGAAG 

GAGATAATGATATTTCTCGGTTTGATACTTGTATTTCCGGTGGCTATTCTAATTGGGATT 

CCTTTCATTCCCCACTTTGAGGTGTCACTAGACTCTCTTTAATTGTTAAGTTATCATATA 

CAAACTACATATATATACAAATATAGTCACCGTGAACTAGGATATATATGTAAATAAACA 

CCAGTTACATGTACTTATATATGTGCACATCTATATATGTGGTTTGTCTGTATAGTGTGA 
AAGCAGATTCTTACCATATCA 

>G1415 Amino Acid Sequence (domain in AA coordinates: TBD) 
MSSIEPKVMMVGANKKQRTVQASSRKGCMRGKGGPDNASCTYKGVRQRTWGK^

NRGARLWLGTFDTSREAALAYDSAARKLYGPEAHLNLPESLRSYPKTASSPASQTTPSSN
TGGKSSSDSESPCSSNEMSSCGRVTEEISWEHINVDLPVMDDSSISEEATMSLGFPWVHE
GDNDISRFDTCISGGYSNWDSFHSPL*
>G1417 (32..1501)

TCTATCTCTATCTATCTCTCTTTGTCTGCAAATGGAAGAACATATTCAAGATCGCCGTGA 
AATTGCGTTCTTACACTCAGGAGAATTTCTCCACGGAGATTCTGACTCAAAGGATCATCA 
ACCGAACGAGTCTCCGGTGGAACGTCATCACGAGTCGTCTATCAAAGAAGTTGATTTCTT 
CGCTGCTAAAAGTCAGCCGTTTGATCTTGGTCATGTGAGAACAACGACGATCGTTGGATC 
ATCTGGTTTTAATGATGGATTAGGTTTGGTAAATTCATGTCATGGAACATCAAGCAATGA 
TGGCGATGACAAAACCAAAACTCAAATTAGTAGACTGAAGTTGGAGCTAGAGAGGCTTCA 
CGAGGAGAATCACAAACTGAAGCATTTATTAGATGAGGTCAGTGAGAGTTACAACGACCT 
CCAAAGAAGAGTTTTGTTAGCAAGACAAACACAAGTGGAAGGTCTTCATCATAAACAACA 
TGAGGATGTACCTCAAGCTGGTTCCTCACAAGCTCTAGAGAACAGAAGACCAAAGGATAT 
GAACCATGAAACTCCGGCCACCACCTTGAAACGACGGTCTCCAGACGACGTGGATGGTCG
TGATATGCACCGAGGATCACCAAAAACTCCTCGAATAGACCAAAACAAGAGTACTAATCA
TGAAGAACAACAAAACCCTCATGATCAATTACCCTATAGAAAAGCTAGGGTTTCCGTTAG 
AGCTAGATCTGATGCCACTACGGTAAATGACGGATGTCAATGGAGAAAATACGGTCAGAA 
AATGGCGAAAGGGAATCCATGTCCTCGCGCTTATTATCGTTGCACCATGGCCGTTGGATG 
TCCTGTCCGTAAACAGGTCCAACGATGCGCGGAGGATACAACTATCTTGACAACAACGTA 
CGAAGGAAACCATAACCATCCTCTTCCCCCGTCAGCCACAGCCATGGCTGCAACCACCTC 
CGCCGCAGCAGCCATGCTCTTATCAGGCTCCTCCTCCAGCAACCTCCACCAAACACTCTC 
TAGCCCCTCCGCCACGTCATCATCATCCTTCTACCATAACTTCCCATACACCTCCACAAT 
CGCAACACTCTCTGCCTCAGCTCCTTTCCCCACCATAACCTTAGACCTCACCAACCCACC 
TCGACCGCTACAACCGCCACCGCAGTTTCTAAGCCAGTATGGTCCCGCCGCGTTTTTACC
AAACGCTAATCAAATTAGGTCTATGAATAATAATAACCAGCAGTTATTAATACCTAATTT 
GTTTGGCCCACAAGCCCCACCACGTGAAATGGTCGATTCAGTTAGGGCTGCGATTGCGAT 
GGATCCGAACTTCACGGCGGCACTTGCGGCCGCGATCTCAAACATTATCGGAGGAGGTAA 
TAACGACAACAATAATAATACTGATATTAATGATAACAAGGTTGATGCAAAAAGTGGAGG 
GAGTAGTAACGGAGATTCGCCACAGCTTCCTCAGTCTTGCACCACTTTCTCTACAAACTA 
ATTTTACTACCATTATTATATGTTATCTTATTATATATTACACACACATATTATACATTA 
TGCGTATCTTAAGTTTTTTTTTGGGGGCCATTATATATGAATGATATGGAGATCACTGAG 
AGAGAGAGAGAGCTATTATGGGTTTTTTTTT 

>G1417 Amino Acid Sequence (domain in AA coordinates: 239-296) 

MEEHIQDRREIAFLHSGEFLHGDSDSKDHQPNESPVERHHESSIKEVDFFAAKSQPFDLG 

HVRTTTIVGSSGFNDGLGLVNSCHGTSSNDGDDKTKTQISRLKLELERLHEENHKLKHLL

DEVSESYNDLQRRVLLARQTQVEGLHHKQHEDVPQAGSSQALENRRPKDMNHETPATTLK

RRSPDDVDGRDMHRGSPKTPRIDQNKSTNHEEQQNPHDQLPYRKARVSVRARSDATTVND

GCQWRKYGQKMAKGNPCPRAYYRCTMAVGCPVRKQVQRCAEDTTILTTTYEGNHNHPLPP

SATAMAATTSAAAAMLLSGSSSSNLHQTLSSPSATSSSSFYHNFPYTSTIATLSASAPFP 

TITLDLTNPPRPLQPPPQFLSQYGPAAFLPNANQIRSMNNNNQQLLIPNLFGPQAPPREM 

VDSVRAAIAMDPNFTAALAAAISNIIGGGNNDNNNNTDINDNKVDAKSGGSSNGDSPQLP

QSCTTFSTN* 

>G1442 (1..1293) 

ATGGGAACAAGAGCAGAACGCAAGGAAGATTTTGTTGGTGGGTTTGGATTTGGTGTTGTA 
GAAAATTCGCATAAAGACGTTATGGTGOTACCTC^TC^TCACTATTATCCATCATATTCA 
TCACCTTCCTCTTCTTCTTTGTGTTACTGTTCT 

GTTTCTAGCAATCAGGCTTACACTTCTTCTCACAGTGGTATGTTCAC^CCCGCCGGTTCT 
GGTTCTGCTGCTGTGACTGTAGCAGATCCTTTTTTCTCCTTGAGCTCTTCAGGGGAAATG 
AGAAGAAGTATGAACGAAGATGCTGGTGCAGCTTTCAGCGAAGCTCAATGGCATGAGCTT 
GAGAGGCAGAGGAATATATACAAGTACATGATGGCTTCTGTTCCTGTTCCTCCAGAGCTT
CTCACACCCTTTCCCAAGAACCACCAATCAAACACTAACCCGGATGTAACTGTGGCAGTG 
GCGACAGGAGGCTCATTGCAGCTGGGGATTGCTTCAAGCGCAAGCAATAACACGGCTGAT 
CTGGAGCCATGGAGGTGCAAGAGAACAGATGGGAAGAAATGGAGGTGCTCTAGAAACGTG 
ATTCCTGATCAGAAATACTGTGAGAGACACACACACAAGAGCCGTCCTCGTTCAAGAAAG 
CATGTGGAATCATCTCACCAATCATCTCACCACAATGACATTCGTACGGCTAAGAATGAT 
ACTAGCCAGCTTGTGAGAACTTATCCTCAGTTTTACGGACAACCTATAAGCCAGATCCCT 
GTGCTTTCTACTCTTCCGTCTGCCTCCTCTCCATATGATCACCACAGAGGACTGAGGTGG 
TTTACGAAAGAAGATGATGCCATTGGAACCTTAAACCCGGAGACTCAAGAAGCTGTCCAG 
CTGAAAGTTGGATCAAGCAGAGAGCTCAAACGGGGATTCGATTATGATCTGAATTTCAGG
CAGAAAGAGCCAATAGTAGACCAGAGCTTTGGAGCATTGCAGGGTCTATTAAGTCTAAAC
CAGACACCACAACATAACCAAGAAACAAGACAGTTTGTTGTAGAAGGAAAGCAAGATGAA 
GCGATGGGAAGCTCTCTGACACTCTCAATGGCTGGAGGAGGCATGGAGGAAACAGAGGGA 
ACAAACCAGCATCAGTGGGTTAGCCATGAAGGTCCATCATGGCTCTATTCAACAACACCA 
GGTGGACCATTGGCTGAAGCACTGTGTCTCGGTGTCTCCAACAACCCAAGTTCTAGTACT 
ACTACTAGTAGCTGCAGCAGAAGCTCAAGCTAA 

>G1442 Amino Acid Sequence (domain in AA coordinates: 172-223) 

MGTRAERKEDFVGGFGFGVVENSHKDVMVLPHHHYYPSYSSPSSSSLCYCSAGVSDPMFS

VSSNQAYTSSHSGMFTPAGSGSAAVTVADPFFSLSSSGEMRRSMNEDAGAAFSEAQWHEL 

ERQRNIYKYMMASVPVPPELLTPFPKNHQSNTNPDVTVAVATGGSLQLGIASSASNNTAD

LEPWRCKRTDGKKWRCSRNVIPDQKYCERHTHKSRPRSRKHVESSHQSSHHNDIRTAKND 

TSQLVRTYPQFYGQPISQIPVLSTLPSASSPYDHHRGLRWFTKEDDAIGTLNPETQEAVQ 

LKVGSSRELKRGFDYDLNFRQKEPIVDQSFGALQGLLSLNQTPQHNQETRQFVVEGKQDE

AMGSSLTLSMAGGGMEETEGTNQHQWVSHEGPSWLYSTTPGGPLAEALCLGVSNNPSSST

TTSSCSRSSS* 

>G1454 (86..1180)

CTAGTAGTGATGATATGATCGCTTCTTCTCCTACAATCTCAGAAACCTCCGATCACGGTT 

TTAGATATCTTCTACAACGGATACAATGGAGAGCACCGATTCTTCCGGTGGTCCACCACC 

GCCACAACCTAACCTTCCTCCAGGCTTCCGGTTTCACCCTACCGACGAAGAGCTTGTTGT 

TCACTACCTCAAACGCAAAGCAGCCTCTGCTCCTTTACCTGTCGCCATCATCGCCGAAGT 

CGATCTCTATAAATTTGATCCATGGGAACTTCCCGCTAAAGCATCGTTTGGAGAACAAGA 

ATGGTACTTCTTTAGTCCACGAGATCGGAAGTATCCAAACGGAGCAAGACCAAACAGAGC 

GGCGACTTCAGGTTATTGGAAAGCGACCGGTACAGATAAACCGGTACTTGCTTCCGACGG 

TAACCAAAAGGTGGGCGTGAAGAAGGCACTAGTCTTCTACAGTGGTAAACCACCAAAAGG 

CGTTAAAAGTGATTGGATCATGCATGAGTATCGTCTCATCGAAAACAAACCAAACAATCG

ACCTCCTGGCTGTGATTTCGGCAACAAAAAAAACTCACTCAGACTTGATGATTGGGTGTT

ATGTAGAATCTACAAGAAGAACAACGCAAGTCGACATGTTGATAACGATAAGGATCATGA 

TATGATCGATTACATTTTCAGGAAGATTCCTCCGTCTTTATCAATGGCGGCTGCTTCTAC 

AGGACTTCACCAACATCATCATAATGTCTCAAGATCAATGAATTTCTTCCCTGGCAAATT

CTCCGGTGGTGGTTACGGGATTTTCTCTGACGGTGGTAACACGAGTATATACGACGGCGG 

TGGCATGATCAACAATATTGGTACTGACTCAGTAGATCACGACAATAACGCTGACGTCGT 

TGGTTTAAATCATGCTTCGTCGTCAGGTCCTATGATGATGGCGAATTTGAAACGAACTCT 

CCCGGTGCCGTATTGGCCTGTAGCAGATGAGGAGCAAGATGCATCTCCGAGCAAACGGTT 

TCACGGTGTAGGAGGAGGAGGAGGAGATTGTTCGAACATGTCTTCCTCCATGATGGAAGA 

GACTCCACCATTGATGCAACAACAAGGTGGTGTGTTAGGAGATGGATTATTCAGAACGAC 

ATCGTACCAATTACCCGGTTTAAATTGGTACTCTTCTTAATCAAATGTGTTTCGCCGCCG 

GTGTGAAGAATTTTCCGGTGACAGTGAAGATTTTTTTCCGATTGGTGGGGTCATTTGCAT 

GGATTATATAATTTGAGATTTGTGTATATGTTTTGGGTTAATTAATTGGTCACAGGGGC 

>G1454 Amino Acid Sequence (conserved domain in AA coordinates: 9-178)

MESTDSSGGPPPPQPNLPPGFRFHPTDEELVVHYLKRKAASAPLPVAIIAEVDLYKFDPW 

ELPAKASFGEQEWYFFSPRDRKYPNGARPNRAATSGYWKATGTDKPVLASDGNQKVGVKK

ALVFYSGKPPKGVKSDWIMHEYRLIENKPNNRPPGCDFGNKKNSLRLDDWVLCRIYKKNN

ASRHVDNDKDHDMIDYIFRKIPPSLSMAAASTGLHQHHHNVSRSMNFFPGKFSGGGYGIF 

SDGGNTSIYDGGGMINNIGTDSVDHDNNADVVGLNHASSSGPMMMANLKRTLPVPYWPVA

DEEQDASPSKRFHGVGGGGGDCSNMSSSMMEETPPLMQQQGGVLGDGLFRTTSYQLPGLN 

WYSS* 

>G1459 (1..1272) 

ATGATGAAAGGTCTGATTGGGTATAGATTTAGTCCGACGGGAGAGGAAGTGATCAACCAT 
TACCTAAAGAACAAACTTCTGGGTAAGTATTGGCTCGTTGATGAAGCTATTAGCGAGATC 
AACATCTTGAGTCACAAACCCAGCAAGGATTTGCCTAAGTTAGCTAGGATCCAATCGGAA 
GATCTTGAATGGTATTTCTTCTCTCCGATTGAGTACACGAACCCGAATAAGATGAAAATG 
AAGAGGACGACAGGTTCTGGGTTTTGGAAACCTACTGGTGTTGATCGGGAAATTAGGGAT 
AAAAGAGGTAATGGTGTTGTGATAGGGATTAAGAAGACGCTTGTGTACCATGAAGGTAAG
AGTCCTCATGGAGTTAGAACTCCTTGGGTTATGCACGAGTATCACATCACTTGCTTGCCT 
CATCATAAGAGGAAATATGTTGTCTGCCAAGTAAAGTATAAGGGTGAAGCTGCAGAAATT 
TCATATGAGCCAAGTCCCTCTTTGGTATCCGATTCGCATACCGTCATAGCGATTACCGGA 
GAACCGGAACCTGAGCTTCAGGTTGAGCAGCGAGGTAAAGAAAATCTCTTGGGTATGTCT
GTAGATGATTTGATAGAACCAATGAACCAACAAGAGGAGCCACAAGGTCCTCACTTAGCT
CCGAATGATGATGAGTTTATACGTGGATTGAGGCATGTTGATCGAGGGACGGTTGAATAT 
TTGTTTGCCAATGAAGAAAACATGGATGGTTTGTCTATGAATGACTTGAGAATCCCAATG 
ATCGTCCAACAAGAGGATCTCTCTGAGTGGGAGGGATTTAACGCAGACACCTTTTTCAGC 
GACAACAACAATAACTATAACCTTAACGTGCATCATCAACTAACGCCTTACGGCGATGGC
TATTTGAATGCATTTTCGGGTTATAACGAAGGGAATCCTCCCGATCACGAATTAGTGATG
CAAGAGAACCGCAACGATCACATGCCAAGGAAACCTGTGACAGGGACCATTGATTATAGC 
AGCGATAGTGGCAGTGATGCTGGATCCATATCTACAACGGTGAAACAAGAAATCCCAAGA 
GCTGTTGATGCACCCATGAACAATGAGTCATCTTTGGTGAAAACAGAGAAGAAAGGCTTG 
TTTATTGTAGAGGACGCAATGGAGAGAAACCGCAAGAAACCACGATTTATCTATCTCATG 
AAGATGATCATAGGCAACATCATATCGGTTTTACTACCCGTCAAAAGATTGATCCCGGTG 
AAGAAGTTATGA 

>G1459 Amino Acid Sequence (conserved domain in AA coordinates: 10-152)

MMKGLIGYRFSPTGEEVINHYLKNKLLGKYWLVDEAISEINILSHKPSKDLPKLARIQSE

DLEWYFFSPIEYTNPNKMKMKRTTGSGFWKPTGVDREIRDKRGNGVVIGIKKTLVYHEGK

SPHGVRTPWVMHEYHITCLPHHKRKYVVCQVKYKGEAAEISYEPSPSLVSDSHTVIAITG

EPEPELQVEQPGKENLLGMSVDDLIEPMNQQEEPQGPHLAPNDDEFIRGLRHVDRGTVEY

LFANEENMDGLSMNDLRIPMIVQQEDLSEWEGFNADTFFSDNNNNYNLNVHHQLTPYGDG

YLNAFSGYWEGNPPDHELVMQENRNDHMPRKPVTGTIDYSSDSGSDAGSISTTVKQEIPR
AVDAPMNNESSLVKTEKKGLFIVEDAMERNRKKPRFIYLMKMIIGNIISVLLPVKRLIPV
KKL* 

>G1460 (87..995)

CGTCGACCTTCACTCAAACCCTAATCCCGGGAACCCGGGAATTTTGATCATTTTGTTTCT 
TTTCGATCTGTTTCTATTTTAAAAAGATGATGAAAGATCCGACTGGGTATAGATTTAGTC 
CGACGGGAGAGGAAGTGATAAACCATTACCTAAAGAACAAAATTCTGGGTAAGACTTGGC 
TCGTTGATGAAGCCATTAGCGAGATCAACATCTTGAATCACAAACCCAGCAAGGATTTGC 
CTAAGTTAGCTAGGATCCAATCGGAAGATCTTGAGTGGTACTTTTTCTCTCCGATTGAGT 
ACACGAACCCGAATAAGATGAAAATGAAGAGGACGACAGGTTCTGGGTTTTGGAAACCTA 
GTGGTGTTGATCGGAAAATTAGGGATAAAAGAGGAAATGGTGTTGTGATAGGGATTAAGA 
AGACGCTTGTGTACCATGAAGGTAAGAGTCCTCATGGAGTTAGAACTCCTTGGGTTATGC 
ACGAGTATCACATCACTTGCTTGCCTCATCATAAGAGGAAATATGTTGTCTGCCAAGTAA
AGTATAAGGGTGAAGCTGCAGAAATTTCATATGAGCCAAGTCCCTCTTTGGTATCCGATT 
CGCATACCGTCATAGCGATTAACGGAGAACCGGAACCTGAGCTTCAGGTTGAGCAGCCAG 
GTAAAGAAAATCTCTTGGGTATGTCTGTAGATGATTTGATAGAACCAATGAACCAACAAG 
AGGAGCCACAAGGTCCTCACTTAGCTCCGAATGATGATGAGTTTATACGTGGATTGAGAC 
ATGTTGATCGAGAGCCGGTTGAATATTTGTTTGCCAATGAAGAAAACATGGATGGTTTGT 
CTATTATGAATGACTTGACAATCCCAATGATCGCCCAACAAGAGGATCTCATTCTCTCTG 
AGTGGGAGGGATTTATCGCAGCCACCTTTTTCAGCGACAACAACAATAACAATAACCTTA 
ACGTGCATCAACTAACGTCTTTCTTACCGGGATGATTATCAGAATGCATTTTGGGTTACA 
ACGGAGCGNCCGCT 

>G1460 Amino Acid Sequence (domain in AA coordinates: TBD)

MMKDPTGYRFSPTGEEVINHYLKNKILGKTWLVDEAISEINILNHKPSKDLPKLARIQSE

DLEWYFFSPIEYTNPNKMKMKRTTGSGFWKPSGVDRKIRDKRGNGVVIGIKKTLVYHEGK

SPHGVRTPWVMHEYHITCLPHHKRKYVVCQVKYKGEAAEISYEPSPSLVSDSHTVIAING

EPEPELQVEQPGKENLLGMSVDDLIEPMNQQEEPQGPHLAPNDDEFIRGLRHVDREPVEY

LFANEENMDGLSIMNDLTIPMIAQQEDLILSEWEGFIAATFFSDNNNNNNLNVHQLTSFL

PG* 

>G147 (37..672)

AAATCATCAGATAGAAGGAAATATTCTGATTGAGAGATGGCTCGTGGAAAGATTCAGCTT 
AAGAGGATTGAGAACCCGGTTCAC^GACAAGTGACTTTTTGCAAGAGGAGAACT 
CTCAAGAAGGCTAAGGAGCTCTCTGTGCTCTGTGATGCCGAGATCGGTGTTGTGATCTTC 
TCTCCr<^GGGCAAGCTCTTTGAGCTCGCTACT 

AAGTACATGAAGTGTACTGGTGGTGGTCGTGGTTCTTCTrCTGCTACTTTTACTGCTC^ 

GAACAACTTCAACCACCAAATCTTGATCCGAAAGATGAGATCAACGTGCTTAAGCAAGAG 

ATTGAGATGCTTCAGAT^GGGATAAGCTATATGTTTGGAGGAGGAGATGGGGCTATGAAT 

CTTGAAGAACTTCTTTTGCTTGAGAAGCATCTTGAGTATTGGATTTCTCAGATTCGCT 

GCTAAGATGGATGTTATGCTTCAAGAAATTCAGTCATTGAGGAACAAGGAAGGAGTCCTC
AAAAACACCAACAAGTATCTCCTCGACAAGATAGAGGAAAACAACAATAGCATATTAGAT
GCTAACTTCGCAGTCATGGAGACAAACTATTCCTATCCGCTAACAATGCCAAGTGAAATA 
TTTCAGTTCTAGACCATAGGGTATTTGAAGACTATGTCTCACGAATTTAAATAACCTTGG 
TAAGTATAATATAGTGTTGTTAAATCACACATAATTAAAATAAAGCCTGTGGAACTTCGC 
TAGGCAGTTGAAAATCTATCCGTATGTTTTATCCTCTTGTTTTACATTTGTTGGTGTGAA 
GATGAAATGACTGCAAGTGTGGTGTGTACTTATAACTCTTTCTACTTTCTATCTATGTTT 
TGAATTTATGGATT 

>G147 Amino Acid Sequence (domain in AA coordinates: 2-57) 

MARGKIQLKRIENPVHRQVTFCKRRTGLLKKAKELSVLCDAEIGVVIFSPQGKLFELATK 

GTMEGMIDKYMKCTGGGRGSSSATFTAQEQLQPPNLDPKDEINVLKQEIEMLQKGISYMF 

GGGDGAMNLEELIjLLEKHLEYWISQIRSAKMDVML^ 

ENNNSILDANFAVMETNYSYPLTMPSEIFQF* 

>G1471 (1..735) 

ATGGAGAACCAATCTATGTCTTCATCAAGCTCCTCCACACACAAACATGATCAAT^AACTC 
AAAAGTTCCGTTGTGGCCATGGAGGTCCTGGAGGAGAAGGAGACAGTGAACAATCCGCCC 
CAGTATTATAATAAGATCTACATCTGTTACTTGTGCAAGAGAGCGTTCCCAACCCCTCAT 
GCCCTTGGCGGTCACGGAACCACCCACAAGGAGGACCGAGAATTGGAGAGGCAACAGATC 
GAGTCAAGGCTTTCTAACAAAGACAAGTCTAACTTGCTCTTTGGTGGGTCTTCACAAGAT 
GTTTTATCAAATGATAATCACCTTGGACTCTCTCTTGGTCCATTGAAGTCCATAGAAGGT 
AGCAGCAGCAGCAACAACGTTAACCCATTGCTTAATGTTGGAGTCCCTAGAGGAACCACA 
GATATGAACATGAACAACTATAGCTCACATGCTTTATCAACTGATGATATTAATCTTGAT 
CTTACTCTTGGTCCATCTAAGTCCATAGGAGATAGCAACAATATCATTAATAACAACACT 
AACTCATCCTTCGATGGGAATCTGATCATTCCCGTTCGTCCTCGTGTGTCTAGATACCAT 
TTTGTTGCTGGGAACCCCCTTGATTCAATCTCTAGAAACATTCCTCCTTCTATTACTTTT 
CCTCATCTAAACATCAATCTTTCTCATGATTCGTTTTCTTTACAAGAGAATGGTTCGGGC 
TCTAGTCACTCATAA 

>G1471 Amino Acid Sequence (domain in AA coordinates: 49-70) 
MENQSMSSSSSSOTKHDQKLKSSWAIYEVL^ 

ALGGHGTTHKEDRELERQQIESRLSNKDKSNLLFGGSSQDVLSNDNHLGLSLGPLKSIEG 
SSSSNNVNPLLNVGVPRGTTDMNMNNYSSHALSTDDINLDLTLGPSKSIGDSNNIINNNT 
NSSFDGNLIIPVRPRVSRYHFVAGNPLDSISRNIPPSITFPHLNINLSHDSFSLQENGSG 
SSHS* 

>G1475 (1..645) 

ATGAAGAGAACACATTTGGCAAGTTTTAGTAACAGAGACAAAACCCAAGAAGAAGAAGGA 

GAAGACGGTAATGGTGACAACAGAGTCATCATGAATCACTACAAGAATTACGAAGCTGGG 

CTGATCCCATGGCCTCCCAAGAATTACACTTGCAGCTTCTGCAGGAGAGAGTTCAGATCT 

GCTCAAGCACTTGGAGGCCACATGAATGTTCATAGAAGAGACAGAGCAAAACTCAGGCAG 

ATCCCTTCTTGGCTCTTCGAACCTCACCACCACACACCTATTGCAAACCCTAACCCTAAT 

TTTAGCTCTTCTTCTTCCTCTTCAACAACAACAGCTCATCTTGAGCCTTCCCTAACCAAC 

CAGAGATCCAAAACCACTCCTTTTCCTTCTGCCCGGTTTGATCTTTTGGACAGTACTACT 

AGCTATGGAGGTTTGATGATGGACAGAGAGAAGAACAAGAGCAATGTATGTAGCAGAGAG 

ATCAAGAAAAGTGCCATCGATGCATGTCATTCAGTAAGATGTGAGATAAGCCGTGGGGAT 

CTGATGAATAAGAAAGATGATCAAGTCATGGGGTTGGAGCTTGGGATGAGTTTGAGGAAT 

CCCAACCAAGTTCTTGATTTGGAGCTTCGACTAGGCTACCTCTAA 

>G1475 Amino Acid Sequence (domain in AA coordinates: 51-73) 

MKRTHLASFSNRDKTQEEEGEDGNGDNRVIMNHYI^^ 

AQALGGHMNVHRRDRAKLRQIPSWLFEPHHHTPIANPNPNFSSSSSSSTTTAHLEPSLTN 
QRSKTTPFPSARFDMjDSTTSYGGLMMDREKNKSNV^ 
LMNKKDDQVMGLELGMSLRNPNQVLDLELRLGYL* 
>G1477 (1..606) 

ATGTTGTCCTCGGACTCGAATTACGCTAGTGATATTAGCGACGATGCCTCCGCCACCGGA 
TCGATAGAGAATCCTATATACAAATGCAAGTATTGTCCTAGGAAGTTCGATAAAACACAA 
GCATTAGGTGGTCATCAAAATGCACACAGAAAGGAGAGAGAGGTCGAAAAACAACAAAAA 
GCATTTTTGGCGC^TTTGAACCGAC^ 

CATCATTCATTTCCTAACCAATACGCACTCCCACCGGGATTTGAACAGCCTCAGTACAAA 
GTTGATAGATCATACAAGATGTCCATGGTCTACAACCAATATGTGGGATCCTCTUVGCTCT 
AGCTTTGC^GGACTACAAAGTGACCCAAGTCAAGGAATGT^CCAGGATTGGACCTTTACC 
GGGATCCCATTCCTACCCCAATCTCAACCTCAACCACTATCGTCACCAATATGTTTGGAT 
CTTTGCCTTGGCATTGGTAGCTCCCAAACCCAACCACAACCTCAAGAACCAAATGATGCA 
ACAGAAGAGATGGATGCTGAGAAAGAAAATGATGGTTCTTCCCTTTCTCTCTCACTCAAA 
CTGTGA 

>G1477 Amino Acid Sequence (domain in AA coordinates: 29-48) 

MLSSDSNYASDISDDASATGSIENPIYKCKYCPRKFDKTQALGGHQNAHRKEREVEKQQK 

AFLAHLNRPEPDLYAYSYSYHHSFPNQYALPPGFEQPQYKVDRSYKMSMVYNQYVGSSSS 

SFAGLQSDPSQGMNQDWTFTGIPFLPQSQPQPLSSPICLDLCLGIGSSQTQPQPQEPNDA 

TEEMDAEKENDGSSLSLSLKL* 

>G1487 (1..1020) 

ATGGAACAAGCCGCGTTGAAGAGCAGCGTCAGGAAAGAGATGGCTCTCAAAACGACTTCT 

CCGGTTTACGAAGAGTTTCTTGCCGTCACCACCGCTCAAAATGGCTTTTCCGTCGACGAT 

TTCTCTGTAGACGACTTGCTTGACTTGTCAAACGATGACGTTTTTGCCGACGAAGAAACT 

GACCTCAAGGCTCAACATGAGATGGTCCGTGTTTCCTCTGAGGAACCCAACGACGACGGA 

GACGCTCTTCGCCGGAGCAGCGATTTCTCCGGCTGTGACGACTTTGGTTCTCTCCCTACA 

AGCGAACTCTCTCTTCCGGCGGATGATTTAGCGAACCTTGAGTGGCTCTCTCATTTCGTG 

GAGGACTCCTTCACGGAATATTCGGGTCCAAACCTCACCGGAACCCCGACTGAGAAACCG 

GCGTGGTTAACGGGTGACCGGAAACATCCTGTGACTGCAGTCACGGAAGAGACCTGTTTC 

AAATCCCCTGTTCCGGCTAAAGCCCGTAGCAAACGTAACCGCAATGGCCTCAAGGTCTGG 

TCGCTTGGTTCGTCGTCCTCCTCGGGTCCTTCCTCGTCCGGTTCGACCTCCTCCTCCTCT 

TCGGGTCCTTCCAGCCCGTGGTTCTCCGGCGCTGAGCTGCTCGAGCCTGTGGTCACGTCA 

GAGAGGCCACCGTTTCCCAAGAAGCATAAGAAAAGGTCAGCCGAGTCTGTTTTCTCCGGT 

GAGCTGCAGCAGCTGCAACCTCAGCGAAAGTGCAGCCACTGCGGCGTTCAGAAAACTCCG 

CAGTGGAGAGCCGGGCCAATGGGAGCCAAGACCCTGTGCAATGCGTGCGGTGTCCGGTAC 

AAGTCGGGTAGGTTGCTACCGGAATACAGACCCGCTTGTAGCCCGACATTCTCGAGTGAG 

CTGCACTCGAACCACCACCGGAAAGTCATAGAGATGAGGCGGAAGAAGGAGCCAACCAGT 

GACAACGAAACCGGTTTAAACCAGCTGG1TCAGTCCC(^CAlAGCTGTACCAAGTTTTTGA 

>G1487 Amino Acid Sequence (domain in AA coordinates: 251-276) 

MEQAALKSSVRKEMALKTTSPVYEEFLAVTTAQNGFSVDDFSVDDLLDLSNDDVFADEET 

DLKAQHEMVRVSSEEPNDDGDALRRSSDFSGCDDFGSLPTSELSLPADDLANLEWLSHFV 

EDSFTEYSGPNLTGTPTEKPAWLTGDRKHPVTAVTEETCFKSPVPAKARSKRNRNGLKVW 

SLGSSSSSGPSSSGSTSSSSSGPSSPWFSGAELLEPVVTSERPPFPKKHKKRSAESVFSG 

ELQQLQPQRKCSHCGVQKTPQWRAGPMGAKTLCNACGVRYKSGRLLPEYRPACSPTFSSE 

LHSNHHRKVIEMRRKKEPTSDNETGLNQLVQSPQAVPSF* 

>G1492 (149..919) 

AATCCCAACCCACACACCTCTCAAATCCTCCTCTCCTCGTTTCTCTTTCTCTCCTCTTCA 
CAGAACCAAAACATATCAAACCTTTTTTTCTCTTC 

TGTCGGTTTTTAGGGTTCTTGAAACGATATGGGTAAGTCTAGTGGTAGAAATGGTAACGG 
AAGCTTTAACGGCAATAAATTTCACGGAGTTAGACCTTACGTACGGTCTCCAGTTCCACG 
GCTTAGATGGACGCCGGATCTTCACCGTTGTTTCGTTCACGCCGTCGAGATTCTCGGTGG 
TCAACACCGAGCAACACCAAAACTTGTTCTTAAGATGATGGATGTGAAGGGACTTACCAT 
TTGACATGTCAAAAGCCACCTT(^GATGTATAGAGGAGGT^ 

ACCAGAAGA^GCTCATCATCTTCAATAAGAAGAAGACAAGACAGTGAAGAAGATTATTA 
TCTTCATGACAACTTGTCTTTACACACAAGGAATGATTGTCTTTTGGGTTTTCACTCTTT 
TCCTCTTTCTTCACATTCnTC^TTTAGAGGAGGAGGAGGAGGAAGAACAAAAGAGCAGCA 
GACTTCAGAGTCTGGTGGTTATGATGATGATGCTGACTTTCTTCACATCAAGAAGATGAA 
CGATACGACGACGTTTTTGTCACATCATTTCCCCAAGGGAACAGAGGAGTGGCGGGAACA 
AGAACACGAAGAAG^GAAGAAGATTTGTCGTTGTCTCTGTCGTTAAATCATCATCATTG 
GAGAAGCT^ATGGATCATCGGTGGTGAGCGAAACGAGTGAAGCAGCAGTCTCGACTTGTTC 
AGCACCATTCGTATCC!AAAGATTGCTTTGGTTC^ 

TTCTCTCCTCGGTAGCTAAATAAGTTATGCAAGATTTAGGTTCAGAGAAACTATTCGGAT 

GTGTTTTTGAAACTAGGATATTGAATGTTAGTAGAGAAACCTAGAAAATGAAGTTTAGAT 

AAATTATCAACGCAGCGTTTTGATCGCCTTTGAACGGAAAATTAACAAA 

>G1492 Amino Acid Sequence (domain in AA coordinates: 34-83) 

MGKSSGRNGNGSFNGNKFHGVRPYVRSPVPRLRWTPDLHRCFVHAVEILGGQHRATPKLV 

LKMMDVKGLTISHVKSHLQMYRGGSKLTLEKPEESSSSSIRRRQDSEEDYYLHDNLSLHT 

RNDCLLGFHSFPLSSHSSFRGGGGGRTKEQQTSESGGYDDDADFLHIKKMNDTTTFLSHH 
FPKGTEEWREQEHEEEEEDLSLSLSLNHHHWRSNGSSVVSETSEAAVSTCSAPFVSKDCF 
GSSKIDLNLSISLLGS* 

>G1531 (1..666) 

ATGTGTGAGTCAAGCAACAAAGTCAGAGTATCGCCATACCCGCTTCGGTCTTCGAGGACC 

GACAAACACAAGGCGTCAGAGTCGCCTATTGAGACAGGTTGGGAGGATGTGCGTGGATGT 

CATCCTTACATGTGCGATACGAGTGTTCGTCACTCCAATTGTTTC7VAGCAGTTCCGCAGA 

AAAACCATAAAAAAGCGCCTATACCCCAAGACCTTACATTGTCCTCTCTGTAGAGGTGAA 

GTATCCGAGACGACAAAGGTGACGAGCACTGCT^AGAAGATTTATGAATGCTAAACCGAGG 

TCTTGCTCCGTAGAGGATTGCAAATTCTCTGGGACGTTTTCTCAGCTTACTAAGCACTTG 

AAAACTGAGCATCGCGGTATTGTGCCACCAAAGGTCGATCCACTGAGACAACAGAGATGG 

GAAATGATGGAGAGACATTCTGAATACGTTGAACTCATGACTGCAGCTGGGATTTCGCGT 

ATGGCTGAGGTGATGCAACAACAGCTTCCCCAGGATCAGAATCATCCTCATGTGTTTCAA 

GTGACCGTTAATGGAACCATATGGAATCTAATTGATCCGAGTCAGGGAAGGAATGGATTA 

GGCATCACCAACTATAGCGCAATGCAGTTTGTACCATTAAGCATAAATCACAGTAGAACT 
CTGTGA 

>G1531 Amino Acid Sequence (domain in AA coordinates: 41-77) 

MCESSNKVRVSPYPLRSSRTDKHKASESPIETGWEDVRGCHPYMCDTSVRHSNCFKQFRR 

KTIKKRLYPKTLHCPLCRGEVSETTKVTSTARRFMNAKPRSCSVEDCKFSGTFSQLTKHL 

KTEHRGIVPPKVDPLRQQRWEMMERHSEYVELMTAAGISRMAEVMQQQLPQDQNHPHVFQ 

VTVNGTIWNLIDPSQGRNGLGITNYSAMQFVPLSINHSRTL* 

>G1540 (122..997) 

atctctttactaccagcaagttgttttcttgctaacttcaaacttctctttctcttgttc 
ctctctaagtcttgatcttatttaccgttaactttgtgaacaaaagtcgaatcaaacaca 
catggagccgccacagcatcagcatcatcatcatcaagccgaccaagaaagcggcaacaa 
caacaacaagtccggctctggtggttacacgtgtcgccagaccagcacgaggtggacacc 
gacgacggagcaaatcaaaatcctcaaagaactttactacaacaatgcaatccggtcacc 
aacagccgatcagatccagaagatcactgcaaggctgagacagttcggaaagattgaggg 
caagaacgtcttttactggttccagaaccataaggctcgtgagcgtcagaagaagagatt 
caacggaacaaacatgaccacaccatcttcatcacccaactcggttatgatggcggctaa 
cgatcattatcatcctctacttcaccatcatcacggtgttcccatgcagagacctgctaa 
ttccgtcaacgttaaacttaaccaagaccatcatctctatcatcataacaagccatatcc 
cagcttcaataacgggaatttaaatcatgcaagctcaggtactgaatgtggtgttgttaa 
tgcttctaatggctacatgagtagccatgtctatggatctatggaacaagactgttctat 
gaattacaacaacgtaggtggaggatgggcaaacatggatcatcattactcatctgcacc 
ttacaacttcttcgatagagcaaagcctctgtttggtctagaaggtcatcaagacgaaga 
agaatgtggtggcgatgcttatctggaacatcgacgtacgcttcctctcttccctatgca 
cggtgaagatcacatcaacggtggtagtggtgccatctggaagtatggccaatcggaagt 
tcgcccttgcgcttctcttgagctacgtctgaactagctcttacgccggtgtcgctcggg 
attaaagctctttcctctctctctctctttcgtactcgtatgttcacaactatgcttcgc 
tagtgattaatgatgcagttgttatattagtagttaactagttatctctcgttatgtgta 

atttgtaattactagctaagtatcgtctaggtttaattgtaattgacaaccgtttatctc 
tatgatgaataagttaaatttatatat 

>G1540 Amino Acid Sequence (domain in AA coordinates: 35-98) 

MEPPQHQHHHHQADQESGNN^KSGSGGYTCRQTSTRWTPTTEQIKILKELYYNNAIRSP 
TADQIQKITARLRQFGKIEGKNVFYWFQNHKARERQKra^ 
DHYHPLLHHHHGVPMQRPANSVNVKLNQDHHLY 
ASNGYMSSHVYGSMEQDCSMNYNNVGGGWA^^ 

ECGGDAYLEHRRTLPLFPMHGEDHINGGSGAIWKYGQSEVRPCASLELRLN* 
>G1544 (1..2178) 

ATGTCTCAGTCAAACATGGTACCAGTGGCTAACAACGGAGACAACAACAACGAC^ 

AACAAGAACAACAACAACAACAATGGTGGAACTGACAACACTAATGCTGGAAATC 

GGAGATCAAGATTTCGACAGTGGGAATACCTCAAGTGGCAATCATGGAGAAGGGTTGGGA 

AACAATCAAGCTCCTCGTCATAAGAAG7\AAAAATACAATCGTCACACCCAACTTCAGATT 

TCGGAGATGGAAGCTTTCTTCAGAGAGTGTCCTCACCCAGATGACAAACAAAGGTACGAC 

CTTAGCGCTCAATTGGGATTGGACCCTGTTCAGATCAAATTCTGGTTCCAGAACA^ 

ACTCAAAACAAGAATCAACAAGAACGCTTTGAGAACTCAGAACTTCGGAATCTGAACAAC 

C^CCTTAGGTCTGTU^TCAGCGGTTACGAGAAGCTATTCATC^GCCTTATGCCCTAAG 
TGTGGAGGCCAAACTGCAATTGGCGAAATGACCTTCGAAGAGCACCATCTTCGCATCCTC 

AACGCTCGTTTGACTGAAGAGATCAAGCAACTTTCCGTGACAGCGGAAAAGATATCMGG 

CTTACGGGGATACCAGTAAGGAGCCATCCCCGTGTGTCTCCTCCTAATCCTCCTCCAAAT 

TTCGAGTTCGGGATGGGATCTAAGGGAAATGTCGGAAACCACTCGAGGGAAACCACTGGA 

CCTGCAGATGCTAATACCAAGCCGATCATCATGGAGTTGGCATTTGGAGCCATGGAGGAG 

CTCTTGGTGATGGCTCAAGTGGCTGAACCACTGTGGATGGGAGGATTTAATGGCACTAGC 

TTAGCTTTGAACTTGGATGAATACGAAAAGACGTTTCGCACGGGTCTCGGTCCTAGACTT 

GGCGGGTTTCGAACCGAGGCATCCAGGGAAACTGCACTCGTGGCAATGTGTCCTACTGGC 

ATTGTTG7^AATGCTCATGCAAGAGAATCTGTGGTCAACAATGTTTGCCGGAATTGTTGGT 

AGAGCCAGGACTCATGAACAGATAATGGCTGATGCTGCTGGAAACTTCAATGGAAATCTC 

CAAATAATGAGTGCTGAGTACCAAGTGCTTTCCCCGCTAGTCACAACCCGCGAAAGCTAC 

TTCGTCCGCTACTGTAAGCAACAAGGAGAGGGTTTGTGGGCGGTGGTCGATATTTCCATC 

GACCATCTCCTCCCAAACATCAACCTAAAATGTCGCCGCCGACCCTCTGGATGTCTGATT 

CAAGAAATGCATAGTGGTTACTCCAAGGTTACATGGGTGGAACATGTGGAAGTAGATGAT 

GCAGGAAGTTACAGCATCTTTGAGAAATTAATCTGTACTGGTCAAGCTTTTGCTGCTAAC 

CGCTGGGTTGGTACATTGGTACGCCAGTGTGAGCGGATATCTAGCATCTTGTCGACAGAT 

TTTCAATCTGTCGATTCCGGTGATCACATAACGCTAACTAACCATGGAAAGATGAGCATG 

CTGAAGATAGCTGAGCGGATTGCGAGAACCTTCTTTGCTGGAATGACCAATGCGACGGGG 

TCTACAATATTTTCTGGTGTTGAAGGAGAAGATATCAGAGTGATGACAATGAAGAGCGTG 

AATGATCCAGGAAAGCCTCCCGGTGTCATTATTTGTGCAGCCACTTCCTTTTGGCTTCCT 

GCTCCTCCTAACACTGTCTTTGACTTCCTCAGAGAGGCTACTCACCGACACAATTGGGAT 

GTTCTCTGCAACGGAGAGATGATGCACAAGATAGCAGAGATTACGAATGGGATAGACAAA 

AGGAACTGTGCAAGTTTACTCCGGCATGGACACACTAGCAAGAGCAAGATGATGATAGTT 

CAAGAGACTTCTACTGACCCAACAGCTTCATTTGTGCTTTATGCGCCTGTTGATATGACA 

TCAATGGATATTACTCTCCATGGAGGTGGTGATCCTGACTTTGTGGTGATCCTGCCTTCT 

GGTTTTGCTATTTTTCCAGATGGTACGGGTAAGCCTGGAGGAAAAGAAGGAGGATCACTT 

TTGACCATTTCCTTCCAAATGCTGGTTGAGTCAGGTCCTGAGGCTAGGCTGAGTGTTAGC 

TCTGTTGCAACTACTGAGAATCTGATTCGTACAACCGTGCGGAGGATCAAAGATTTGTTT 

CCTTGTCAGACTGCTTGA 

>G1544 Amino Acid Sequence (domain in AA coordinates: 64-124) 
MSQSNMVPVA1TOGDNNNDM 

NNQAPRHKKKKYNRHTQLQISEMEAFFRECPHPDDKQRYDLSAQLGLDPVQIKFWFQNKR 
TQNKNQQERFENSELRNLNNHLRSENQRLREAIHQALCPKCGGQTAIGEMTFEEHHLRIL 
NARLTEEIKQLSVTAEKISRLTGIPVRSHPRVSPPNPPPNFEFGMGSKGNVGNHSRETTG 
PADANTKPIIMELAFGAMEELLVMAQVAEPLWMGGFNGT^ 

GGFRTEASRETALVAMCPTGIVEMLMQENLWSTMFAGIVGRARTHEQIMADAAGNFNGNL 
QIMSAEYQVLSPLVTTRESYFVRYCKQQGEGLWAVVDISIDHLLPNINLKCRRRPSGCLI 
QEMHSGYSKVTWVEHVEVDDAGSYSIFEKLICTGQAFAANRWVGTLVRQCERISSILSTD 
FQSVDSGDHITLTNHGKMSMLKIAERIARTFFAGMTNATGSTIFSGVEGEDIRVMTMKSV 
NDPGKPPGVIICAATSFWLPAPPNTVFDFLREATHRHNWDVLCNGEMMHKIAEITNGIDK 
RNCASLLRHGHTSKSKMMIVQETSTDPTASFVLYAPVDMTSMDITLHGGGDPDFVVILPS 
GFAIFPDGTGKPGGKEGGSLLTISFQMLVESGPEARLSVSSVATTENLIRTTVRRIKDLF 
PCQTA* 

>G156 (39..755) 

AGGAAGAGGGAGCCACTCATAAGAGGAAGAAGAGAGAGATGGGTAGAGGGAAGATAGAGA 

TAAAGAAGATAGAGAATCAGACGGCGAGGCAAGTGACCTTCTCCAAGAGAAGAACTGGTC 

TTATT^GAAGACTCGTGAGCTCTCTATTCTCTGTGACGCTCACATCGGTCTCATCGTCT 

TCTCAGCCACCGGAAAGCTTTCCGAGTTCTGCTCCGAACAGAACAGGATGCCTCAACTCA 

TTGACCGATACITGCATACCAACGGATTGCGACnTCCTGATCATCATGACGACCAGGAG^ 

AATTGCACCATGAGATGGAACTACTAAGAAGAGAGACATGTAACCTTGAGCra 

GTCCATTCCATGGACATGACTTAGCCTCCATTCCTCCTAATGAGCTTGACGGACTCGAGA 

GACAGCTAGAACATTCTGTCCTCAAAGTCCGTGAGCGTAAGAGGAGGATGCTAGAAGAAG 

ATAACAACAACATGTACCGTTGGCTTCATC^^ 

CTGGGATAGATACCAAACCAGGGGAGTATCAACAGTTTATAGAGCAGCTTCAGTGCTATA 
AACCAGGGGAGTATCAGCAGTTTCTAGAGCAGCAGCAACAACAACCAAACAGCGTTCTTC 
AGCTTGCTACACTTCCTTCTGAGATTGATCCTACTTACAATCTCCAGCTTGCTCAGCCTA 
ATCTTCAAAACGATCCAACGGCCCAGAATGATTAATACAATTCTCAATAGATATCTACTC 
TTTCTTTATGGAGACAGATTCATGAACTTTTATTACCTATATTTTGATAAGCCAGTGTCT 

TCTTTTGTGTGGCTATGGAAACCTTGTTTAAAGCACAATGCACTTGAGTTCTTGGTTATA 

TAATTAATCATCATTATTACATANHAAANAANHAAAAAAAAAAAAAA 

>G156 Amino Acid Sequence (domain in AA coordinates: 2-57) 

MGRGKIEIKKIENQTARQVTFSKRRTGLIKKTRELSILCDAHIGLIVFSATGKLSEFCSE 
QNRMPQLIDRYLHTNGLRLPDHHDDQEQLHHEMELLRRETCNLELRLRPFHGHDLASIPP 
NELDGLERQLEHSVLKVRERKRRMLEEDNNNMYRWLHEHRAAMEFQQAGIDTKPGEYQQF 

ATTCACATTTTTATTTATCTTTCCATTTAGCCATTCTGTTCCCTGTCTCTTCCTCCTCTC 

TTTTTGACACATCACATCATCATCACATCATCATTCAACATCAATCATCATCATATGCAT 

ACACATACATCTGTGTTCTGCGGATCGAGTTAATTAGTTATGGCTTCTTCGAATAGACAC 

TGGCCAAGCATGTTCAAGTCCAAACCTCATCCCCATCAATGGCAACATGACATCAACTCT 

CCTCTCTTGCCTTCTGCTTCTCACCGATCTTCTCCTTTCTCTTCAGGATGTGAAGTGGAG 

AGGAGTCCAGAGCCAAAACCAAGATGGAATCCAAAGCCAGAGCAGATTCGGATACTTGAA 

GCAATCTTTAACTCCGGGATGGTGAATCCTCCAAGAGAGGAGATCAGGCTTCAAGAATAC 

GGCCAAGTCGGTGATGCTAACGTCTTCTACTGGTTCCAAARCCGTAAGTCCCGTAGTAAA 

CACAAACTCCGCCTCCTCCACAACCACTCCAAACACTCTCTCCCTCAAACGCAACCGCAG 

CCGCAGCCGCAACCTTCGGCTTCCTCTTCCTCTTCCTCCTCCTCTTCCTCCTCCAAATCC 

ACCAAACCCCGAAAAAGCAAGAACAAGAACAACACTAATCTCTCTTTGGGTGGTAGTCAA 

ATGATGGGGATGTTTCCACCGGAACCGGCGTTTCTCTTCCCGGTCTCCACTGTCGGAGGG 

TTTGAAGGTATCACCGTCTCATCCCAATTAGGGTTTCTCTCCGGTGATATGATTGAGCAA 

CAAAAACCGGCTCCAACGTGTACCGGACTCCTGCTGAGTGAGATCATGAACGGTAGTGTG 

AGTTATGGAACTCATCATCAACAACACTTGAGTGAGAAAGAAGTTGAAGAAATGAGGATG 

AAGATGTTGCAACAGCCACAGACTCAGATTTGTTACGCTACCACTAATCATCAAATAGCT 

TCTTACAACAACAACAACAACAACAATAACATCATGCTTCATATTCCTCCCACTACTTCT 

ACTGCCACCACTATTACTACTTCGCATTCTCTCGCTACTGTCCCATCAACTTCGGACCAG 

CTTCAAGTTCAAGCGGACGCACGAATAAGAGTTTTCATCAATGAAATGGAGCTTGAAGTG 

AGCTCAGGACCGTTCAATGTGAGGGATGCATTTGGGGAAGAGGTTGTTCTGATTAATTCC 

GCGGGTC^GCCC^TTGTCACCGATGAATATGGCGTCGCTCTTCACCCTCTTCAACACGGA 

GCCTCGTACTATCTGATCTAGTCGTGTGGGAGATTTGAGTTTGAAGAAGAAATTAAGACC 

TGTCTCTTTCTTTCACCATCTCTCGTACGTAGGCTTAAATGTTAAGATTTTATAAAGTAT 

TGGTTTCAGTTACCTGTTGTGACGGTGTTTATGTATGAGTTTCGGACAACATTCACAAAA 

CTCTCTCGTTAAATTGTTGACCAATAATATATGATGTGTGTTTCATTATTATCTAAAAAA 

>G1584 Amino Acid Sequence (domain in AA coordinates: TBD) 

MASSHRHWPSMFKSKPHPHQWQHDINSPLLPSASHRSSPFSSGCEVERSPEPKPRWNPKP 

EQIRILEAIFNSGMVNPPREEIRLQEYGQVGDANVFYWFQNRKSRSKHKLRLLHNHSKHS 

LPQTQPQPQPQPSASSSSSSSSSSSKSTKPRKSKNKNNTNLSLGGSQMMGMFPPEPAFLF 

PVSTVGGFEGITVSSQLGFLSGDMIEQQKPAPTCTGLLLSEIMNGSVSYGTHHQQHLSEK 

EVEEMRMKMLQQPQTQICYATTNHQIASYNNNNNNNNIMLHIPPTTSTATTITTSHSLAT 

>G1587 (1..816) 

ATGGGCTACRTCrrCCAACAACAACCTCATCAACTATTTGCCCCTCTCTACTACTCAACCT 
CCTCTTCTTCTCACCCACTGTGATATTAACGGCAATGATCACCATCAGCTCATAACCGCA 
TCATCAGGAGAACACGATATTGATGAACGGAAAAACAACATTCCTGCGGCGGCGACTTTG 
AGATGGAATCCGACGCCAGAGCAGATCACGACGCTAGAAGAGCTTTACAGAAGCGGAACA 
CGGACGCCGACGACGGAACAGATCCAACAGATAGCATCTAAGCTTCGTAAATATGGGAGA 
ATCGAAGGGAAGAACGTTTTCTATTGGTTTCAGAATCATAAGGCTAGAGAGAGACTAAAA 
CGCCGCCGTCGTGAAGGTGGTGCTATTATCAAACCACATAAAGACGTCAAGGATTCATCA 
TCAGGTGGTCATCGAGTTGATCAGACAAAGCTCTGCCCATCTTTTCCaCACACAAACCGA 
CCACAGCCACAGCATGAATTAGATCCTGCGAGTTACAATAAAGACAACAATGCTAATAAT 
GAAGATCATGGGACGACTGAAGAATCTGATCAGAGGGCATCAGAGGTTGGTAAATACGCC 
^^ AG ^ TOTGTOACTOGGTCGAT ^ CT ^ C ^ CCGG ^^ GAOT AATATCGAC 
GAAAATGTCAACGGAGAAGAAGAAGAAACGAGGGACAACCGGACTTTAAATCTCTTTCCG 
GTTAGGGAGTACCAAGAGAAAACAGGCCGGTTGATAGAGAAGACGAAAGCATGCAACTAC 
TGTTACTACTACGAGTTCATGCCTCTGAAGAACTGA 

>G1587 Amino Acid Sequence (conserved domain in AA coordinates: 61-121) 

MGYISI^LINYLPLSTTQPPLLLTHCDINGNBHHQLITASSGEHDIDERKNNIPAAATL 

RWNPTPEQITTLEELYRSGTRTPTTEQIQQIASKLRKYGRIEGKNVFYWFQNHKARERLK 

RRRREGGAIIKPHKDVKDSSSGGHRVDQTKLCPSFPHTNRPQPQHELDPASYNKDNNANN 

EDHGTTEESDQRASEVGKYATWRNLVTWSITQQPEEINIDENVNGEEEETRDNRTLNLFP 

VREYQEKTGRLIEKTKACNYCYYYEFMPLKN* 

>G1588 (1..2232) 

ATGTACCATCCAAACATGTTTGAGAGCCATCATATGTTCGATATGACCCCAAAGAGTACC 

TCTGATAACGACTTGGGAATCACCGGTAGCCGAGAAGATGACTTTGAGACCAAGTCAGGT 

ACCGAAGTCACTACTGAGAATCCTTCTGGTGAAGAGCTTCAAGATCCTAGCCAACGTCCC 

AACAAAAAGAAGCGTTACCATCGCCACACGCAACGCCAAATTCAAGAGCTCGAATCATTC 

TTTAAGGAATGTCCTCATCCAGATGATAAGCAACGAAAAGAGTTGAGCCGTGATCTCAAT 

TTAGAGCCTCTTCAAGTTAAGTTTTGGTTCCAAAACAAACGCACACAGATGAAGGCACAA 

AGTGAGAGGCATGAGAACCAGATTCTAAAGTCAGACAATGACAAGCTCAGAGCAGAGAAC 

AATAGATACAAAGAAGCTCTAAGCAATGCTACATGCCCTAACTGTGGCGGTCCAGCTGCT 

ATTGGAGAAATGTCTTTTGACGMCAACATCTCAGGATCGAAAATGCTCGGCTCCGCGAA 

GAGATTGATAGGATCTCTGCTATTGCTGCGAAATACGTTGGGAAGCCGTTAGGATCGTCT 

TTCGCTCCACTAGCGATCCACGCGCCTTCTCGTTCGCTTGATCTTGAAGTTGGAAACTTT 

GGGAACCAGACAGGCTTTGTAGGAGAAATGTATGGAACAGGGGACATTTTGAGGTCAGTT 

TCGATTCCTTCTGAGACTGATAAGCCTATAATCGTGGAGCTAGCGGTTGCAGCTATGGAG 

GAACTCGTGAGAATGGCTCAAACTGGAGATCCTTTATGGCTTTCAACCGATAATTCAGTC 

GAGATTCTCAACGAAGAAGAGTATTTCAGAACGTTTCCGAGAGGAATTGGACCAAAGCCA 

TTAGGATTAAGATCAGAGGCGTCAAGACAATCTGCAGTTGTTATAATGAATCACATCAAT 

CTCGTTGAGATTCTCATGGATGTGAATCAATGGTCTTGTGTTTTCTCTGGGATTGTGTCA 

AGAGCCTTGACACTTGAAGTTCTTTCAACTGGAGTTGCTGGGAACTACAACGGTGCTTTA 

CAAGTGATGACAGCTGAGTTTCAAGTTCCATCACCCCTAGTCCCAACGCGTGAGAACTAC 

TTTGTGAGATACTGCAAACAACACAGTGACGGCTCTTGGGCTGTGGTTGATGTCTCTTTG 

GACAGCCTTAGACCAAGTACTCCAATCTTAAGAACTAGAAGAAGGCCTTCAGGTTGTCTG 

ATTCAAGAATTGCCTAATGGTTATTCTAAGGTTACATGGATAGAGCATATGGAGGTAGAT 

GATAGATCAGTTCACAACATGTATAAACCGTTGGTTCAGTCCGGTTTAGCTTTCGGTGCG 

AAACGTTGGGTGGCTACACTCGAACGACAATGCGAGCGGCTTGCTAGCTCCATGGCCAGC 

AACATTCCTGGTGATCTTTCCGTGATAACGAGTCCTGAAGGAAGGAAGAGTATGTTGAAG 

CTAGCTGAGAGAATGGTTATGAGTTTCTGCAGTGGTGTTGGCGCGTCGACTGCACACGCT 

TGGACAACAATGTCGACAACAGGATCCGATGATGTTCGGGTCATGACCCGCAAGAGTATG 

GATGATCCAGGAAGACCTCCGGGTATTGTTCTTAGTGCAGCTACTTCATTCTGGATCCCA 

GTTGCTCCCAAACGTGTOTTTGATTTCCTCCGTGACGAAAATTCAAGAAAAGAGTGGGAT 

ATTCTGTCAAATGGAGGTATGGTTCAGGAAATGGCTCATATAGCCAATGGTCATGAACCT 

GGAAACTGTGTCTCCTTGCTCCGAGTCAATAGTGGAAACTCGAGCCAGAGCAACATGTTG 

ATTCTACAAGAGAGCTGTACAGATGCATCAGGATCGTATGTGATTTACGCGCCAGTGGAT 

ATAGTGGCGATGAATGTGGTTCTAAGCGGTGGAGATCCTGATTACGTGGCGTTGTTGCCG 

TCTGGTTTTGCTATTTTACCGGATGGTTCGGTTGGAGGAGGAGATGGGAATCAGCATCAG 

GAAATGGTTTCTACTACTTCTTCTGGGAGTTGTC 

CAGATTCTTGTTGACTCTGTTCCTACAGCTAAACTCTCACTTGGCTCGGTGGCTACGGTT 
AATAGTCTGATCAAATGTACGGTGGAGAGGATTAAAGCTGCTGTTTCTTGTGATGTTGGA 

GGAGGAGCGTAG 

>G1588 Amino Acid Sequence (domain in AA coordinates: 66-124) 
MYHPNMFESHHMFDMTPKSTSDNDLGITGSREDDFETKSGTEVTTENPSGEELQDPSQRP 

NKKKRYHROTQRQIQELESFFKECPHPDDK^ 

SERHENQILKSDNDKLRAENNRYKEALSNATCPNCGGPAAIGEMSFDEQHLRIENARLRE 
EIDRISAIAAKYVGKPLGSSFAPLAIHAPSRSLDLEVGNFGNQTGFVGEMYGTGDILRSV 
S I PSETDKPI I VELAVAAMEELWMAQTGDPL^^ 

LGLRSEASRQSAWIMNHINLV^IIiMDWQWSCVTSGIVSRALTLEVliSTGV 
QVMTAEFQVPSPLVPTRENYFVRYCKQHSDGSWAVVDVSLDSLRPSTPILRTRRRPSGCL 
IQELPNGYSKVTWIEHMEVDDRSVHNMYKPLVQSGLAFGAKRWVATLERQCERLASSMAS 
NIPGDLSVITSPEGRKSMLKLAERMVMSFCSGVGASTAHAWTTMSTTGSDDVRVMTRKSM 
DDPGRPPGIVLSAATSFWIPVAPKRVFDFLRDENSRKEWDILSNGGMVQEMAHIANGHEP 
GNCVSLLRVNSGNSSQSNMLILQESCTDASGSYVIYAPVDIVAMNVVLSGGDPDYVALLP 
SGFAILPDGSVGGGDGNQHQEMVSTTSSGSCGGSLLTVAFQILVDSVPTAKLSLGSVATV 
NSLIKCTVERIKAAVSCDVGGGA* 
>G1589 (179..2221) 

ACCAAACTCACATAGCAATCACACACATCTCCACAAACACAGCTTGAGATGATCATGAAA 

CACGTGCATCCTCAGATCTCTATCAATCCAGCTTGGTGAAAGAAGGTCAAGAATTGAAAG 

AGAATCAAAGAAAACGACGTCGTTTCATTCGTGTGTAACAACTACTAATTATACATAGAT 

GGCTGCTTACTTTCACGGAAACCCACCGGAGATCTCTGCCGGATCCGACGGTGGTCTTCA 

AACGTTGATCCTCATGAATCCAACTACTTACGTTCAGTACACCCAACAAGACAACGACTC 

GAACAACAACAACAACAGCAACAATAGCAACAACAACAACACAAACACAAACACAAACAA 

CAACAACAGTAGTTTCGTTTTCCTCGATTCCCACGCGCCGCAGCCAAACGCGAGCCAGCA 

GTTCGTCGGAATACCACTCTCAGGTCACGAAGCTGCTTCCATTACAGCCGCCGACAACAT 

CTCCGTACTTCACGGTTATCCTCCGCGCGTGCAGTACAGTCTCTACGGTAGCCACCAAGT 

GGATCCCACTCACCAGCAAGCCGCGTGTGAGACGCCACGCGCGCAGCAAGGCCTCTCTTT 

AACCCTCTCGTCTCAACAGCAGCAGCAACAGCAACATCATCAACAACACCAGCCTATTCA 

CGTCGGATTCGGGTCCGGACATGGAGAAGATATCCGGGTCGGGTCTGGCTCTACAGGATC 

GGGGGTAACAAACGGTATAGCTAATCTTGTTAGCTCCAAGTACTTGAAGGCAGCACAAGA 

GCTTCTTGACGAAGTAGTCAACGCTGATTCCGATGACATGAACGCTAMTCCCAACTATT 

CTCATCGAAAAAGGGTAGTTGCGGAAATGATAAACCTGTCGGAGAATCATCGGCCGGCGC 

TGGAGGAGAAGGTTCCGGTGGCGGAGCAGAAGCAGCCGGGAAACGTCCGGTGGAGCTAGG 

CACGGCAGAGAGACAAGAAATACAGATGAAGAAAGCAAAACTTAGTAACATGCTTCATGA 

GGTGGAGCAGAGATATAGACAGTACCACCAGCAGATGCAGATGGTGATCTCTTCGTTCGA 

GCAAGCGGCAGGGATAGGATCAGCGT^AGTCATACACGTCGCTAGCATTGAAAACCATATC 

AAGACAGTTCCGTTGCTTGAAAGAGGCGATCGCTGGTCAGATAAAAGCGGCCAACAAGAG 

TCTTGGGGAGGAAGATTCAGTGTCTGGTGTTGGGAGGTTTGAGGGGTCGAGGCTCAAGTT 

CGTGGACCACCACTTGAGA(^GCAAAGAGCTCTTCAACAACTGGGAATGATT(^ACATCC 

TTCCAATAATGCTTGGAGACCTCAACGTGGTCTCCCAGAACGAGCCGTCTCAGTTCTCCG 

TGCTTGGCTCTTCGAACACTTTCTTCATCCATACCCTAAGGATTCGGACAAGCACATGCT 

AGCTAAGCAAACAGGACTCACTCGTAGGCAGGTGTCGAACTGGTTTATAAACGCGAGAGT 

TCGGTTATGGAAACCAATGGTGGAGGAGATGTACATGGAGGAAATGAAGGAGCAGGCAAA 

GAACATGGGATCCATGGAAAAGACTCCTTTGGATCAAAGCAACGAAGATTCTGCTTCA7VA 

GTCAACAAGTAACC7^AGAAAAGAGCCCAATGGCGGACACTAATTACCATATGAATCCCAA 

TCACAACGGTGACCTAGAAGGCGTCACTGGAATGCAAGGATGCCCCAAGAGACTAAGAAC 

CAGCGACGAGACAATGATGCAGCCAATAAATGCGGATTTCAGCTCCAACGAGAAGCTCAC 

GATGAAGATTCTAGAAGAACGGCAAGGGATAAGATCAGATGGTGGCTACCCTTTCATGGG 

TAATTTCGGGCAATACCAAATGGATGAGATGTCAAGATTTGATGTAGTCTCAGACCAGGA 

GCTCATGGCGCAAAGGTACTCAGGAAACAACAATGGCGTGTCCCTCACGTTAGGTTTACC 

TCATTGTGATAGCTTGTCGTCCACGGACCATCAGGGTTTCATGCAGACCCACCATGGGAT 

TCCTATAGGGAGAAGAGTGAAAATAGGAGAAACAGAGGAATATGGACCCGCCACCATCAA 

TGGTGGTAGCTCGACCACAACCGCACATTCATCAGCGGCAGCTGCCGCGGCTTACAATGG 

GATGAACATAC^GAACCAGAAGAGATATGTGGCTCAGTTATTGCCCGACTTCGTTGCATA 

AACCCATCTCTCTAGAAGGAGAAACCGAAACAGGTTATTATATACGTTTCTAGTTTTT^ 

TTAGTATATAGTTTCTCATACCATTGAACCAAAAC^^ 

TTGGTTATATATGGCCGACGGGCTACGTCAGGGCCCTGACGTAGC 

>G1589 Amino Acid Sequence (conserved domain in AA coordinates: 384-448) 

MAAYFHGNPPEISAGSDGGLQTLILMNPTTYVQYTQQDNDSNNNNNSNNSNNNNTNTNTN 

NNNSSFVFLDSHAPQPNASQQFVGIPLSGHEAASITAADNISVLHGYPPRVQYSLYGSHQ 

VDPTHQQAACETPRAQQGLSLTLSSQQQQQQQHHQQHQPIHVGFGSGHGEDIRVGSGSTG 

SGVTNGIANLVSSKYLKAAQELLDEVWADSDD^AKSQLFSSKXGSCGiro 

AGGEGSGGGAEAAGKRPVELGTAERQEIQMKKAKLSNMLHEVEQRYRQYHQQMQMVISSF 

EQAAGIGSAKSYTSLALKTISRQFRCLKEAIAGQIKAANKSLGEEDSVSGVGRFEGSRLK 

FVDHHLRQQRALQQLGMIQHPSNNAWRPQRGLPERAVSVLRAWLFEHFLHPYPKDSDKHM 

LAKQTGLTRSQVSNWFINARVRLWKPMVEEMYM 

KSTSNQEKSPMADTNYHMNPmiNGDLEGvTGMQGCPKRLRTSDETMM 
TMKILEERQGIRSDGGYPFMGNFGQYQMDEMSRFDVVSDQELMAQRYSGNNNGVSLTLGL 
PHCDSLSSTDHQGFMQTHHGIPIGRRVKIGETEEYGPATINGGSSTTTAHSSAAAAAAYN 
GMNIQNQKRYVAQLLPDFVA* 
>G160 (38..784) 

TCAAATTTGTCATTTGTTTATTCAAATTTTTGAGAAAATGGTGAGT^AGTACCAAAGGTCG 
TCAGAAAATAGAGATGAAAAAAATGGAAAACGAAAGCAACCTTCAGGTTACTTTCTCAAA 
AAGAAGATTCGGTCTTTTCAAAAAAGCTAGTGAACTTTGCACATTAAGTGGTGCAGAGAT 
TCTGTTGATTGTGTTCTCTCCTGGTGGGAAAGTGTTTTCTTTTGGCCATCCAAGTGTTCA 
AGAACTCATTCATCGCTTTTCGAATCCTAACCATAATTCTGCCATTGTCCATCATCAGAA 
CAACAATCTCCAACTTGTTGAAACCCGTCCGGATAGAAATATCCAATATCTCAACAATAT 
ACTCACTGAGGTGCTGGCAAACCAGGAAAAGGAGAAACAGAAGAGAATGGTTTTGGACCT 
ATTGAAAGAATCCAGAGAACAAGTAGGAAACTGGTATGAAAAAGATGTGAAAGATCTCGA 
CATGAATGAAACCAACCAGCTGATATCTGCTCTTCAAGATGTGAAAAAGAAACTGGTAAG 
AGAAATGTCTCAATATTCTCAAGTAAATGTTTCGCAGAATTACTTTGGTCAAAGTTCTGG 
CGTGATTGGTGGTGGTAATGTTGGCATTGATCTTTTTGATCAT^AGAAGAAATGCATTCAA 
CTATAATCCA/IACATGGTGTTTCCCAATCATACACCACCAATGTTTGGATACAACAATGA 
TGGAGTTCTCGTTCCGATATCCAACATGAACTACATGTCAAGTTACT^ACTTCAACCAGAG 
CTAGAGTCTGAAGCTAGAAGAACATCCTAATCAATATTTGCGTTATTTTGGCTATGGTTA 
CTGTTAGGATTGTTCTTGTATTGTGAGACTTAAGTTTGTTTTTTCTTTTAATTTGTTTCA 
GTTGGTTGGTTTTTCATTTTATTCGTCGTTTGTTTTCCTTTGTTTTTGGATATTTTTGTA 
TCCCAGAATAAATTTATTTATCCTTTAAAAA 

>G160 Amino Acid Sequence (domain in AA coordinates: 7-62) 

MTOSTKGRQKIEMKKMENESNLQVTFSK3mFGLFKKASELCTLSGMILLIVFSPGGKVF 

SFGHPSVQELIHRFSNPNHNSAIVHHQNNNLQLVETRPDM^ 

QKRMVLDLLKESREQVGNWYEKDVKDLD^ 

NYFGQSSGVIGGGNVGIDLFDQRRNAPNYNPNMVFPNOT^ 

SSYNFNQS* 

>G1636 (19..666) 

GAGTAATCATCAACGATTATGGCGTCAAGTCAGTGGACGAGGTCGGAGGATAAGATGTTT 
GAGCAAGCTTTGGTTCTTTTTCCTGAAGGATCTCCTAATCGGTGGGAGAGAATCGCTGAT 
CAGCTTCATAAATCTGCTGGTGAAGTTAGGGAGCATTACGAGGTCTTGGTTCATGATGTT 
TTCGAGATTGATTCTGGTCGAGTTGATGTCCCTGATTACATGGATGACTCGGCGGCTGCG 
GCGGCGGGTTGGGATTCCGCTGGTCAGATCTCTTTTGGGTCTAAACATGGCGAGAGTGAA 
CGCAAAAGAGGAACTCCTTGGACAGAGAACGAACACAT^ATTGTTTCTGATCGGATTAAAG 
AGATATGGTAAGGGAGATTGGAGGAGTATCTCGAGAAACGTTGTGGTGACGAGGACACCG 
ACGCAAGTCGCGAGTCACGCTCAGAAGTATTTTCTGAGACAGAACTCGGTGAAGAAGGAG 
AGGAAAAGGTCGAGCATCCATGATATAACTACGGTTGATGCTACTTTGGCTATGCCTGGG 
TCTAACATGGACTGGACTGGCCAACACGGGAGTCCTGTTCAGGCGCCGCAGCAGCAACAG 
ATTATGTCTGAGTTCGGTCAGCAATTGAATCCTGGTCATTTCGAGGATTTTGGGTTTCGG 
ATGTGATG 

>G1636 Amino Acid Sequence (domain in AA coordinates: 100-165) 

MASSQWTRSEDKMFEQALVLFPEGSPNRWERIADQLHKSAGEVREHYEVLVHDVFEIDSG 

RVDVPDYMDDSAAAAAGWDSAGQISFGSKHGESERKRGTPWTENEHKLFLIGLKRYGKGD 

WRSISRNVVVTRTPTQVASHAQKYFLRQNSVKKERKRSSIHDITTVDATLAMPGSNMDWT 

GQHGSPVQAPQQQQIMSEFGQQLNPGHFEDFGFRM* 

>G1642 (1..1077) 

ATGGGTCATCACTCATGCTGCAACAAGCAAAAGGTGAAGAGAGGGCTTTGGTCACCTGAA 
GAAGACGAAAAGCTCATCAACTACATCAATTCATATGGCCATGGATGTTGGAGCTCTGTT 
CCTAAACATGCAGGTTTGCAGAGATGTGGAAAGAGTTGTAGATTAAGATGGATAAATTAT 
CTAAGACCTGATCTTAAACGTGGAAGCTTCTCTCCTCAAGAAGCTGCTCTTATCATTGAG 
CTTCACAGCATTCT^GGTAACAGATGGGCTCAAATTGCTAAACATCTACCTGGAAGAACA 
GATAACGAGGTCAAGAATTTCTGGAACTCGAGCATTAAAAAGAAGCTCATGTCTCACCAT 
CATCACGGTCATCATC^TCATCATCTCTCTTCCATGGCGAGTTTGCTCACAAACCTTCCT 
TATCACAATGGATTCAACCCTACTACAGTCGACGATGAAAGTTCAAGATTCATGTCCAAT 
ATCTVTCACAAAttCTAACCCTAATTTCATC^CTCCAAGCC^TCTCTCTCTTCCTTCTCCT 
CATGTTATGACCCCATTGATGTTCCCAACCTCTAGAGAAGGAGATTTCAAGTTTCTAAC^ 
ACAAACAACCCAAACCAATCTCATCACCATGATAATAACCATTACAACAACCTCGACATT 
TTGTCACCCACACCAACTATAAACAATCATCATCAACCTTCACTTTCTTCTTGTCCTCAT 
GATAATAATCTCCAATGGCCAGCGTTACCAGATTTCCCAGCGAGTACCATTTCTGGTTTC 
CAAGAAACCCTTCAAGATTATGATGATGCTAATAAACTCAACGTGTTTGTGACACCATTC 
AACGATAATGCCAAAAAGTTATTATGTGGAGAAGTTCTCGAAGGCAAAGTACTATCTTCC 

TCCTCACCAATTTCACAAGATCACGGCCTTTTTCTTCCCACCACGTACAACTTTCAAATG 

ACTTCTACGAGTGATCATCAACATCATCATCGAGTGGACTCATACATCAATCACATGATC 

ATACCATCATCATCCTCATCGTCGCCAATCTCTTGTGGACAGTACGTCATAACTTAA 

>G1642 Amino Acid Sequence (domain in AA coordinates: TBD) 

MGHHSCCNKQKVKRGLWSPEEDEKLINYINSYGHGCWSSVPKHAGLQRCGKSCRLRWINY 

LRPDLKRGSFSPQEAALIIELHSILGNRWAQIAKHLPGRTDNEVKNFWNSSIKKKLMSHH 

HHGHHHHHLSSMASLLTNLPYHNGFNPTTVDDESSRFMSNIITNTNPNFITPSHLSLPSP 

HVMTPLMFPTSREGDFKFLTTNNPNQSHHHD^ 

DNNLQWPALPDFPASTISGFQETLQDYDDANKLNVFVTPFNDNAKKLLCGEVLEGKVLSS 
SSPISQDHGLFLPTTYNFQMTSTSDHQHHHRVDSYINHMIIPSSSSSSPISCGQYVIT* 
>G1747 (1..777) 

ATGAAAATGATGCAAGAGGAGGGAAACCGAAAAGGTCCATGGACAGAACAGGAAGACATA 

CTTCTGGTAAATTTTGTTCACTTATTTGGAGATCGACGATGGGATTTTATAGCAAAAGTA 

TCAGGTTTGAACAGAACAGGAAAGAGTTGCAGGCTAAGATGGGTTAATTACCTACATCCT 

GGTCTCAAACGTGGCAAGATGACGCCTCAAGAAGAGCGCCTCGTCCTTGAGCTTCACGCT 

AAGTGGGGAAACAGGTGGTCGAAAATAGCCCGAAAATTGCCGGGACGAACGGATAACGAG 

ATAAAGAACTACTGGAGGACTCATATGAGGAAGAAAGCTCAAGAAAAGAAGCGTCCTGTT 

TCCCCAACTTCCTCATTTTCCAACTGCAGCTCGTCATCTGTGACCACTACCACCACCAAT 

ACTCAAGATACATCGTGCCACTCGCGTAAATCTTCAGGGGAAGTGAGCTTTTACGACACT 

GGAGGTTCCCGATCCACTAGAGAGATGAATCAAGAAAACGAAGACGTGTACTCGTTGGAT 

GATATATGGAGAGAGATTGATCACTCAGCAGTAAACATAATAAAACCGGTTAAAGACATC 

TACTCAGAACAAAGCCATTGCTTAAGTTACCCAAATCTAGCTTCACCATCATGGGAAAGC 

TCATTGGATTCTATATGGAACATGGATGCAGATAAAAGTAAGATATCGTCTTACTTTGCA 

AATGATCAGTTTCCTTTCTGTTTCCAACACAGTAGATCACCATGGTCGTCAGGTTAA 

>G1747 Amino Acid Sequence (domain in AA coordinates: 11-114) 

MKMMQEEGNRKGPWTEQEDILLVNFVHLFGDRRWDFIAKVSGLNRTGKSCRLRWVNYLHP 

GLKRGKMTPQEERLVLELHAKWGNRWSKIARKLPGRTDNEIKNYWRTHMRKKAQEKKRPV 

SPTSSFSNCSSSSVTTTTTNTQDTSCHSRKSSGEVSFYDTGGSRSTREMNQENEDVYSLD 

DIWREIDHSAVNIIKPVKDIYSEQSHCLSYPNLASPSWESSLDSIWNMDADKSKISSYFA 

NDQFPFCFQHSRSPWSSG* 

>G1749 (59..535) 

CAACACTTCTCAGTGACCGTGAGCAACGAATTATTTTCAGTTCAACGACTCCGCGGAAAT 
GGAAAATTCAGAAAATGTTCCCTCTTACGATCAAAACATCAATTTCACTCCTAATTTGAC 
GAGAGATCAAGAACATGTGATCATGGTCTCTGCTTTGCAACAAGTAATATCCAACGTCGG 
AGGTGACACGAACTCGAATGCATGGGAAGCTGATCTTCCACCTTTGAACGCTGGCCCTTG 
TCCTCTTTGTAGTGTCACCGGCTGCTACGGTTGCGTCTTCCCACGACACGAGGCGATAAT 
TAAGAAGGAGAAGAAGCACAAAGGAGTGAGGAAAAAACCATCAGGTAAATGGGCGGCGGA 
GATATGGGATCCGAGTTTGAAAGTAAGGAGATGGCTTGGAACGTTTCCAACAGCGGAGAT 
GGCGGCTAAGGCTTACAACGATGCGGCGGCTGAGTTTGTCGGAAGAAGATCAGCAAGACG 
TGGCACAAAGAACGGAGAGGAAGCATCTACCAAGAAGACGACTGAGAAAAATTAACGGAG 
AAGGAGCACGTATAGAAAGGCAGGAAGAGGCATCTTACTTGCTTCACAAGTAAATCAGAA 
I'l'U-l'TTTGAAAAGTAAAAACGTTATTTTGTTTGGTAATAAAATAAAGTAAT^CAAAATAT 
TGCTAACGCAAGACTTATCAAGTTCAGTCGTGACTGTGAGTGTGTTTTTATGTATCTT^^ 
TTCATTTTTTGTCTTTCAATTGTGTGTGTGTGTGT 

>G1749 Amino Acid Sequence (conserved domain in AA coordinates: 84-155) 

MENSENVPSYDQNINFTPNLTRDQEHVIMVSALQQVISNVGGDTNSNAWEADLPPLNAGP 

CPLCSVTGCYGCTFPRHEAIIKKEKKHK 

MAAKAYNDAAAEFVGRRSARRGTKNGEEASTKKTTEKN* 

>G1751 (117..923) 

AAACACAAACAAAACTCATATTTTCAATCTCCA 

AAAACAAAAACCAAACTCGGATTTAGTTTGACAGAAGAAGGAATCGAGAGTCGGGTATGC 
ATTATCCTAACAACAGAACCGAATTCGTCGGAGCTCCAGCCCCAACCCGGTATCAAAAGG 
AGCAGTTGTCACCGGAGCAAGAGCTTTCAGTTATTGTCTCTGC 

CAGGGGAAAACGAAACGGCGCCGTGTCAGGGTTTTTCCAGTGACAGCACAGTGATAAGCG 
CGGGAATGCCTCGGTTGGATTCAGACACTTGTCAAGTCTGTAGGATCGAAGGATGTCTCG 
GCTGTAACTACTTTTTCGCGCCAAATCAGAGAATTGAAAAGAATCATCAACAAGAAGAAG 
AGATTACTAGTAGTAGTAACAGAAGAAGAGAGAGCTCTCCCGTGGCGAAGAAAGCGGAAG 

GTGGCGGGAAAATCAGGAAGAGGAAGAACAAGAAGAATGGTTACAGAGGAGTTAGGCAAA 

GACCTTGGGGAAAATTTGCAGCTGAGATCAGAGATCCTAAAAGAGCCACACGTGTTTGGC 

TTGGTACTTTCGAAACCGCCGAAGATGCGGCTCGAGCTTATGATCGAGCCGCGATTGGAT 

TCCGTGGGCCAAGGGCTAAACTCAACTTCCCCTTTGTGGATTACACGTCTTCAGTTTCAT 

CTCCTGTTGCTGCTGATGATATAGGAGCAAAGGCAAGTGCAAGCGCCAGTGTGAGCGCCA 

CAGATTCAGTTGAAGCAGAGCAATGGAACGGAGGAGGAGGGGATTGCAATATGGAGGAGT 

GGATGAATATGATGATGATGATGGATTTTGGGAATGGAGATTCTTCAGATTCAGGAAATA 

CAATTGCTGATATGTTCCAGTGATAAATGAGCTCTTTCTTGTTGGCGTTTTTTGGAGTTA 

AGTGCAAGAAGAGATTGACACTGTGGCTTGTTTAAAGTGAACAAGAACAAGAAAGCATGT 

AATTAGTAGTCTCATTCTTTTGTTTGTGGTCAATTCTATGTTTATCTCATATAAAATCTG 

AGTTAAACCTATCTGAGGAGAGAGTAAATAAAGAGGTTAAGAA 

>G1751 Amino Acid Sequence (domain in AA coordinates: TBD) 

MHYPNNRTEFVGAPAPTRYQKEQLSPEQELSVIVSALQHVISGENETAPCQGFSSDSTVI 

SAGMPRLDSDTCQVCRIEGCLGCNYFFAPNQRIEKNHQQEEEITSSSNRRRESSPVAKKA 

EGGGKIRKRKNKKNGYRGVRQRPWGKFAAEIRDPKRATRVWLGTFETAEDAARAYDRAAI 

GFRGPRAKLNFPFVDYTSSVSSPVAADDIGAKASASASVSATDSVEAEQWNGGGGDCNME 

EWMNMMMMMDFGNGDSSDSGNTIADMFQ* 

>G1752 (25..756) 
AAAAAAAAAAAAAAAAAAAAACTTA^^ 

AGTTCTTGGAGCTCATCACAAGAATCACTCTTATGGAACGAGAGCTGTTTCTTGGATCAA 
TCATCTGAACCTCAAGCCTTCTTTTGCCCTAATTATGATTACTCCGATGACTTTTTCTCA 
TTTGAGTCACCGGAGATGATGATTAAGGAAGAAATTCAAAACGGCGACGTTTCTAACTCC 
GAAGAAGAAGAAAAGGTTGGAATTGATGAAGAAAGATCATACAGAGGAGTGAGGAAAAGG 
CCGTGGGGGAAATTTGCAGCGGAGATAAGAGATTCAACGAGGAATGGAATTAGGGTTTGG 
CTCGGGACATTTGACAAAGCCGAGGAAGCCGCTCTTGCTTATGATCAAGCGGCTTTCGCC 
ACAAAAGGATCTCTTGCAACACTTAATTTCCCGGTGGAAGTGGTTAGAGAGTCGCTAAAG 
AAAATGGAGAATGTGAATCTTCATGATGGAGGATCTCCGGTTATGGCCTTGAAGAGAAAA 
CATTCTCTTCGAAACCGGCCTAGAGGGAAAAAGCGATCCTCTTCTTCTTCTTCTTCTTCT 
TCTAATTCTTCTTCTTGCTCTTCTTCTTCGTCTACTTCTTCAACATCAAGAAGTAGTAGT 
AAGCAGAGTGTTGTGAAGCAAGAAAGTGGTACACTTGTGGTTTTTGAAGATTTAGGTGCT 
GAGTATTTAGAACAACTTCTTATGAGCTCATGTTGATCTTGTAATTGATTTCAGCAAAAG 
CCACTATTAAACTTTAATTTTGTGATAATTAATCTTGAAATTTGTTTTGTTCATTCTGCA 
ATTTCTTTGGTTCTCTTATTTTTTGTTTGTTGTATCCA^ATGAAATTATTGGAAGAGATG 
GTGATGTTAAAGTGTATATATATAAAAAAAAAA 

>G1752 Amino Acid Sequence (domain in AA coordinates: TBD) 

MEYSQSSMYSSPSSWSSSQESLLWNESCFLDQSSEPQAFFCPNYDYSDDFFSFESPEMMI 

KEEIQNGDVSNSEEEEKVGIDEERSYRGVRKRPWGKFAAEIRDSTRNGIRVWLGTFDKAE 

EAALAYDQAAFATKGSLATLNFPVEVTOESLKKMENWLHDGGSPVMAL 

GKKRSSSSSSSSSNSSSCSSSSSTSSTSRSSSKQSVVKQESGTLVVFEDLGAEYLEQLLM 

SSC* 

>G1763 (33..977) 

GTACATTTTTTTTTGTATTTCAGGAAACTCCGATGGCGGATCTCTTCGGTGGTGGCCACG 
GCGGCGAGCTTATGGAAGCACTTCAACCTTTTTACAAAAGTGCTTCCACGTCTGCTTCAA 
ATCCTGCGTTTGCGTCCTCAAACGATGCGTTTGCGTCTGCCCCAAACGACCCATTTTCTT 
CTTCTTCTTACTATAATCCTCATGCATCTTTCTTCCCTTCACATTCCACAACCACTTACC 
CGGATATTTATTCTGGATCCATGACCTATCCATCTTCATTCGGGTCGGATCTTCAACAAC
CCGAAAACTACCAATCTCAGTTCCATTACCAAAACACTATCACTTACACTCACCAAGACA
ACAACACTTGCATGCTCAACTTCATTGAGCCGAGCCAACCGGATTTTATGACCCAACCGG 
GTCCGAGTTCGGGTTGGGTTTCAAAACCGGCTAAGCTCTATAGAGGAGTGAGGCAAAGAC
ATTGGGGAAAATGGGTCGCGGAGATCCGTTTACCCAGGAACCGAACCCGACTTTGGCTCG 
GAACATTCGACACGGCTGAAGAAGCCGCGTTGGCTTATGATCGCGCCGCGTTTAAGCTTC 
GTGGTGACTCGGCTCGGCTTAACTTCCCAGCTCTCCGATACCAAACCGGCTCGTCTCCGT 
CTGACGTTGGCGAATACGGACCTATTCAAGCTGCCGTTGACGCCAAGCTAGAAGCCATAT 
TAGCTGAGCCGAAGAATCAGCCGGGCAAAACGGAGAGGACGTCGAGGAAACGAGCTAAAG 
CCGCGGCTTCTTCAGCTGAGCAGCCGTCAGCGCCACAACAACATTCCGGGTCGGGTGAAA 
GTGATGGGTCGGGTTCACCGACTTCGGATGTTATGGTGCAGGAGATGTGCCAAGAGCCAG 
AGATGCCATGGAATGAAAATTTCATGCTCGGCAAGTGTCCTTCTTATGAGATAGATTGGG
CTTCAATTTTATCGTGAAAAATTAGGATTCAATTCATTTTTATTCATTTTAACTTGTTTG 
TATTTTCTTTTAAACTTTAGGGTTATTAGCTGTGCGTAAAATTTGTAATTTAGCATTTTG 
TATGAATGTAATGCAAGTGTGTAAATTATGGACAGCTCAAGCTTTTTTGTTAAAA 

>G1763 Amino Acid Sequence (conserved domain in AA coordinates : 140-209) 

MADLFGGGHGGELMEALQPFYKSASTSASNPAFASSNDAFASAPNDPFSSSSYYNPHASF 

FPSHSTTTYPDIYSGSMTYPSSFGSDLQQPENYQSQFHYQNTITYTHQDNNTCMLNFIEP

SQPDFMTQPGPSSGSVSKPAKLYRGVRQRHWGKWVAEIRLPRNRTRLWLGTFDTAEEAAL

AYDRAAFKLRGDSARLNFPALRYQTGSSPSDVGEYGPIQAAVDAKLEAILAEPKNQPGKT 

ERTSRKRAKAAASSAEQPSAPQQHSGSGESDGSGSPTSDVMVQEMCQEPEMPWNENFMLG 

KCPSYEIDWASILS* 

>G1766 (32.. 1216) 

AGGCTATTCTCGGAAAAACAAAGAATAAAGAATGAATTCGTTTTCACAAGTACCTCCTGG 
CTTCAGATTTCATCCTACTGATGAAGAACTTGTAGACTACTACTTGAGGAAAAAAGTTGC
ATCAAAGAGAATAGAAATCGATATCATCAAGGATGTTGATCTTTACAAGATTGAGCCATG 
TGATCTTCAAGAGTTATGCAAGATAGGAAACGAAGAGCAGAGCGAATGGTACTTCTTTAG 
TCATAAAGACAAGAAGTATCCCACGGGAACTCGAACCAATAGAGCCACGAAAGCAGGATT
TTGGAAAGCCACTGGAAGAGACAAGGCTATATATATAAGACATAGTCTTATCGGTATGAG 
GAAAACACTTGTGTTTTACAAAGGAAGAGCCCCAAATGGTCAGAAATCCGATTGGATCAT 
GCACGAATATCGCTTAGAAACAAGTGAAAATGGAACCCCTCAGGAAGAAGGATGGGTAGT 
ATGTAGGGTATTCAAGAAGAAATTGGCAGCGACAGTGAGGAAAATGGGAGATTACCATTC
ATCACCATCGCAGCATTGGTACGATGATCAGCTCTCTTTTATGGCCTCCGAGATCATTTC 
TAGTTCTCCACGACAGTTTCTTCCCAATCATCATTATAACCGCCACCATCACCAGCAGAC 
ATTGCCTTGTGGCCTCAATGCATTCAACAACAACAATCCTAACTTGCAATGCAAGCAAGA 
GCTCGAGTTACATTACAATCAAATGGTACAACATCAACAACAAAACCATCATCTTCGTGA 
ATCTATGTTTCTCCAGCTTCCTCAGCTCGAAAGCCCTACCAGTAATTGCAATTCTGACAA 
CAACAATAACACAAGAAATATTAGTAACTTGCAGAAATCATCAAATATATCTCATGAGGA 
ACAATTGCAACAAGGGAATCAAAGTTTCAGCTCTCTGTATTACGATCAAGGAGTAGAGCA 
AATGACTACTGACTGGAGAGTTCTCGATAAATTTGTTGCTTCACAGCTTAGCAATGATGA 
AGAGGCTGCAGCCGTGGTTTCTTCTTCTTCTCATCAAAACAACGTCAAGATTGACACGAG 
AAACACGGGTTATCATGTGATAGATGAGGGAATAAATTTGCCGGAGAATGATTCTGAAAG 
GGTTGTTGAAATGGGAGAAGAGTATTCAAATGCTCATGCTGCTTCTACTTCTTCAAGTTG 
TCAGATTGATCTCTAGAAATAGTGATAGAGAGATGAAAAAGATGCAAGGTGAATATATAT 
GAAAATACATGCACACTAGTGTTATTTATACTTT^AAGATGGAAGGGGAAAAACAAGGAGT 
TATTTCCTGGATTTATGGAGGTTTTGTACATAATAAAAACCTACAACCATATGGTATTTT 
CTTTTGAAAAAAAAAAAAAAAAAAAAAAAA 

>G1766 Amino Acid Sequence (domain in AA coordinates: 10-153) 
MNSFSQVPPGFRFHPTDEELVDYYLRKKVASKRIEIDIIKDVDLYKIEPCDLQELCKIGN

EEQSEWYFFSHKDKKYPTGTRTNRATKAGFWKATGRDKAIYIRHSLIGMRKTLVFYKGRA 
PNGQKSDWIMHEYRLETSENGTPQEEGWVVCRVFKKKLAATVRKMGDYHSSPSQHWYDDQ

LSFMASEIISSSPRQFLPNHHYNRHHHQQTLPCGLNAFNNNNPNLQCKQELELHYNQMVQ

HQQQNHHLRESMFLQLPQLESPTSNCNSDNNNNTRNISNLQKSSNISHEEQLQQGNQSFS

SLYYDQGVEQMTTDWRVLDKFVASQLSNDEEAAAVVSSSSHQNNVKIDTRNTGYHVIDEG

INLPENDSERVVEMGEEYSNAHAASTSSSCQIDL*

>G1767 (1..1596) 

ATGGATACTCTCTTTAGACTAGTCAGTCTCCAACAACAACAACAATCCGATAGTATCATT 

ACAAATCAATCTTCGTTAAGCAGAACTTCCACCACCACTACTGGCTCTCCACA 

TATCACTACAACTTTCCACAAAACGACGTCGTCGAAGAATGCTTCAACTTTTTCATGGAT

GAAGAAGACCTTTCCTCTTCTTCTTCTCTVCCAC^^ 

ACTTACTACTCTCCTTTC^CTACTCC^ 

TCCTCCACCGCCGCAGCCGCAGCTTTAGCCTCGCCTTACTCCTCCTCCGGCCACCATAAT 
GACCCTTCCGCGTTCTCCATACCTCAAACTCCTCCGTCCTTCGACTTCTCAGCCAATGCC 

CGTGCGCAACAAATCCTATGGACGCTCAACGAGCTCTCTTCTCCGTACGGAGACACCGAG 
CAAAAACTGGCTTCTTACTTCCTCCAAGCTCTCTTCAACCGCATGACCGGTTCAGGCGAA 
CGATGCTACCGAACCATGGTAACAGCTGCAGCCACAGAGAAGACTTGCTCCTTCGAGTCA 
ACGCGAAAAACTGTACTAAAGTTCCAAGAAGTTAGCCCCTGGGCCACGTTTGGACACGTG 
GCGGCAAACGGAGCAATCTTGGAAGCAGTAGACGGAGAGGCAAAGATCCACATCGTTGAC

ATAAGCTCCACGTTTTGCACTCAATGGCCGACTCTTCTAGAAGCTTTAGCCACAAGATCA 

GACGACACGCCTCACCTAAGGCTAACCACAGTTGTCGTGGCCAACAAGTTTGTCAACGAT 

CAAACGGCGTCGCATCGGATGATGAAAGAGATCGGAAACCGAATGGAGAAATTCGCTAGG

CTTATGGGAGTTCCTTTCAAATTTAACATTATTCATCACGTTGGAGATTTATCTGAGTTT 

GATCTCAACGAACTCGACGTTAAACCAGACGAAGTCTTGGCCATTAACTGCGTAGGCGCG 

ATGCATGGGATCGCTTCACGTGGAAGCCCTAGAGACGCTGTGATATCGAGTTTCCGACGG 

TTAAGACCGAGGATTGTGACGGTCGTAGAAGAAGAAGCTGATCTTGTCGGAGAAGAAGAA 

GGTGGCTTTGATGATGAGTTCTTGAGAGGGTTTGGAGAATGTTTACGATGGTTTAGGGTT 

TGCTTCGAGTCATGGGAAGAGAGTTTTCCAAGGACGAGCAACGAGAGGTTGATGCTAGAG 

CGTGCAGCGGGACGTGCGATCGTTGATCTTGTGGCTTGTGAGCCGTCGGATTCCACGGAG 

AGGCGAGAGACAGCGAGGAAGTGGTCGAGGAGGATGAGGAATAGTGGGTTTGGAGCGGTG 

GGGTATAGTGATGAGGTGGCGGATGATGTCAGAGCTTTGTTGAGGAGATATAAAGAAGGT 

GTTTGGTCGATGGTACAGTGTCCTGATGCCGCCGGAATATTCCTTTGTTGGAGAGATCAG

CCGGTGGTTTGGGCTAGTGCGTGGCGGCCAACGTAA 

>G1767 Amino Acid Sequence (domain in AA coordinates: 255-272) 

MDTLFRLVSLQQQQQSDSIITNQSSLSRTSTTTTGSPQTAYHYNFPQNDVVEECFNFFMD

EEDLSSSSSHHNHHNHNNPNTYYSPFTTPTQYHPATSSTPSSTAAAAALASPYSSSGHHN

DPSAFSIPQTPPSFDFSANAKWADSVLLEAARAFSDKDTARAQQILWTLNELSSPYGDTE 

QKLASYFLQALFNRMTGSGERCYRTMVTAAATEKTCSFESTRKTVLKFQEVSPWATFGHV

AANGAILEAVDGEAKIHIVDISSTFCTQWPTLLEALATRSDDTPHLRLTTVVVANKFVND

QTASHRMMKEIGNRMEKFARLMGVPFKFNIIHHVGDLSEFDLNELDVKPDEVLAINCVGA 

MHGIASRGSPRDAVISSFRRLRPRIVTVVEEEADLVGEEEGGFDDEFLRGFGECLRWFRV

CFESWEESFPRTSNERLMLERAAGRAIVDLVACEPSDSTERRETARKWSRRMRNSGFGAV 

GYSDEVADDVRALLRRYKEGVWSMVQCPDAAGIFLCWRDQPVVWASAWRPT*

>G1778 (1..627) 

ATGATGGGATACCAAACAAACTCTAATTTCTCCATGTTTTTTTCCTCGGAAAATGACGAC 
CAAAACCACCACAACTACGATCCTTATAATAATTTCTC 

ACTCTCTCACTTGGAACACCCTCTACTCGTCTCGACGACCACCATAGATTTTCTTCTGCT 
AATTCTAACAACATCTCCGGCGACTTTTATATTCACGGAGGAAACGCTAAGACTTCTTCG 
TACAAGAAGGGTGGTGTTGCTCATAGCCTACCTCGCCGTTGTGCTAGCTGCGACACCACT 
TCAACTCCTCTATGGAGAAACGGACCAAAAGGACCTAAGTCGTTATGTAACGCGTGTGGA 
ATCCGATTCAAGAAAGAGGAGAGGCGTGCGACGGCCAGAAACTTAACGATCTCCGGTGGA 
GGTTCATCAGCGGCAGAAGTCCCAGTAGAGAATTCGTACAACGGAGGTGGAAACTATTAC 
AGTCATCATCATCATCACTATGCCTCGTCGTCGCCGTCGTGGGCTCATCAGAACACACAA 
AGAGTTCCATATTTCTCACCGGTTCCGGAGATGGAATATCCCTACGTGGATAACGTCACG 
GCTTCTTCTTTTATGTCTTGGAATTGA 

>G1778 Amino Acid Sequence (domain in AA coordinates : 94-119) 
MMGYQTNSNFSMFFSSENDDQNHHNYDPY^^ 

NSNNISGDFYIHGGNAKTSSYKKGGVAHSLPRRCASCDTTSTPLWRNGPKGPKSLCNACG 
IRFKKEERRATARNLTISGGGSSAAEVPVENSYNGGGNYYSHHHHHYASSSPSWAHQNTQ
RVPYFSPVPEMEYPYVDNVTASSFMSWN*
>G1789 (108.. 413) 

CAAGGACTCTGCGACATCTGTGCAAC^TATCATTTCCTCAGAATCTCTTTCTTTTCTAGG 
TTTATTACTACACAAAACCAAACATCATC^CTTTAGTTAOTAAACATGGCATCAGGCT
CAATGTCTTCTTATGGCTCTGGCTCATGGACTGTTAAGCAGAACAAAGCCTTTGAGCGTG 
CTCTAGCAGTCTATGACCAAGACACTCCGGACCGTTGGCACAATGTTGCTAGAGCTGTTG 
GTGGTAAAACACCAQAAGAAGCTAAGAGACAGTATGACCTTCTAGTTCGTGACATCGAAA 
GCATCGAGAATGGTCACGTGCCATTCCCTGACTACAAGACTACTACAGGAAACAGCAACA 
GAGGCAGGCTGCGTGATGAGGAAAAGAGGATGAGAAGCATGAAGCTGCAGTGAGACAAGA 
AGCAACAA7UVCCTAACTACGTATGATCGTCAAAATAAAAGAGAATCACTTCAGAGAGATG 
TGTTTTTTTCAATGTCTGACGAATCAATGTTTTTTTCTTGCAATTTCTCATGTTTTTCCC 
TAAGAAATGGTTTTTTTTTCGAGGCAACAAAAAAAAAA 

>G1789 Amino Acid Sequence (domain in AA coordinates: 1-50) 
MASGSMSSYGSGSWTVKQNKAFERALAVYDQDTPDRW^
RDIESIENGHVPFPDYKTTTGNSNRGRLRDEEKRMRSMKLQ* 
>G1790 (63.. 1346) 
CAATGGAGAATTTCGTCGACGAGAATGGTTTTGCTTCTCTAAACCAAAACATCTTCACAC
GTGATCAAGAACACATGAAAGAAGAAGATTTTCCATTCGAAGTCGTCGACCAATCAAAAC 
CTACAAGCTTTCTTCAAGATTTTCACCATCTTGATCATGATCATCAGTTTGATCATCATC 
ATCATCATGGCTCCTCATCTTCACATCCTTTGCTCAGCGTCCAAACTACGTCTTCTTGTA 
TCAATAATGCTCCTTTCGAGCATTGCTCTTACCAAGAAAACATGGTCGATTTCTATGAAA 
CTAAACCAAATTTGATGAATCATCATCATTTCCAAGCAGTGGAAAACTCATACTTCACTC 
GTAATCATCATCATCATCAAGAGATCAATTTGGTCGATGAACATGATGATCCTATGGACT 
TGGAGCAAAACAACATGATGATGATGAGGATGATCCCTTTTGATTACCCTCCTACAGAGA 
CTTTCAAACCTATGAACTTCGTAATGCCAGATGAAATTTCATGTGTTTCTGCAGATAATG 
ATTGTTATAGAGCAACGAGTTTCAACAAGACCAAACCATTTCTTACACGAAAGTTGTCTT 
CTTCTTCTTCATCATCATCATGGAAAGAAACCAAAAAGTCAACCTTAGTCAAAGGACAAT 
GGACTGCTGAAGAAGACAGGGTACTGATTCAACTCGTGGAGAAGTATGGATTGCGTAAAT 
GGTCGCATATCGCTCAAGTGTTACCGGGAAGAATCGGGAAACAATGTAGAGAGAGGTGGC 
ATAACCATTTGAGACCTGACATTAAGAAAGAAACATGGAGTGAAGAAGAGGACAGAGTGT 
TGATAGAATTTCACAAAGAGATTGGAAACAAATGGGCAGAGATTGCGAAAAGACTCCCGG 
GAAGAACAGAGAACTCGATCAAGAACCATTGGAACGCAACAAAAAGAAGACAATTCTCTA 
AAAGAAAATGTAGATCTAAGTATCCAAGACCTTCTCTGTTGCAGGATTACATCAAGAGCT 
TGAATATGGGAGCTTTGATGGCTTCTTCTGTTCCTGCAAGAGGTAGACGCAGAGAGAGTA 

ATGGACAAGACAGGATTGTGCCTGAATGTGTGTTTACTGATGATTTTGGATTCAATGAGA 
AGCTGCTTGAGGAAGGATGTAGCATTGACTCTTTGCTTGATGACATTCCTCAGCCTGACA 
TTGATGCTTTTGTTCATGGGCTCTGATTTGTATTTTTTATTCTGCTTGTTTCAGTTTTGT 
TGTTTTTTGTTTGTCTTTTTATACGAGACAGATTCCACCAAACTTCAATAATTTGAAAAG 
ATATAAAATATTTTGCTTTTTAAAAAAAAAAAAAAAAAAAAAAA 

>G1790 Amino Acid Sequence (conserved domain in AA coordinates : 217-316) 

MENFVDENGFASLNQNIFTRDQEHMKEEDFPFEVVDQSKPTSFLQDFHHLDHDHQFDHHH

HHGSSSSHPLLSVQTTSSCINNAPFEHCSYQENMVDFYETKPNLMNHHHFQAVENSYFTR

NHHHHQEINLVDEHDDPMDLEQNNMMMMRMIPFDYPPTETFKPMNFVMPDEISCVSADND

CYRATSFNKTKPFLTRKLSSSSSSSSWKETKKSTLVKGQWTAEEDRVLIQLVEKYGLRKW

SHIAQVLPGRIGKQCRERWHNHLRPDIKKETWSEEEDRVLIEFHKEIGNKWAEIAKRLPG

RTENSIKNHWNATKRRQFSKRKCRSKYPRPSLLQDYIKSLNMGALMASSVPARGRRRESN

NKKKDVWATOEKKKEEEVYGQDRIVPECVFTDDFGFNEKLLEEGCSIDSLLDDIPQPDI

DAFVHGL* 

>G1791 (36.. 455) 

ATGTACATGCAAAAACAAAAACCTTAAAAGCTTTCATGGAACGTATAGAGTCTTATAACA 
CGAATGAGATGAAATACAGAGGCGTACGAAAGCGTCCATGGGGAAAATATGCGGCGGAGA 
TTCGCGACTC^GCTAGACACGGTGCTCGTGTTTGGCTTGGGACGTTTAACACAGCGGAAG 
ACGCGGCTCGGGCTTATGATAGAGCAGCTTTCGGCATGAGAGGCCAAAGGGCCATTCTCA 
ATTTTCCTCACGAGTATCAAATGATGAAGGACGGTCCAAATC 

TGGCTTCCTCGTCGTCGGGATATAGAGGAGGAGGTGGTGGTGATGATGGGAGGGAAGTTA 

TTGAGTTCGAGTATTTGGATGATAGTTTATTGGAGGAGCTTTTAGATTATGGTGAGAGAT 

CTAACCAAGACAATTGTAACGACGCAAACCGCTAGATCATC^CTACTTACTTA<^GTGTA 

ATGTTTTTGGAGTAAAGAGTAATAATCAATATAATATACTTTAGTTO 

AAAAAAAAA 

>G1791 Amino Acid Sequence (domain in AA coordinates: TBD)
MERIESYNTNEMKYRGVRKRPWGKYAAEIRDSARHGARVWLGTFNTAEDAARAYDRAAFG
MRGQRAILNFPHEYQMMKDGPNGSHENAVASSSSGYRGGGGGDDGREVIEFEYLDDSLLE
ELLDYGERSNQDNCNDANR* 
>G1793 (59.. 1783) 

AGTGATTTATTGATTAACCCAAACACAAAATAAACAGATTTGACTCAAAAA 

GAATTCTAACAACTGGCTTGGCTTTCCTCTTTCACCGAACAACTCTTCTTTGCCT 

TGAATACAACCTTGGCTTGGTC^GCGACCATATGGACAACCCTTTTC 

GAATATGATCAATCCACACGGTGGAGGAGGAGATGAAGGAGGAGAGGTTCCAAAAGTGGC 

CGATTTTCTCGGTGTGAGCAAACCGGACGAAAACCAATCCAACCACCTAGTAGCTTACAA

CGACTCAGACTACTACTTCCATACCAATAGCTTGATGCCTAGCGTCCAATCAAACGATGT 

CGTTGTAGC^GCTTGTGACTCCAATACTCCTAACAACAGTAGCTATCATGAGCTTCAAGA
GAGTGCTCACAATCTACAGTCACTTACTTTGTCCATGGGGACCACCGCTGGTAATAATGT

TGTAGACAAAGCTTCACCATCCGAGACCACCGGGGATAACGCTAGCGGTGGAGCACTAGC 

CGTTGTTGAGACGGCCACGCCAAGACGTGCATTGGACACTTTCGGACAACGAACCTCGAT 

CTATCGTGGTGTCACAAGACATCGATGGACTGGTCGATATGAGGCTCATCTATGGGATAA 

TAGTTGTAGAAGGGAAGGCCAGTCTAGGAAAGGAAGACAAGTTTACTTGGGTGGATATGA 

CAAAGAAGATAAAGCAGCAAGATCATATGATCTAGCTGCACTTAAGTACTGGGGTCCTTC 

?^ACTACTACTAATTTCCCCATTACAAACTACGAGAAAGAAGTAGAGGAAATGAAGCACAT

GACGAGACAAGAGTTCGTGGCTGCCATTAGAAGGAAAAGTAGTGGATTTTCGAGAGGCGC 

TTCGATGTATCGAGGAGTTACAAGGCATCACCAACATGGAAGATGGCAAGCAAGGATCGG 

CCGAGTCGCCGGAAACAAAGACCTCTACTTGGGAACTTTTAGCACTGAGGAAGAAGCAGC 

AGAAGCTTACGATATAGCTGCAATAAAGTTTAGAGGACTTAATGCAGTGACCAACTTCGA 

GATCAACCGGTACGACGTGAAAGCCATTCTAGAGAGTAGCACTCTTCCCATCGGAGGAGG 

CGCAGCTAAACGGCTCAAAGAAGCTCAAGCTCTTGAGTCTTCAAGGAAACGCGAGGCGGA 

GATGATAGCCCTTGGTTCAAGTTTCCAGTACGGTGGTGGCTCGAGCACAGGCTCTGGCTC 

CACCTCATCAAGACTTCAGCTTCAACCTTACCCTCTAAGCATTCAACAACCATTAGAGCC 

TTTTCTATCTCTTCAGAACAATGACATCTCTCATTACAACAACAACAATGCTCACGATTC 

CTCCTCTTTTAATCACCATAGCTATATCCAGACACAACTTCATCTCCACCAACAGACCAA 

CAATTACTTGCAGCAACAGTCGAGCCAGAACTCTCAGCAGCTCTACAATGCGTATCTTCA

TAGCAATCCGGCTCTGCTTCATGGACTTGTCTCTACCTCTATCGTTGACAACAATAATAA 

CAATGGAGGCTCTAGTGGGAGCTACAACACTGCAGCATTTCTTGGGAACCACGGTATTGG

TATTGGGTCCAGCTCGACTGTTGGATCGACCGAGGAGTTTCCAACCGTTAAAACAGATTA 

CGATATGCCTTCCAGTGATGGAACCGGAGGGTATAGTGGTTGGACCAGTGAGTCTGTTCA 

GGGGTCAAACCCTGGTGGTGTTTTCACTATGTGGAATGAGTAAACAAGGATCTCTTTCTT 

GCGGCACAAGGAATGGGT 

>G1793 Amino Acid Sequence (conserved domain in AA coordinates : 179-255 , 281-349) 

MNSNNWLGFPLSPNNSSLPPHEYNLGLVSDHMDNPFQTQEWNMINPHGGGGDEGGEVPKV

ADFLGVSKPDENQSNHLVAYNDSDYYFHTNSLMPSVQSNDVVVAACDSNTPNNSSYHELQ

ESAHNLQSLTLSMGTTAGNNVVDKASPSETTGDNASGGALAVVETATPRRALDTFGQRTS

IYRGVTRHRWTGRYEAHLWDNSCRREGQSRKGRQVYLGGYDKEDKAARSYDLAALKYWGP

STTTNFPITNYEKEVEEMKHMTRQEFVAAIRRKSSGFSRGASMYRGVTRHHQHGRWQARI 

GRVAGNKDLYLGTFSTEEEAAEAYDIAAIKFRGLNAVTNFEINRYDVKAILESSTLPIGG

GAAKRLKEAQALESSRKREAEMIALGSSFQYGGGSSTGSGSTSSRLQLQPYPLSIQQPLE 

PFLSLQNNDISHYNNNNAHDSSSFNHHSYIQTQLHLHQQTNNYLQQQSSQNSQQLYNAYL

HSNPALLHGLVSTSIVDNNNNNGGSSGSYNTAAFLGNHGIGIGSSSTVGSTEEFPTVKTD

YDMPSSDGTGGYSGWTSESVQGSNPGGVFTMWNE* 

>G1795 (27.. 422) 

ACAAACACGCAAAAAGTCATTAATATATGGATCAAGGAGGTCGAGGTGTCGGTGCCGAGC 
ATGGAAAGTACCGGGGAGTTCGGAGACGACCTTGGGGAAAATATGCAGCAGAGATACGAG 
ATTCGAGGAAGCACGGTGAACGTGTGTGGCTTGGAACGTTCGATACGGCAGAGGAAGCGG 
CTAGAGCCTATGACCAAGCTGCTTACTCCATGAGAGGCCAAGCAGCAATCCTTAACTTCC 
CTCATGAGTATAACATGGGGAGTGGTGTCTCTTCTTCCACCGCCATGGCTGGATCTTCCT 
CCGCCTCCGCCTCCGCTTCTTCTTCTTCTAGGGAAGTTTT^ 

ATAGTGTTTTGGAGGAGCTCCTTGAGGAAGGAGAGAAACCTAACAAGGGCAAGAAGAAAT 
GAGCGAGATATAATTCATGATTATTTCTAA 

>G1795 Amino Acid Sequence (domain in AA coordinates: 12-80) 
MDQGGRGVGAEHGKYRGVRRRPWGKYAAEIRDSRKHGERVWLGTFDTAEEAARAYDQAAY
SMRGQAAILNFPHEYNMGSGVSSSTAMAGSSSASASASSSSREVFEFEYLDDSVLEELLE
EGEKPNKGKKK*
>G1800 (61..894)

CCATTATCATATCCTeTTCTTCCTTCTTCACTATCAATCTTCTTCTCCACTACAACACAA 
ATGGAGAAATCATCCTCAATGAAACAATGGAAGAAGGGTCCTGCTCGGGGTAAAGGCGGT 
CCACAAAACGCTCTTTGTCAGTACCGTGGAGT^ 

GCTGAGATGAGAGAGCCCAAGAAGAGGGCAAGACTTTGGCTTGGCTCTTTCGCTACAGCT
GAAGAAGCAGCTATGGCTTATGATGAGGCTGCCTTGAAACTCTATGGGCACGACGCATAC 
CTCAACTTACCTCATCTTCAGCGGAATACAAGACCTTCTCTGAGTAACTCTCAGAGGTTC 
AAATGGGTACCTTCAAGGAAGTTTATATCTATGTTTCCTTCATGTGGTATGCTAAACGTG 
AATGCTCAGCCTAGTGTTCACATAATCCAGCAAAGACTAGAAGAACTCAAGAAAACTGGA
CTTTTATCTCAATCCTATTCTTCTAGTTCTTCCTCCACCGAATCAAAAACTAATACTAGC
TTTCTTGATGAGAAGACCAGCAAGGGAGAAACAGACAATATGTTCGAAGGTGGTGATCAG 
AAGAAACCAGAGATCGACCTGACCGAGTTTCTTCAGCAACTAGGAATCTTGAAGGATGAA 
AATGAAGCAGAACCAAGTGAGGTAGCAGAGTGTCATTCCCCTCCACCATGGAACGAGCAA 
GAAGAAACTGGAAGTCCTTTCAGAACTGAGAATTTCAGCTGGGATACCCTGATCGAGATG 
CCAAGAAGTGAAACCACAACTATGCAATTTGACTCCAGCAACTTCGGAAGCTATGATTTT 
GAGGATGATGTATCCTTCCCTTCCATCTGGGACTACTACGGAAGCTTAGATTGAGTAAAA 
GCAATTTAAGGTAGATCAAGATTCAGAAGTACACAAATGGTTTTGGATTTAGTGTAGCGT 
TTTGGAAAAGAGACATAGGTAGTGAGAGTGCAGTCTTTTATTATGCAGCAATAAAGTGAG 
TCAGTGTACAACCGAGTTGTTCGCTTTTTTTGGTATATTAATGAAGCATGTTCATTTTTT 
CGCTAAAAAAAAAAAAAAAAAAAAAAA 

>G1800 Amino Acid Sequence (domain in AA coordinates: TBD) 

MEKSSSMKQWKKGPARGKGGPQNALCQYRGVRQRTWGKWVAEIREPKKRARLWLGSFATA

EEAAMAYDEAALKLYGHDAYLNLPHLQRNTRPSLSNSQRFKWVPSRKFISMFPSCGMLNV 

NAQPSVHIIQQRLEELKKTGLLSQSYSSSSSSTESKTNTSFLDEKTSKGETDNMFEGGDQ 

KKPEIDLTEFLQQLGILKDENEAEPSEVAECHSPPPWNEQEETGSPFRTENFSWDTLIEM

PRSETTTMQFDSSNFGSYDFEDDVSFPSIWDYYGSLD* 

>G1806 (1..1356) 

ATGCAGAGCAGCTTCAAAACCGTTCCTTTCACTCCTGATTTCTACTCTCAATCCTCTTAC
TTCTTCAGAGGAGATAGTTGTCTTGAGGAGTTTCATCAACCAGTCAATGGTTTTCACCAT 
GAAGAAGCTATCGATTTAAGTCCAAATGTCACTATTGCTTCAGCTAACTTACACTACACG 
ACGTTTGATACGGTTATGGATTGTGGTGGTGGTGGTGGTGGCTTGAGGGAGAGACTTGAA 
GGAGGAGAAGAGGAGTGTTTGGACACAGGGCAATTAGTGTACCAGAAAGGGACAAGATTA 
GTAGGAGGAGGAGTAGGAGAAGTGAACAGCAGTTGGTGTGATTCGGTTTCAGCTATGGCT 
GATAACAGTCAACATACTGACACTTCCACAGATATTGATACTGATGACAAGACTCAGTTG
AATGGAGGTCATCAAGGGATGCTATTGGCTACAAATTGTTCAGATCAATCCAATGTGAAA 
TCTAGTGATCAAAGGACACTTCGTCGACTTGCTCAGAACCGGGAGGCTGCTAGGAAAAGT 
CGGTTGAGGAAAAAGGCCTATGTTCAGCAACTTGAGAATAGTCGAATCAGGCTTGCACAG 
CTAGAGGAAGAGCTCAAAAGAGCTCGCCAACAGGGATCTTTGGTTGAAAGAGGAGTTTCA 
GCGGATCACACGCATTTGGCAGC^GGAAATGGTGTCTTTTCATTTGAATTGGAATATACA
CGTTGGAAGGAGGAACATCAAAGAATGATCAACGACTTAAGATCGGGTGTGAATTCGCAG 
TTAGGTGACAACGATCTACGCGTTCTAGTGGATGCTGTGATGAGTCACTATGATGAAATA 
TTCAGGCTAAAGGGAATTGGCACTAAAGTTGAAGTCTTTCATATGCTCTCAGGCATGTGG 
AAGACACCTGCCGAGAGATTTTTCATGTGGTTAGGTGGATTTAGATCATCAGAGTTACTT 
AAGATATTGGGGAACCATGTGGATCCATTGACGGACCAGCAGTTGATAGGCATTTGCAAC 
CTTCAGCAATCGTCTCAACAAGCAGAGGATGCATTGTCACAAGGCATGGAAGCTCTACAA 
CAATCACTTCTCGAGACGCTTTCTTCTGCTTCTATGGGTCCAAACTCTTCAGCAAATGTT 
GCAGATTATATGGGTCATATGGCTATGGCTATGGGCAAACTTGGCACTCTTGAAAACTTC 
CTTCGCCAGGCTGATTTATTGAGGCAACAAACTCTGCAACAGCTTCACAGAATTCTCACC 
ACACGACAAGCTGCTCGCGCCTTTTTGGTCATCCACGATTATATTTCTCGGCTTAGAGCA 
CTTAGCTCTCTATGGTTAGCCAGACCTAGAGACTAA 

>G1806 Amino Acid Sequence (domain in AA coordinates: 165-225)

MQSSFKTVPFTPDFYSQSSYFFRGDSCLEEFHQPVNGFHHEEAIDLSPNVTIASANLHYT

TFDTVMDCGGGGGGLRERLEGGEEECLDTGQLVYQKGTRLVGGGVGEVNSSWCDSVSAMA 

DNSQHTDTSTDIDTDDKTQLNGGHQGMLLATNCSDQSNVKSSDQRTLRRLAQNREAARKS

RLRKKAYVQQLENSRIRLAQLEEELKRARQQGSLVERGVSADHTHLAAGNGVFSFELEYT 

RWKEEHQRMINDLRSGVNSQLGDNDLRVLVDAVMSHYDEIFRLKGIGTKVEVFHMLSGMW

KTPAERFFMWLGGFRSSELLKILGNHVDPLTDQQLIGICNLQQSSQQAEDALSQGMEALQ
QSLLETLSSASMGPNSSANVADYMGHMAMAMGKLGTLENFLRQADLLRQQTLQQLHRILT
TRQAARAFLVIHDYISRLRALSSLWLARPRD* 
>G1811 (93..827)

AAAGGAGCATTGGTATCTGAAACAATATTTGCC^ 

TTGCCATCTCTTTCTCTCTCCCTCTCTTT(^AATGTCAATAAACCAATACTCAAGCGATT 
TCCACTAC(^TTCTCTCATGTGGCAACAACAGC^ 

TCGTGGAAGAAAAAGAAGCTCTTTTCGAGAAACCCTTAACCCOU^GTGACGTCGGAAAAC 
TCAACCGCCTCGTCATCCCAAAACAGCACGCCGAGAGATACTTCCCACTAGCGGCCGCCG 
CCGCAGACGCCGTGGAGAAAGGACTTCTCCTCTGCTTTGAGGACGAGGAAGGTAAACCAT 
GGAGATTCAGATACTCGTACTGGAACAGTAGCCAGAGTTATGTCTTGACCAAAGGCTGGA
GCAGATACGTCAAGGAGAAGCACCTTGACGCCGGAGACGTCGTTCTCTTCCATCGACACC 
GTTCAGACGGCGGAAGATTCTTCATTGGCTGGAGAAGACGCGGTGACTCTTCTTCCTCCT 
CCGACTCTTATCGCCATGTTCAATCCAATGCCTCGCTCCAATATTATCCTCATGCAGGGG 
CTCAAGCGGTGGAGAGCCAAAGAGGCAACTCGAAGACATTAAGACTGTTCGGAGTGAACA 
TGGAGTGCCAGCTAGATTCGGACTGGTCCGAGCCATCCACACCTGACGGTTCTAACACAT 
ATACAACCAATCACGACCAGTTTCATTTCTACCCTCAACAACAACACTATCCTCCTCCGT 
ACTACATGGACATAAGTTTCACAGGAGATATGAACCGGACGAGCTAGAAGCCCACAAGGA
TTAAAAAAAAGCTTCACATCTGGTCCTGTTATGTTGTCATAGATGTTGATTCCTTAATTT 
TACACAAGCTTCATTTTGCATTATTTAAAGTAAAATCGTATTTTGATTCTTCTTTAAATC 
TCTCTCAATTTTCACTCTCTTCCTTTTTCTTCTTATGTATTAGATTCTTTTACATAGCTA 
ACACTTGTATAGAGAATTCAAAGTTCTGGCTATTTTCGAAAGTTATCTTTTCTCTTAAAA 
AAAAAAA 

>G1811 Amino Acid Sequence (domain in AA coordinates: TBD) 
MSINQYSSDFHYHSLMWQQQQQQQQHQNDVVEEKEALFEKPLTPSDVGKLNRLVIPKQHA
ERYFPLAAAAADAVEKGLLLCFEDEEGKPWRFRYSYWNSSQSYVLTKGWSRYVKEKHLDA
GDVVLFHRHRSDGGRFFIGWRRRGDSSSSSDSYRHVQSNASLQYYPHAGAQAVESQRGNS
KTLRLFGVNMECQLDSDWSEPSTPDGSNTYTTNHDQFHFYPQQQHYPPPYYMDISFTGDM 

NRTS* 

>G182 (74.. 1366) 

CGTCGACGATCAGATTCTTGCGTATAGCTGTATATATACACCAAGATACACTCATCATCG 

TCATATATAGATTATGTGCAGCGTCTCTGAGCTTCTTGACATGGAAAACTTCCAAGGAGA 

CTTAACCGACGTCGTACGAGGAATCGGAGGCCACGTGTTATCACCGGAGACTCCTCCCTC 

GAACATCTGGCCTCTTCCTCTGTCACATCCAACACCATCACCGTCAGATCTTAACATAAA 

CCCCTTCGGAGATCCCTTTGTGAGCATGGACGATCCACTCCTCCAAGAACTAAACTCCAT 

CACAAACTCCGGCTATTTCTCCACCGTAGGAGATAACAACAACAACATTCACAACAACAA 

TGGTTTCTTGGTTCCAAAGGTATTTGAGGAGGATCATATAAAGAGTCAATGTAGTATCTT 

CCCAAGAATCCGGATCTCGCATAGTAACATCATCCACGATTCTTCTCCGTGTAATTCTCC 

GGCCATGTCGGCTCACGTTGTCGCAGCCGCAGCAGCCGCCTCGCCGAGAGGCATCATCAA 

CGTAGACACAAACAGTCCTAGAAACTGTCTATTGGTTGATGGTACCACGTTCTCCTCGCA 

GATTCAGATATCTTCCCCTCGGAATCTAGGCCTTAAAAGAAGGAAGAGTCAGGCAAAGAA 

GGTGGTGTGTATTCCGGCCCCGGCTGCAATGAACAGCCGATCAAGCGGAGAAGTGGTTCC 

ATCGGATCTATGGGCTTGGCGTAAATACGGTCAAAAACCTATCAAAGGCTCTCCTTTTCC 

AAGGGGTTATTATAGATGCAGCAGCTCAAAAGGTTGTTCAGCAAGAAAGCAAGTCGAAAG 

AAGCCGAACCGATCCAAACATGTTGGTGATTACATATACCTCCGAACATAACCATCCTTG 

GCCCATCCAACGCAACGCTCTCGCCGGCTCCACACGCTCCTCCACCTCCTCCTCATCTAA 

CCCTAATCCTTCCAAACCCTCAACCGCAAACGTAAACTCCTCATCCATTGGCTCCCAAAA 

CACCATCTACTTGCCTTCCTCCACCACTCCTCCTCCTACCCTCTCATCCTCCGCCATCAA 

AGATGAACGAGGGGACGATATGGAGTTGGAAAACGTAGATGATGATGATGATAACCAGAT 

TGCTCCATACAGACCGGAGCTTCATGATCATCAGCACCAACCAGATGATTTCTTTGCAGA 

TCTTGAAGAGCTAGAAGGAGATTCTCTAAGCATGTTGCTTTCTCATGGCTGTGGCGGCGA 

CGGGAAGGATAAAACGACCGCGTCCGATGGGATCAGCAATTTCTTCGGGTGGTCGGGAGA 

TAATAATTATAATAATTACGACGACCAAGACTCAAGGTCGTTATAGTATAGTGTTAATTA 

CAGGTAAACAAATTATATTAAATTAAGTTGAGCH^ 

GTCAGGTTGGGGGC 

>G182 Amino Acid Sequence (conserved domain in AA coordinates :217-276) 

MCSVSELLDMENFQGDLTDVVRGIGGHVLSPETPPSNIWPLPLSHPTPSPSDLNINPFGD

PFVSMDDPLLQELNSITNSGYFSTVGDNNNNIHNNNGFLVPKVFEEDHIKSQCSIFPRIR

ISHSNIIHDSSPCNSPAMSAHVVAAAAAASPRGIINVDTNSPRNCLLVDGTTFSSQIQIS

SPRNLGLKRRKSQAKKVVCIPAPAAMNSRSSGEVVPSDLWAWRKYGQKPIKGSPFPRGYY

RCSSSKGCSARKQVERSRTDPNMLVITYTSEHNHPWPIQRNALAGSTRSSTSSSSNPNPS 

KPSTANVNSSSIGSQNTIYLPSSTTPPPTLSSSAIKDERGDDMELENVDDDDDNQIAPYR

PELHDHQHQPDDFFADLEELEGDSLSMLLSHGCGGDGKDKTTASDGISNFFGWSGDNNYN 

NYDDQDSRSL* 

>G1835 (1..969) 

ATGATTGGAACAAGCTTCCCCGAGGATCTTGATTGTGGCAACTTCTTTGACAACATGGAT 
GATCTCATGGACTTTCCCGGTGGAGATATCGATGTCGGTTTCGGCATAGGTGACTCCGAC 
TCTTTCCCTACCATCTGGACCACTCATCACGACACGTGGCCTGCCGCTTCTGATCCTCTC
TTCTCTTCCAACACCAACTCTGATTCATCACCTGAGCTCTATGTTCCGTTTGAGGACATT 
GTTAAGGTGGAAAGACCTCCAAGCTTTGTAGAGGAAACATTGGTTGAGAAGAAGGAAGAT 
TCGTTTTCGACAAACACTGATTCATCATCTTCTCATAGCCAATTCAGGAGCTCAAGTCCA 
GTGTCGGTTCTCGAGAGCAGCTCCTCCTCGTCTCAAACCACCAACACAACCTCCCTTGTT 
CTCCCTGGAAAGCACGGTCGTCCACGCACAAAACGCCCTCGTCCACCTGTCCAGGATAAA 
GATAGAGTCAAAGACAATGTGTGCGGTGGTGACTCGCGCCTCATCATTAGAATACCGAAA 
CAGTTTCTCTCTGATCACAACAAGATGATCAACAAGAAGAAGAAGAAGAAGGCCAAGATT 
ACTTCTTCCTCTTCTTCGTCCGGGATTGATCTTGAAGTCAATGGAAACAACGTCGATTCG 
TATTCTTCAGAGCAATATCCGCTTAGGAAATGTATGCACTGTGAGGTCACCAAGACTCCA 
CAGTGGAGGCTTGGTCCAATGGGTCCAAAGACACTTTGCAATGCGTGCGGTGTACGTTAC 
AAATCAGGGAGGCTTTTCCCGGAGTACCGTCCAGCTGCTAGTCCAACATTTACTCCAGCT 
CTTCACTCAAACTCACACAAGAAAGTGGCTGAAATGAGAAACAAGAGATGCAGTGATGGT 
AGCTACATAACCGAAGAGAATGATCTGCAAGGGCTGATTCCGAACAATGCCTACATTGGC
GTAGACTAA 

>G1835 Amino Acid Sequence (domain in AA coordinates: 224-296) 

MIGTSFPEDLDCGNFFDNMDDLMDFPGGDIDVGFGIGDSDSFPTIWTTHHDTWPAASDPL

FSSNTNSDSSPELYVPFEDIVKVERPPSFVEETLVEKKEDSFSTNTDSSSSHSQFRSSSP 

VSVLESSSSSSQTTNTTSLVLPGKHGRPRTKRPRPPVQDKDRVKDNVCGGDSRLIIRIPK 

QFLSDHNKMINKKKKKKAKITSSSSSSGIDLEVNGNNVDSYSSEQYPLRKCMHCEVTKTP

QWRLGPMGPKTLCNACGVRYKSGRLFPEYRPAASPTFTPALHSNSHKKVAEMRNKRCSDG 

SYITEENDLQGLIPNNAYIGVD*

>G1836 (47.. 610) 

ATAACAAGCCTAGAACACTAGAAACTTCAAAAAAGAAAAAAATCTTATGGAGAACAACAA 
CGGCAACAACCAGCTGCCACCGAAAGGTAACGAGCAACTGAAGAGTTTCTGGTCAAAAGA 
GATGGAAGGTAACTTAGATTTCAAAAATCACGACCTTCCTATAACTCGTATCAAGAAGAT 
TATGAAGTATGATCCGGATGTGACTATGATAGCTAGTGAGGCTCCAATCCTCCTCTCGAA 
AGCATGTGAGATGTTTATCATGGATCTCACGATGCGTTCGTGGCTCCATGCTCAGGAAAG 
CAAACGAGTCACGCTACAGAAATCTAATGTCGATGCCGCAGTGGCTCAAACTGTTATCTT 
TGATTTCTTGCTTGATGATGACATTGAGGTAAAGAGAGAGTCTGTTGCCGCCGCTGCTGA 
TCCTGTGGCCATGCCACCTATTGACGATGGAGAGCTGCCTCCAGGAATGGTAATTGGAAC 
TCCTGTTTGTTGTAGTCTTGGAATCCACCAACCACAACCACAAATGCAGGCATGGCCTGG 
AGCTTGGACCTCGGTGTCTGGTGAGGAGGAAGAAGCGCGTGGGAAAAAAGGAGGTGACGA 
CGGAAACTAATAAGTGGAATACGTTTTAGGGTATTTTCAAGGGAATATGTAGTAT^ATAGT 
CATGGATC 

>G1836 Amino Acid Sequence (domain in AA coordinates: 30-164) 
MENNNGNNQLPPKGNEQLKSFWSKEMEGNLDFKNHDLPITRIKKIMKYDPDVTMIASEAP
ILLSKACEMFIMDLTMRSWLHAQESKRVTLQKSNVDAAVAQTVIFDFLLDDDIEVKRESV

AAAADPVAMPPIDDGELPPGMVIGTPVCCSLGIHQPQPQMQAWPGAWTSVSGEEEEARGK
KGGDDGN* 

>G1838 (132..1628)

TTCCTTGGCATTCTCTTTAGAACTTTCGTACAAAATGCAAAACCTGAACCTCTAAAGCTA 
AAAAAAT^GATTAGAGACTGTAACTGCTTTTATCAGATTTTCAACTAGGAAAAAAGTTAC 
AATCTTTTTTGATGGCTCCTCCAATGACGAATTGCTTAACGTTTTCTCTGTCACCAATGG 
AGATGTTGAAATCAACTGATCAGTCTCACTTCTCTTCTTCTTACGACGATTCTTCTACTC 
CTTATCTCATCGATAACTTCTATGCTTTCAAAGAAGAAGCTGAGATAGAAGCTGCTGCTG 
CTTCTATGGCGGATTCAACAACCTTATCTACTTTTTTCGATCATTCTCAGACTCAGATTC
CAAAGCTGGAAGATTTCCTCGGTGATTCCTTTGTCCGTTACTCTGATAACCAAACAGAGA
CCC^GACTCTTCTTCTCTCACTCCATTCTACGATCCACGTCACCGCACCGTTGCCGAAG 
GAGTTACAGGGTTCTTCTCTGATCATCATCAGCCAGATTTCAAGACGATAAACTCGGGAC 
CAGAAATCTTCGATGACTCAACAACTTCCAACATCGGTGGTACTCATCTCTCCAGTCACG 
TGGTGGAGTCATCAACGACGGCGAAGTTAGGGTTTAACGGTGATTGCACCACCACCGGAG 
GAGTTTTGTCTCTAGGGGTTAACAACACATCAGATGAACCT^^ 

AGAGAGGTGGAAACAGTAACAAGAAGAAAACAGTTTCTAAGAAGGAAACATCAGATGATT
CAAAGAAGAAGATTGTCGAAACATTGGGACAAAGAACTTCAATTTATCGTGGAGTCACCC 
GACATAGATGGACTGGAAGATACGAAGCGCATCTATGGGATAACAGCTGTAGGAGGGAAG 
GTCAAGCCAGAAAAGGACGTCAAGTGTACTTAGGTGGATATGACAAGGAAGATAGAGCAG 
CTAGAGCCTATGACTTGGCAGCTTTAAAATACTGGGGTTCTACTGCTACTACAAATTTTC
CGGTCTCGAGTTATTCAAAAGAACTTGAGGAAATGAATCACATGACCAAGCAAGAGTTTA 
TTGCATCTCTTAGGAGGAAAAGTAGCGGTTTTTCGAGAGGAGCTTCAATATATAGAGGTG 
TCACAAGGCATCATCAACAAGGTCGCTGGCAAGCAAGAATCGGCCGTGTCGCAGGAAACA 
AAGATCTTTACCTCGGAACCTTTGCAACCGAAGAGGAAGCAGCAGAGGCTTATGACATTG 
CAGCCATAAAGTTCAGAGGAATCAACGCAGTAACTAACTTTGAGATGAACAGGTATGACA 
TTGAAGCTGTCATGAATAGTTCTTTACCTGTAGGAGGAGCAGCTGCGAAACGCCACAAAC 
TCAAACTCGCTCTTGAATCTCCTTCTTCATCATCCTCTGACCATAACCTCCAACAACAAC 
AGTTGCTTCCGTCCTCTTCTCCCTCGGATCAAAACCCTAACTCAATCCCATGTGGCATTC 
CATTTGAGCCTTCAGTTCTCTATTACCACCAGAACTTCTTTCAGCATTATCCTTTGGTCT 
CTGACTCTACAATTCAAGCTCCTATGAACCAAGCTGAGTTTTTCTTGTGGCCTAACCAGT 
CTTACTAAATCATTTGGTTCGTTCTTGCTTAGACTTCTATTCACCGCACTAACCGATGAC 
CCGAGGCTTATCTTCTTGATTCTGGCTATAAGGATGAATCTTTCAAGTTCCTTTTTT7^AC 
TGTAGGCTAAGACAGAAGTAGAGGGGAGAAAAGTTGAAGAATCTGAAACTTTTGGGGTCA 
ATTTTGTATTAATGTTTTTCTTTTGTCAAGGGTGGATTATCGGTTTTATTACTTATTTTT 
TGAATGTAATCGGCCTATAACGGTATAACTCTGTTTCCATTTATGAATATTTTTCTCAAA 
TTGAAAAAAAAAAAAAAAAAA 

>G1838 Amino Acid Sequence (conserved domain in AA coordinates : 229-305 , 330-400) 

MAPPMTNCLTFSLSPMEMLKSTDQSHFSSSYDDSSTPYLIDNFYAFKEEAEIEAAAASMA 

DSTTLSTFFDHSQTQIPKLEDFLGDSFVRYSDNQTETQDSSSLTPFYDPRHRTVAEGVTG 

FFSDHHQPDFKTINSGPEIFDDSTTSNIGGTHLSSHVVESSTTAKLGFNGDCTTTGGVLS

LGVNNTSDQPLSCNNGERGGNSNKKKTVSKKETSDDSKKKIVETLGQRTSIYRGVTRHRW 

TGRYEAHLWDNSCRREGQARKGRQVYLGGYDKEDRAARAYDLAALKYWGSTATTNFPVSS

YSKELEEMNHMTKQEFIASLRRKSSGFSRGASIYRGVTRHHQQGRWQARIGRVAGNKDLY 

LGTFATEEEAAEAYDIAAIKFRGINAVTNFEMNRYDIEAVMNSSLPVGGAAAKRHKLKLA 

LESPSSSSSDHNLQQQQLLPSSSPSDQNPNSIPCGIPFEPSVLYYHQNFFQHYPLVSDST 

IQAPMNQAEFFLWPNQSY* 

>G1843 (51.. 653) 

CAGACATCACAATCAAATTAGGTCAGAAGAATTAGTCGGAGAAAACAGCCATGGGAAGAA 
GAAAAGTAGAGATCAAACGAATTGAGAACAAAAGCTCTCGACAAGTTACTTTCTGTAAAC 
GACGAAATGGTCTCATGGAGAAAGCTCGTCAACTCTCAATTCTTTGTGAATCCTCCGTCG 
CTCTTATCATCATCTCTGCCACCGGAAGACTCTACAGCTTCTCCTCAGGTGATAGCATGG 
CCAAGATCCTCAGTCGTTATGAATTAGAACAGGCTGATGATCTTAAAACCTTGGATCTAG 
AAGAAAAAACTCTTAATTATCTTTCGCACAAGGAGTTGCTAGAAACAATCCAATGCAAGA 
TTGAAGAAGCGAAAAGCGATAATGTAAGTATAGATTGTCTAAAGTCCCTGGAAGAGCAGC 
TCAAGACTGCTCTGTCTGTAACTAGAGCTAGGAAGACAGAACTAATGATGGAGCTTGTGA 
AGACCCATCAAGAGAAGGAGAAGCTGCTGAGAGAGGAGAACCAGAGTTTGACTAACCAGC 
TTATAAAGATGGGGAAGATGAAGAAGTCTGTGGAAGCAGAGGATGCAAGAGCAATGTCAC 
CGGAAAGTAGCTCTGACAACAAGCCACCGGAGACTCTCCTGCTTCTCAAGTAACCACCAT 
CACCAACGACTGATTCGAAAAATAAAAATTGTAAAAATTATGATTTGTAGTTCATAAGGA 
AAGCTAC^TACTGTATGTTAAA7U\TCCTCTTCTTCCCCCrGCTACGGAAAAGT(^TC(^A 
GGAGATGCATCAAATAAAGTAATTGATTTTTATTGTTA 

>G1843 Amino Acid Sequence (domain in AA coordinates: 2-57) 
MGRRKVEIKRIENKSSRQVTFCKRRNGLMEKARQLSILCESSVALIIISATGRLYSFSSG
DSMAKILSRYELEQADDLKTLDLEEKTLNYLSHKELLETIQCKIEEAKSDNVSIDCLKSL

EEQLKTALSVTRARKTELMMELVKTHQEKEKLLREENQSLTNQLIKMGKMKKSVEAEDAR

AMSPESSSDNKPPETLLLLK* 

>G1853 (1..1860)

ATGAGAGGTTCTTGGTACAAGAGTGTTTCCTCTGTTTTTGGTCTCAGACCACGGATCAGA 

TCTAATTCGTATGATTCTTCGTCAAGTTCGACACTTGTGCCGAACATTTATAGTAACTAT 
AGGAGGATAAAGGAGCAAGCTGCTGTTGATTATCTTGATCTGAGGTCTCTTTCTTTAGGG 
GCTAGTTTAT^GAGTTTCCTTTTTGTGGTAAAGAAAGAGAAAGTTATGTGCCTTGTTAT 
AACATAACTGGGAATTTGCTTGCTGGGCTTCAAGAGGGTGAGGAGTTAGATCGACATTGC
GAGTTTGAAAGAGAGAAGGAAAGATGTGTAGTTCGTCCTCCGAGAGATTATAAAATACCA 
CTTAGGTGGCCACTTGGTAGAGATATCATATGGAGTGGGAACGTGAAGATTACCAAAGAC 
CAGTTTCTTTCTTC^GGAACTGTGACAACGAGGTTAATGTTGCTTGAAGAGAATCAAATA
ACCTTTCACTCGGAGGACGGCCTGGTCTTTGATGGGGTCAAAGACTATGCTCGTCAAATT

GCTGAGATGATAGGTTTAGGAAGTGATACTGAATTTGCTCAAGCGGGTGTACGGACTGTG 

TTAGACATTGGTTGCGGATTTGGTAGCTTTGGTGCTCATTTAGTGTCTTTGAAGCTGATG 

CCTATATGTATTGCTGAGTATGAGGCAACTGGGAGCCAAGTTCAGTTAGCTCTAGAGAGA 

GGCCTTCCTGCAATGATTGGCAATTTCTTTTCAAAACAGCTTCCTTATCCAGCACTGTCT 

TTTGATATGGTCCATTGTGCTCAATGTGGCACTACTTGGGATATCAAAGATGCAATGCTA 

CTTTTGGAAGTGGATCGTGTTCTGAAACCCGGGGGATACTTTGTTTTAACTTCTCCCACA 

AACAAAGCACAGGGAAACTTACCAGATACCAAGAAAACGAGCATCTCAACACGGGTGAAT 

GAGTTATCTAAGAAAATCTGTTGGAGTCTAACAGCTCAGCAGGATGAGACGTTTCTTTGG 

CAGAAAACTTCTGATTCAAGTTGCTATTCTTCTCGTTCGCAAGCTTCTATACCTCTTTGC 

AAAGATGGAGATAGCGTTCCGTATTACCACCCATTGGTTCCATGTATAAGCGGAACCACG 

AGTAAACGCTGGATTTCTATACAGAACAGGTCTGCTGTTGCAGGAACAACCTCTGCCGGG 

CTTGAAATTCATGGTTTAAAACCGGAAGAATTCTTCGAGGATACACAAATATGGAGATCA 

GCTCTGAAAAACTATTGGTCCTTGCTTACACCTCTAATTTTCTCTGACCATCCGAAGAGA 

CCCGGTGATGAGGATCCTCTCCCGCCTTTCAACATGATACGCAATGTGATGGACATGCAT 

GCTCGTTTTGGGAATTTAAATGCCGCTTTACTCGACGAAGGAAAATCTGCTTGGGTAATG 

AACGTCGTCCCAGTCAATGCACGTAATACTCTTCCTATCATACTTGATCGTGGTTTCGCC 

GGTGTTCTACATGACTGGTGTGAACCATTCCCGACATATCCTCGAACATATGACATGCTT 

CATGCCAATGAACTTCTCACACATCTTAGCTCAGAACGATGCAGCCTAATGGACTTGTTC 

TTGGAGATGGACCGGATTCTTCGCCCTGAGGGATGGGTTGTTCTAAGCGACAAAGTGGGA 

GTAATCGAGATGGCTCGAGCACTTGCAGCTCGAGTGCGTTGGGAAGCAAGAGTCATTGAT 

CTTCAAGATGGTAGTGACCAAAGACTTCTCGTCTGTCAAAAACCATTCATCAAAAAATAA 

>G1853 Amino Acid Sequence (domain in AA coordinates: entire protein) 

MRGSWYKSVSSVFGLRPRIRGLLFFIVGVVALVTILAPLTSNSYDSSSSSTLVPNIYSNY

RRIKEQAAVDYLDLRSLSLGASLKEFPFCGKERESYVPCYNITGNLLAGLQEGEELDRHC 

EFEREKERCVVRPPRDYKIPLRWPLGRDIIWSGNVKITKDQFLSSGTVTTRLMLLEENQI

TFHSEDGLVFDGVKDYARQIAEMIGLGSDTEFAQAGVRTVLDIGCGFGSFGAHLVSLKLM

PICIAEYEATGSQVQLALERGLPAMIGNFFSKQLPYPALSFDMVHCAQCGTTWDIKDAML 

LLEVDRVLKPGGYFVLTSPTNKAQGNLPDTKKTSISTRVNELSKKICWSLTAQQDETFLW 

QKTSDSSCYSSRSQASIPLCKDGDSVPYYHPLVPCISGTTSKRWISIQNRSAVAGTTSAG 

LEIHGLKPEEFFEDTQIWRSALKNYWSLLTPLIFSDHPKRPGDEDPLPPFNMIRNVMDMH

ARFGNLNAALLDEGKSAWVMNVVPVNARNTLPIILDRGFAGVLHDWCEPFPTYPRTYDML

HANELLTHLSSERCSLMDLFLEMDRILRPEGWVVLSDKVGVIEMARALAARVRWEARVID
LQDGSDQRLLVCQKPFIKK*
>G1855 (1..1902) 

ATGGCGAAAGAGAACAGTGGTCATCATCACCAAACAGAAGCAAGAAGAAAGAAACTAACT 
TTGATTCTTGGTGTAAGTGGACTCTGCATTTTGTTCTATGTTTTAGGTGCATGGCAAGCC 
AATACCGTCCCATCTTCTATCTCGAAGCTCGGATGCGAGACGCAATCAAACCCTTCTTCG 
TCCTCTTCCTCTTCCTCATCTTCAGAGTCAGCTGAACTAGATTTCAAAAGCCATAATCAG
ATTGAGTTAAAGGAAACAAACCAAACCATTAAGTACTTTGAACCATGTGAATTATCTCTC 
AGTGAGTACACTCCTTGTGAAGACCGACAAAGAGGAAGAAGATTCGATAGGAACATGATG 
AAATATAGAGAAAGACATTGTCCTGTAAAAGATGAGCTTCTTTATTGTTTGATTCCTCCT 
CGACCAAACTACAAGATTCCATTTAAATGGCCACAAAGTAGAGACTATGCTTGGTATGAC 
AATATCCCTCACAAGGAACTTAGTGTTGAGAAAGCAGTTCAAAACTGGATTCAAGTTG^ 
GGTGACCGCTTTAGATTCCCTGGTGGTGGTACTATGTTTCCTCGTGGAGCTGATGCTTAT 
ATCGATGATATTGCTAGGCTTATTCCTCTTACTGATGGTGGAATCAGAACAGCTATTGAC 
ACTGGATGTGGTGTTGCAAGTTTTGGTGCTTACCTCTT^ 

TCTTTTGCTCCAAGAGACACTCATGAAGCTCAGGTACAGTTTGCTTTAGAACGCGGAGTT 

CCTGCGATAATCGGGATTATGGGATCAAGAAGACTTCCTTATCC^ 

CTTGCTCATTGTTCTCGTTGTTTGATCCCTTGGTTTAAAAATGA 

GAGGTCGACCGGGTTTTAAGACCGGGCGGTTACTGGATCCTCTCGGGACCACCGATTAAC 
TGGAAACAGTACTGGAGAGGGTGGGAGAGAACAGAGGAGGATTTGAAGAAAGAGCAAGAT 
TCAATAGAAGATGTAGCAAAGAGTCTTTGCTGGAAGAAAGTAACTGAAAAAGGTGACTTA 
TCAATTTGGCAAAAGCCTCTCAATCACATTGAGTGT^ 

TC^CCTCCGATATGCAGCTCAGATAACGCGGATTCCGCTTGGTACAAAGACTTGGAAACT 
TGTATAACACCATTACCAGAAACAAACAATCCAGATGATTCAGCAGGCGGTGCACTCGAG 
GATTGGCCAGACCGAGCATTCGCGGTACCTCCAAGAATCATCAGAGGAACTATACCAGAA 
ATGAACGCGGAGAAATTTAGAGAAGACAACGAGGTTTGGAAAGAGAGAATAGCACATTAC
AAGAAGATAGTCCCTGAGCTTTCACATGGAAGATTCAGGAACATTATGGACATGAACGCT 
TTTCTCGGCGGATTCGCTGCTTCCATGCTGAAATATCCCTCATGGGTCATGAACGTTGTC 
CCGGTCGATGCAGAGAAACAAACGTTAGGTGTGATCTACGAACGTGGATTGATAGGGACG 
TATCAAGATTGGTGTGAAGGATTCTCAACGTATCCAAGAACTTATGATATGATTCATGCA
GGAGGATTGTTCAGCTTATACGAACATAGGTGTGATTTGACGTTGATATTGTTGGAGATG 
GATCGAATTTTGAGACCAGAAGGAACAGTTGTGTTGAGAGATAATGTGGAGACGTTGAAT 
AAGGTAGAGAAGATAGTGAAGGGAATGAAGTGGAAGAGTCAAATTGTTGATCATGAGAAA 
GGTCCTTTTAATCCTGAGAAGATTCTTGTTGCTGTTAAAACTTATTGGACTGGTCAACCT
TCTGACAAGAACAACAACAACAACAACAACAACAACAACTAG 

>G1855 Amino Acid Sequence (domain in AA coordinates: entire protein) 

MAKENSGHHHQTEARRKKLTLILGVSGLCILFYVLGAWQANTVPSSISKLGCETQSNPSS

SSSSSSSSESAELDFKSHNQIELKETNQTIKYFEPCELSLSEYTPCEDRQRGRRFDRNMM

KYRERHCPVKDELLYCLIPPRPNYKIPFKWPQSRDYAWYDNIPHKELSVEKAVQNWIQVE

GDRFRFPGGGTMFPRGADAYIDDIARLIPLTDGGIRTAIDTGCGVASFGAYLLKRDIMAV

SFAPRDTHEAQVQFALERGVPAIIGIMGSRRLPYPARAFDLAHCSRCLIPWFKNDGLYLM

EVDRVLRPGGYWILSGPPINWKQYWRGWERTEEDLKKEQDSIEDVAKSLCWKKVTEKGDL

SIWQKPLNHIECKKLKQNNKSPPICSSDNADSAWYKDLETCITPLPETNNPDDSAGGALE 

DWPDRAFAVPPRIIRGTIPEMNAEKFREDNEVWKERIAHYKKIVPELSHGRFRNIMDMNA 

FLGGFAASMLKYPSWVMNVVPVDAEKQTLGVIYERGLIGTYQDWCEGFSTYPRTYDMIHA

GGLFSLYEHRCDLTLILLEMDRILRPEGTVVLRDNVETLNKVEKIVKGMKWKSQIVDHEK

GPFNPEKILVAVKTYWTGQPSDKNNNNNNNNNN*

>G187 (118..1074)

TAGACCTCTTAGGAAAAAAACCTAAAAACCTAATCCCCAAACCTAAAAGGCTTATCTCAT 
CTCTTCTTCTTTGTCTTCTTTACTCTTTTTTTACCTCTCTCTTCATTGTTCTTCACCATG 
TCTAATGAAACCAGAGATCTCTACAACTACCAATACCCTTCATCGTTTTCGTTGCACGAA 
ATGATGAATCTGCCTACTTCAAATCCATCTTCTTATGGAAACCTCCCATCACAAAACGGT 
TTTAATCCATCTACTTATTCCTTCACCGATTGTCTCCAAAGTTCTCCAGCAGCGTATGAA 
TCTCTACTTCAGAAAACTTTTGGTCTTTCTCCCTC 

ATCGATCAAGAACCGAACCGTGATGTTACTAATGACGTAATCAATGGTGGTGCATGCAAC 

GAGACTGAAACTAGGGTTTCTCCTTCTAATTCTTCCTCTAGTGAGGCTGATCACCCCGGT 

GAAGATTCCGGTAAGAGCCGGAGGAAACGAGAGTTAGTCGGTGAAGAAGATCAAATTTCC 

AAAAAAGTTGGGAAAACGAAAAAGACTGAGGTGAAGAAACAAAGAGAGCCACGAGTCTCG 

TTTATGACTAAAAGTGAAGTTGATCATCTTGAAGATGGTTATAGATGGAGAAAATACGGC 

CAAAAGGCTGTAAAAAATAGCCCTTATCCAAGGAGTTACTATAGATGTACAACACAAAAG 

TGCAACGTGAAGAAACGAGTGGAGAGATCGTTCCAAGATCCAACGGTTGTGATTACAACT 

TACGAGGGTCAACACAACCACCCGATTCCGACTAATCTTCGAGGAAGTTCTGCCGCGGCT 

GCTATGTTCTCCGCAGACCTCATGACTCCAAGAAGCTTTGCACATGATATGTTTAGGACG 

GCAGCTTATACTAACGGCGGTTCTGTGGCGGCGGCTTTGGATTATGGATATGGACAAAGT 

GGTTATGGTAGTGTGAATTCAAACCCTAGTTCTCACCAAGTGTATCATCAAGGGGGTGAG 

TATGAGCTCTTGAGGGAGATTTTTCCTTCAATTTTCTTTAAGCAAGAGCCTTGATCGATC 

ATTGTTATAACTACATATATTATATATATTGAGAGAGAGAGGTAGAGAAAT^AAAAA 

>G187 Amino Acid Sequence (domain in AA coordinates: 172-228) 

MSNETRDLYNYQYPSSFSLHEMMNLPTSNPSSYGNLPSQNGFNPSTYSFTDCLQSSPAAY 

ESLLQKTFGLSPSSSEVFNSSIDQEPNRDVTNDVINGGACNETETRVSPSNSSSSEADHP 

GEDSGKSRRKRELVGEEDQISKKVGKTKKTEVKKQREPRVSFMTKSEVDHLEDGYRWRKY 

GQKAVKNSPYPRSYYRCTTQKCNVKKRVERSFQDPTVVITTYEGQHNHPIPTNLRGSSAA 

AAMFSADLMTPRSFAHDMFRTAAYTNGGSVAAALDYGYGQSGYGSVNSNPSSHQVYHQGG 

EYELLREIFPSIFFKQEP* 

>G1881 (1..519) 

ATGCGAATTTTGTGTGATGCTTGTGAGAGCGCCGCCGCTATCGTCTTTTGCGCCGCCGAC 
GAAGCTGCCCTCTGTTGCTCCTGCGACGAAAAAGTTCATAAGTGCAACAAGCTGGCTAGT 
CGGCATCTTCGTGTAGGCTTAGCTGATCCGAGTAATGCACCAAGCTGTGACATATGCGAA 
AATGCACCCGCATTCTTTTACTGTGAGATAGATGGTAGTTCCCTTTGTCTACAATGTGAT 
ATGGTGGTACATGTTGGTGGGAAGAGAACACATAGGCGGTTTCTATTACTGAGACAGAGA 
ATTGAGTTTCCAGGCGATAAGCCTAATCATGCTGACCAACTGGGACTACGGTGTCAAAAG 
GCTTCCTCTGGTCGTGGTCAAGAATCAAATGGGAATGGTGATCATGATCATAATATGATC 
GATCTTAACTCCAATCCTCAAAGAGTACACGAGCCTGGATCACATAACCAAGAGGAGGGT 
ATTGATGTAAATAACGCAAACAATCACGAGCATGAATAG 

>G1881 Amino Acid Sequence (domain in AA coordinates: 5-28, 56-79) 

MRILCDACESAAAIVFCAADEAALCCSCDEKVHKCNKLASRHLRVGLADPSNAPSCDICE 

NAPAFFYCEIDGSSLCLQCDMVVHVGGKRTHRRFLLLRQRIEFPGDKPNHADQLGLRCQK 

ASSGRGQESNGNGDHDHNMIDLNSNPQRVHEPGSHNQEEGIDVNNANNHEHE* 

>G1882 (1..1200) 

ATGGTTTTTTCTTCATTTCCTACTTATCCTGATCATTCATCAAACTGGCAACAACAACAT 
CAACCAATCACAACCACCGTTGGATTCACGGGAAATAACATCAACCAACAGTTTCTTCCT 
CACCATCCCCTCCCACCGCAACAGCAACAAACGCCTCCGCAGCTTCACCACAACAACGGT 
AACGGCGGAGTCGCTGTTCCCGGTGGACCTGGCGGGTTAATCCGACCAGGTTCGATGGCG 
GAAAGAGCAAGGCTAGCCAACATACCATTACCTGAAACAGCCTTGAAGTGTCCAAGATGT 
GACTCAACTAACACCAAATTCTGTTACTTCAACAACTACAGTCTCACTCAACCTCGCCAC 
TTCTGCAAAGCATGCCGTCGTTACTGGACACGTGGCGGTGCTCTAAGGAGCGTTCCCGTC 
GGTGGCGGTTGCCGTAGAAACAAAAGAACCAAAAACAGCAGCGGTGGAGGTGGCGGTAGC 
ACCAGTAGCGGTAACAGCAAGTCACAAGACAGCGCCACGAGCAACGACCAATACCACCAC 
CGAGCCATGGCTAACAATCAGATGGGACCACCTTCTTCGTCATCGTCTCTAAGCTCGTTG 
CTGTCTTCTTACAACGCAGGGTTAATCCCCGGACATGATCATAACAGCAATAACAACAAC 
ATACTTGGACTTGGATCATCTTTGCCTCCTCTTAAGCTTATGCCTCCTTTAGACTTCACA 
GACAACTTCACCTTACAATACGGTGCCGTTTCAGCTCCTTCTTATCATATAGGCGGTGGA 
AGCAGTGGAGGAGCGGCGGCTCTTTTAAACGGTTTTGACCAGTGGAGATTCCCGGCAACA 
AACCAACTTCCTTTAGGCGGTTTAGACCCGTTTGATCAACAACATCAAATGGAGCAGCAG 
AATCCAGGTTACGGATTGGTTACCGGGTCGGGTCAGTATCGACCTAAGAACATTTTCCAT 
AACCTTATCTCCTCTTCTTCGTCTGCTTCATCAGCTATGGTTACAGCCACCGCGTCGCAA 
TTAGCTTCAGTGAAAATGGAAGATAGTAACAATCAGCTCAACTTGTCTAGACAACTTTTT 
GGAGACGAACAACAGCTCTGGAATATTCATGGCGCTGCTGCAGCATCCACCGCAGCTGCA 
ACAAGTTCGTGGAGTGAAGTCTCTAATAATTTCAGTTCTTCTTCTACTAGCAATATATAA 
>G1882 Amino Acid Sequence (domain in AA coordinates: 97-125) 
MVFSSFPTYPDHSSNWQQQHQPITTTVGFTGNNINQQFLPHHPLPPQQQQTPPQLHHNNG 
NGGVAVPGGPGGLIRPGSMAERARLANIPLPETALKCPRCDSTNTKFCYFNNYSLTQPRH 
FCKACRRYWTRGGALRSVPVGGGCRRNKRTKNSSGGGGGSTSSGNSKSQDSATSNDQYHH 
RAMANNQMGPPSSSSSLSSLLSSYNAGLIPGHDHNSNNNNILGLGSSLPPLKLMPPLDFT 

DNFTLQYGAVSAPSYHIGGGSSGGAAALLNGFDQWRFPATNQLPLGGLDPFDQQHQMEQQ 
NPGYGLVTGSGQYRPKNIFHNLISSSSSASSAMVTATASQLASVKMEDSNNQLNLSRQLF 
GDEQQLWNIHGAAAASTAAATSSWSEVSNNFSSSSTSNI* 
>G1883 (1..1110) 

ATGGACGCTACGAAGTGGACACAGGGTTTTCAAGAAATGATGAACGTTAAACCAATGGAG 
CAGATCATGATTCCTAATAACAACACACATCAACCAAACACCACATCCAATGCAAGGCCA 
AACACCATTCTCACATCTAACGGCGTCTCAACTGCTGGAGCAACCGTCTCCGGCGTAAGC 
AACAACAATAACAATACGGCGGTTGTGGCGGAGAGGTUyVGCAAGACCACAAGAGAAACTA 
AATTGTCCAAGATGCAACTCAACCAACACAAAGTTTTGTTACTACAACAACTATAGTCTC 
ACACAACCAAGATACTTCTGCAAAGGTTGTCGAAGGTATTGGACCGAAGGTGGATCTCTT 
AGGAATGTTCCTGTGGGAGGAAGCTCAAGAAAGAACAAGAGATCATCTTCATCTTCTTCA 
TCAAA(^TCCTTC^GACAATACCATCIT 

TCAAACCAAATCCATAATAAATCGAAAGGGTCATCAC^GATCTCAACTTGTTGTCTTTC 
CCAGTCATGCAAGATCAACATCATGATCATGTCG^^ 

AAGATGGAGGGAAATGGTAACATAACTCATC^GCAGC^GCCTTCAT(^TCTTCTTCTGTC 
TATGGTTCCTCGTCQTCTCCTGTTTCAGCTCT 

TCTTCAAGATCAGGGATTAACTCATCGTTCATGCCTTCCGGTTCAATGATGGATTCAAAC 
ACTGTGCTTTACACTTCTTCAGGGTTTCCAACAATGGTGGATTACAAGCCAAGTAATCTC 
TCCTTCTCTACCGATCATCAAGGGCTTGGACACAATAGCAACAATAGGTCTGAAGCTCTT 
CATAGTGATCATCACCAACAAGGTAGAGTTTTGTTTC 

CTTTCATCAAGCATAACACAAGAAGTTGATCATGATGATAATCAACAACAGAAGAGTCAT 
GGAAATAATAATAATAATAATAACTCAAGCCCTAATAATGGATATTGGAGTGGGATGTTC 
AGTACTACAGGAGGAGGATCTTCATGGTGA 

>G1883 Amino Acid Sequence (domain in aa coordinates: 82-124) 
MDATKWTQGFQEMMNVKPMEQIMIPNNNTHQPNTTSNARPNTILTSNGVSTAGATVSGVS 



NNNNNTAVVAERKARPQEKLNCPRCNSTNTKFCYYNNYSLTQPRYFCKGCRRYWTEGGSL 

RNVPVGGSSRKNKRSSSSSSSNILQTIPSSLPDLNPPILFSNQIHNKSKGSSQDLNLLSF 

PVMQDQHHHHVHMSQFLQMPKMEGNGNITHQQQPSSSSSVYGSSSSPVSALELLRTGVNV 

SSRSGINSSFMPSGSMMDSNTVLYTSSGFPTMVDYKPSNLSFSTDHQGLGHNSNNRSEAL 

HSDHHQQGRVLFPFGDQMKELSSSITQEVDHDDNQQQKSHGNNNNNNNSSPNNGYWSGMF 

STTGGGSSW* 

>G1884 (1..741) 

ATGATGACGTCATCCCATCAGAGCAACACCACCGGCTTTAAACCGCGGCGGATCAAGACG 
ACGGCGAAGCCACCACGTCAGATCAATAACAAAGAACCATCTCCGGCGACGCAGCCGGTG 
CTCAAGTGTCCGAGATGTGATTCAGTCAACACCAAATTCTGCTACTACAACAACTACAGC 
TTGTCTCAGCCACGTCACTACTGCAAGAACTGTCGTCGTTACTGGACACGTGGCGGCGCC 
CTCCGTAACGTTCCCATCGGTGGCTCCACTCGAAACAAGAACAAGCCTTGCAGCCTCCAA 
GTCATCTCTTCTCCTCCTTTGTTCTCGAACGGGACGTCATCGGCGTCTCGTGAGCTTGTA 
AGAAACCATCCATCGACGGCAATGATGATGATGAGTTCTGGTGGATTCTCCGGCTATATG 
TTTCCGTTGGATCCTAACTTCAACCTTGCCTCGTCTTCTATCGAGTCTTTGAGTTCTTTT 
AACCAAGATTTGCACCAGAAGCTTCAGCAACAAAGACTCGTCACTTCCATGTTTCTCCAA 
GATTCTCTTCCGGTTAACGAGAAAACGGTTATGTTTCAGAACGTAGAGTTGATTCCTCCT 
TCGACGGTGACGACGGATTGGGTTTTCGATAGGTTCGCCACTGGAGGAGGTGCAACAAGT 
GGCAATCATGAAGATAATGATGATGGGGAGGGTAATTTGGGAAATTGGTTCCATAATGCT 
AATAATAATGCTCTGCTCTAA 

>G1884 Amino Acid Sequence (domain in AA coordinates: 43-71) 

MMTSSHQSNTTGFKPRRIKTTAKPPRQINNKEPSPATQPVLKCPRCDSVNTKFCYYNNYS 

LSQPRHYCKNCRRYWTRGGALRNVPIGGSTRNKNKPCSLQVISSPPLFSNGTSSASRELV 

RNHPSTAMMMMSSGGFSGYMFPLDPNFNLASSSIESLSSFNQDLHQKLQQQRLVTSMFLQ 

DSLPVNEKTVMFQNVELIPPSTVTTDWVFDRFATGGGATSGNHEDNDDGEGNLGNWFHNA 

NNNALL* 

>G1891 (1..750) 

ATGGATAACTTGAATGTTTTCGCAAATGAAGACAATCAAGTGAATGATGTGAAGCCCCCA 
CCACCACCACCTCGAGTGTGTGCAAGGTGTGATTCTGATAATACTAAATTTTGTTATTAC 
AACAACTACTGTGAGTTTCAGCCACGATACTTCTGCAAGAACTGTCGTAGATACTGGACT 
CATGGTGGGGCTTTAAGAAACATACCAATTGGTGGAAGTAGTCGTGCCAAACGGGCAAGG 
GTAAATCAACCTTCGGTTGCTCGGATGGTTTCTGTTGAGACCCAACGAGGTAACAATCAA 
CCTTTCTCTAATGTTCAAGAAAACGTTCATCTTGTTGGATCTTTTGGTGCTTCATCTTCA 
TCTTCTGTTGGTGCTGTTGGGAACCTTTTTGGTTCTTTGTATGATATTCATGGTGGTATG 
GTAACAAATTTGCATCCAACTCGAACTGTTCGACCAAATCATCGCTTAGCTTTCCATGAT 
GGATCATTTGAGCAAGACTATTACGATGTTGGGTCCGATAATCTTTTGGTCAACCAACAA 
GTTGGTGGCTACGGTTATCACATGAATCCAGTGGATCAATTCAAGTGGAACCAGAGCTTC 
AACAACACTATGAACATGAATTATAATAACGATAGCACTAGTGGAAGTAGCAGAGGATCT 
GACATGAATGTGAACCATGATAACAAGAAGATCAGATACCGCAACTCTGTGATTATGCAT 
CCTTGTCATCTGGAGAAGGATGGTCCTTGA 

>G1891 Amino Acid Sequence (domain in aa coordinates: 27-69) 
MDNLNVFANEDNQVNDVKPPPPPPRVCARCDSDNTKFCYYNNYCEFQPRYFCKNCRRYWT 

HGGALRNIPIGGSSRAKRARVNQPSVARMVSVETQRGNNQPFSNVQENVHLVGSFGASSS 

SSVGAVGNLFGSLYDIHGGMVTNLHPTRTVRPNHRLAFHDGSFEQDYYDVGSDNLLVNQQ 

VGGYGYHMNPVDQFKWNQSFNNTMNMNYNNDSTSGSSRGSDMNVNHDNKKIRYRNSVIMH 

PCHLEKDGP* 

>G1896 (1..951) 

ATGTCCTCCCATACCAATCTCCCCTCTCCCAAACCAGTTCCTAAACCAGATCACCGTATC 
TCCGGTACATCCCAAACCAAGAiU^CC^CCGTCTTCCTCCGTAGCTCAAGACCAACAAAAC 
CTAAAATGCCCTCGTTGCAACTCTCCAAAC^CAAAGTrCTC 

CTCTCTCAACCTCGTCACTTCTGCAAATCTTGTCGCCGTTACTGGACACGTGGCGGTGCT 

CTAAGAAACGTCCCCATCGGTGGTGGTTGCCGGAAAACC^^ 

TCCTCC^TGAACACACTTCCTTCGTCTTCTTCCT 

GAAGATTCATCCAAATTCTTCCCTCCTCCGACAACAATGGATTTTCAGCTGGCCGGATTA 
TCTCTCAACAAT^TGAACGATCTTCAACTTTTGAATAACCAAGAAGTTCTTGATCTTAGG 
CCCATGATGTCCTCGGGCCGAGAAAACACACCCGTTGATGTCGGGTCGGGTTTATCCCTA 
ATGGGTTTTGGAGATTTCAACAACAACCATTCACCGACGGGGTTCACAACCGCCGGAGCA 
AGCGACGGAAACTTAGCTTCTTCTATAGAGACTTTGAGTTGTTTAAACCAAGATTTACAC 

TGGAGGCTTCAGCAACAGAGGATGGCGATGCTTTTTGGTAATTCTAAGGAAGAAACTGTT 

GTCGTCGAGAGGCCACAACCTATTCTTTATCGGAATCTTGAGATCGTAAACTCATCATCG 

CCGTCGTCGCCGACGAAGAAAGGAGATAATCAGACAGAGTGGTATTTTGGTAATAACAGT 

GATAATGAAGGAGTGATTAGTAATAATGCTAATACAGGAGGAGGAGGAAGTGAATGGAAC 

AATGGAATTCAAGCTTGGACTGATCTTAATCATTATAATGCATTGCCTTGA 

>G1896 Amino Acid Sequence (domain in aa coordinates: 43-85) 

MSSHTNLPSPKPVPKPDHRISGTSQTKKPPSSSVAQDQQNLKCPRCNSPNTKFCYYNNYS 

LSQPRHFCKSCRRYWTRGGALRNVPIGGGCRKTKKSIKPNSSMNTLPSSSSSQRFFSSIM 

EDSSKFFPPPTTMDFQLAGLSLNKMNDLQLLNNQEVLDLRPMMSSGRENTPVDVGSGLSL 

MGFGDFNNNHSPTGFTTAGASDGNLASSIETLSCLNQDLHWRLQQQRMAMLFGNSKEETV 

VVERPQPILYRNLEIVNSSSPSSPTKKGDNQTEWYFGNNSDNEGVISNNANTGGGGSEWN 
NGIQAWTDLNHYNALP* 

>G1898 (1..630) 

ATGCCGTCGGAACCAAACCAAACCCGACCCACCAGAGTTCAGCCCTCAACGGCGGCTTAC 
CCACCGCCAAATCTGGCTGAGCCTCTTCCTTGTCCTCGCTGCAACTCCACCACCACCAAG 
TTCTGTTACTACAACAACTATAACCTCGCTCAGCCTCGCTACTACTGCAAATCTTGCCGC 
CGTTACTGGACTCAAGGTGGTACACTCCGTGACGTCCCCGTCGGTGGTGGAACTCGTCGA 
AGCTCCTCAAAACGTCACCGTTCTTTCTCCACCACTGCCACCTCCTCTTCCTCCTCTTCT 
TCCGTCATCACCACCACGACACAAGAACCAGCCACGACTGAAGCGAGTCAAACTAAGGTT 
ACTAATTTAATTTCAGGTCATGGAAGCTTTGCTTCTCTGTTAGGTTTAGGAAGTGGAAAT 
GGTGGGTTGGATTACGGGTTTGGGTACGGGTACGGGCTTGAGGAGATGAGTATTGGGTAT 
CTTGGAGATTCTTCCGTAGGAGAGATTCCGGTGGTTGATGGTTGTGGTGGTGACACGTGG 
CAGATTGGGGAGATTGAAGGTAAAAGTGGAGGAGACAGTTTGATATGGCCTGGTCTTGAG 
ATCTCAATGCAAACCAACGATGTTAAGTGA 

>G1898 Amino Acid Sequence (domain in AA coordinates: 31-59) 
MPSEPNQTRPTRVQPSTAAYPPPNLAEPLPCPRCNSTTTKFCYYNNYNLAQPRYYCKSCR 

RYWTQGGTLRDVPVGGGTRRSSSKRHRSFSTTATSSSSSSSVITTTTQEPATTEASQTKV 
TNLISGHGSFASLLGLGSGNGGLDYGFGYGYGLEEMSIGYLGDSSVGEIPVVDGCGGDTW 
QIGEIEGKSGGDSLIWPGLEISMQTNDVK* 
>G1902 (1..615) 

ATGCAGGATCCAGCAGCATATTACCAGACGATGATGGCGAAGCAACAACAACAACAACAA 
CCACAGTTTGCAGAGCAAGAACAGTTAAAGTGTCCTCGTTGTGACTCACCAAACACTAAA 
TTCTGTTACTACAACAACTACAATCTCTCACAGCCTCGTCACTTTTGCAAAAGCTGTCGT 
CGTTACTGGACTAAAGGCGGCGCTCTCCGTAACGTTCCCGTCGGTGGTGGTTCTCGTAAG 
AACGCAACCAAACGATCCACTTCTTCTTCTTCTTCTGCTTCCTCTCCTTCCAACAGTAGC 
CAAAA(^GAAGACGAAAAACCCGGATCCGGATCCTGATCCACGTAATTCTCAAAAACCG 
GATTTGGATCCGACCCGGATGCTTTACGGGTTTCCGATCGGTGACCAAGACGTGAAGGGT 
ATGGAGATTGGTGGAAGCTTTAGCTCGTTGTTGGCGAATAATATGCAGCTTGGTCTTGGA 
GGAGGAGGGATCATGCTTGACGGGTCGGGTTGGGATCATCCGGGTATGGGTTTGGGTTTG 
AGGAGAACCGAACCGGGTAATAATAATAATAACCCATGGACCGATCTGGCTATGAACAGA 
GCGGAGAAAAACTGA 

>G1902 Amino Acid Sequence (domain in AA coordinates: 31-59) 
MQDPAAYYQTMMAKQQQQQQPQFAEQEQLKCPRCDSPNTKFCYYNNYNLSQPRHFCKSCR 
RYWTKGGALRNVPVGGGSRKNATKRSTSSSSSASSPSNSSQNKKTKNPDPDPDPRNSQKP 
DLDPTRMLYGFPIGDQDVKGMEIGGSFSSLLANNMQLGLGGGGIMLDGSGWDHPGMGLGL 
RRTEPGNNNNNPWTDLAMNRAEKN* 

>G1904 (1..924) 

ATGCT^GATATTCATGATTTCTCCATGAACGGAGTTGGTGGTGGGGGAGGAGGAGGAGGG 

AGGTTTTTCGGTGGAGGAATCGGCGGCGGAGGAGGTGGTGATCGAAGGATGAGAGCTCAT 

CAGAACAATATACTTAACCATCATCAATCTCTCAAGTGTCCTCGTTGTAATTCT 

ACAAAGTTCTGTTACTACAACAATTACAATCTTTCTCAGCCTCGTCACT^ 

TGTCGTCGTTACTGGACTAAAGGTGGTGTTCTCCGTAACGTTCCCGTCGGAGGTGGTTGC 

CGGAAAGCTAAACGTTCGAAAACAAAACAGGTTCCGTCGTCGTCATCAGCCGACAAACCA 

ACGACGACGCAAGATGATCATCACGTGGAGGAGAAATCGAGTACAGGATCTCACTCTAGC 

AGCGAGAGCTCTTCTCTCACCGCTTCTAACTCTACCACCGTCGCCGCCGTCTCCGTCACC 

GCGGCGGCGGAAGTTGCTTCGTCGGTTATTCCAGGTTTTGATATGCCrAATATGAAAAT^ 
TACGGTAACGGGATCGAGTGGTCGACGTTACTTGGACAAGGCTCATCGGCCGGTGGTGTT 
TTCTCGGAGATCGGTGGTTTTCCGGCGGTTTCAGCTATTGAAACTACACCGTTTGGATTC 
GGGGGTAAATTCGTTAATCAAGATGATCATCTGAAGTTAGAAGGTGAAACTGTACAGCAG 
CAACAGTTTGGAGATCGAACGGCTCAGGTTGAGTTTCAAGGAAGATCTTCGGATCCGAAT 
ATGGGATTTGAACCGTTGGATTGGGGAAGTGGCGGTGGAGATCAAACACTGTTTGATTTA 
ACCAGTACCGTTGATCATGCATACTGGAGTCAAAGTCAATGGACGTCGTCTGACCAAGAT 
CAGAGTGGTCTCTACCTTCCTTGA 

>G1904 Amino Acid Sequence (domain in aa coordinates: 53-95) 

MQDIHDFSMNGVGGGGGGGGRFFGGGIGGGGGGDRRMRAHQNNILNHHQSLKCPRCNSLfT 

TKFCYYNNYNLSQPRHFCKNCRRYWTKGGVLRNVPVGGGCRKAKRSKTKQVPSSSSADKP 

TTTQDDHHVEEKSSTGSHSSSESSSLTASNSTTVAAVSVTAAAEVASSVIPGFDMPNMKI 

YGNGIEWSTLLGQGSSAGGVFSEIGGFPAVSAIETTPFGFGGKFVNQDDHLKLEGETVQQ 

QQFGDRTAQVEFQGRSSDPNMGFEPLDWGSGGGDQTLFDLTSTVDHAYWSQSQWTSSDQD 

QSGLYLP* 

>G1906 (1..795) 

ATGGTGGAACGTGCTCGGATCGCAAAAGTCCCATTGCCTGAAGCAGCTCTAAATTGCCCT 
AGATGTGACTCAACCAATACTAAGTTCTGTTACTTCAATAACTATAGCCTTACTCAACCT 
CGCCATTTCTGCAAAACATGTCGTCGCTATTGGACACGTGGCGGTTCCTTGAGGAATGTT 
CCTGTTGGAGGAGGCTTTAGGAGGAACAAGAGAAGCAAATCCAGATCGAAATCTACGGTC 
GTGGTCTCGACTGATAATACTACTAGTACTTCATCACTTACTTCTCGCCCAAGTTACTCA 
AACCCTAGCAAGTTTCATAGCTACGGTCAAATCCCGGAGTTTAATTCCAACTTGCCCATC 
TTGCCTCCTCTCCAAAGCCTTGGAGATTACAATTCAAGCAACACTGGATTAGATTTTGGT 
GGAACTCAAATAAGCAACATGATAAGTGGTATGAGTTCTAGTGGTGGGATCTTGGATGCA 
TGGAGAATACCTCCATCACAACAAGCTCAGCAATTCCCTTTCTTGATCAACACTACCGGA 
TTGGTGCAATCTTCAAACGCGTTATATCCATTACTAGAAGGCGGGGTTAGCGCCACGCAA 
ACAAGAAATGTGAAGGCGGAAGAGAATGATCAGGATCGGGGTAGGGATGGGGATGGAGTG 
AATAACTTATCAAGAT^CTTTTTGGGTAATATCAACATAAACTCAGGCAGGAACGAGGAA 
TACACATCATGGGGAGGTAACAGTTCTTGGACCGGTTTCACCTCCAACAACTCAACAGGC 
CATCTCTCATTCTAA 

>G1906 Amino Acid Sequence (domain in AA coordinates: 19-47) 

MVERARIAKVPLPEAALNCPRCDSTNTKFCYFNNYSLTQPRHFCKTCRRYWTRGGSLRNV 

PVGGGFRRNKRSKSRSKSTVVVSTDNTTSTSSLTSRPSYSNPSKFHSYGQIPEFNSNLPI 

LPPLQSLGDYNSSNTGLDFGGTQISNMISGMSSSGGILDAWRIPPSQQAQQFPFLINTTG 

LVQSSNALYPLLEGGVSATQTRNVKAEENDQDRGRDGDGVNNLSR1JFLGNININSGRNEE 

YTSWGGNSSWTGFTSNNSTGHLSF* 

>G1913 (1..744) 

ATGGAGAGAGCAGAGGCCTTGACATCATCGTTTATATGGCGGCCAAACGCAAACGCAAAC 

GCGGAGATCACGCCGAGTTGTCCAAGATGTGGATCCTCTAACACAAAGTTCTGTTACTAC 

AACAACTATAGCCTCACTCAGCCTCGCTACTTCTGCAAAGGCTGCCGCAGATATTGGACC 

AAAGGTGGTTCCCTCCGCAATGTTCCTGTAGGCGGTGGCTGTCGAAAATCCCGCCGCCCC 

AAATCATCTTCTGGTAACAATACTAAAACTAGCCTAACCGCTAATTCTGGCAACCCCGGT 

GGTGGTTCACCAAGCATCGATCTTGCTCTTGTTTACGCCAATTTCTTGAAT 

GACGAATCTATACTACAAGAAAATTGCGACTTAGCCACTACGGATTTTTTGGTAGATAAT 

CCTACCGGCACTTCCATGGACCCTTCATGGAGTATGGACATCAATGATGGTCATCATGAT 

CATTATATTAATCCGGTGGAACACATTGTGGAGGAATGTGGTTATAATGGCTTGCCTCCA 

TTTCCTGGTGAAGAGCTTCTCTCTTTAGACACTAATGGTGTTTGGTCTGATGCTTTGTTG 

ATTGGTCATAACCATGTAGACGTTGGCGTGACTCCGGTTCAGGCTGTACACGAACCGGTG 

GTTCATTTCGCTGACGAATCCAATGATTCCACCAATCTCTTGTTTGGAAGTTGGAGCCCT 

TTTGATTTCACTGCCGATGGATGA 

>G1913 Amino Acid Sequence (domain in AA coordinates: 27-55) 

MERAEALTSSFIWRPNANANAEITPSCPRCGSSNTKFCYYNNYSLTQPRYFCKGCRRYWT 

KGGSLRNVPVGGGCRKSRRPKSSSGNNTKTSLTANSGNPGGGSPSIDLALVYANFLNPKP 

DESILQENCDLATTDFLVDNPTGTSMDPSWSMDINDGHHDHYINPVEHIVEECGYNGLPP 

FPGEELLSLDTNGVWSDALLIGHNHVDVGVTPVQAVHEPVVHFADESNDSTNLLFGSWSP 

FDFTADG* 

>G1914 (1..945) 

ATGGAGAGATACAAGTGTAGATTTTGCTTCAAGAGCTTCATCAATGGAAGAGCTTTAGGT 
GGTCACATGAGATCTCACATGCTTACTCTTTCTGCAGAACGTTGTGTAATAACTGGTGAA 
GCAGAAGAAGAAGTAGAGGAACGGCCGAGTCAACTCTGTGACGACGACGACGATACCGAG 
TCCGATGCTTCTTCTTCTTCTGGTGAGTTTGATAATCAAAAGATGAATCGTCTTGATGAT 
GAATTGGAGTTTGATTTCGCTGAAGACGACGACGTTGAAAGTGAAACCGAGTCGTCCAGG 
ATTAACCCAACTCGGCGACGATCTAAACGAACTCGGAAACTTGGATCGTTTGATTTCGAC 
TTTGAGAAGCTAACAACGAGCCAACCCAGTGAGTTAGTGGCCGAGCCAGAGCATCACAGC 
TCAGCTTCTGATACAACAACGGAGGAAGATCTCGCCTTTTGTCTCATTATGCTGTCCAGA 
GACAAATGGAAGCAACAGAAGAAGAAGAAGCAACGTGTAGAAGAAGATGAGACAGATCAT 
GACAGTGAAGATTACAAATCAAGCAAGAGCAGAGGGAGATTCAAGTGTGAGACTTGTGGT 
AAAGTGTTTAAATCGTATCAAGCATTAGGAGGACACAGAGCAAGCCACAAGAAGAACAAG 
GCATGCATGACGAAAACAGAGCAAGTTGAAACAGAGTACGTTCTTGGAGTAAAGGAGAAG 
AAAGTTCATGAATGTCCGATCTGTTTTAGGGTTTTTACTTCAGGGCAAGCACTTGGAGGT 
CATAAGAGATCTCACGGAAGTAACATCGGAGCAGGAAGAGGATTGTCAGTAAGTCAAATT 
GTCCAAATCGAAGAAGAAGTATCAGTGAAACAGAGGATGATTGATCTTAATCTTCCTGCA 
CCTAATGAAGAAGATGAAACTTCTTTGGTGTTTGATGAATGGTGA 

>G1914 Amino Acid Sequence (domain in AA coordinates: 195-216, 245-266) 

MERYKCRFCFKSFINGRALGGHMRSHMLTLSAERCVITGEAEEEVEERPSQLCDDDDDTE 

SDASSSSGEFDNQKMNRLDDELEFDFAEDDDVESETESSRINPTRRRSKRTRKLGSFDFD 

FEKLTTSQPSELVAEPEHHSSASDTTTEEDLAFCLIMLSRDKWKQQKKKKQRVEEDETDH 

DSEDYKSSKSRGRFKCETCGKVFKSYQALGGHRASHKKNKACMTKTEQVETEYVLGVKEK 

KVHECPICFRVFTSGQALGGHKRSHGSNIGAGRGLSVSQIVQIEEEVSVKQRMIDLNLPA 

PNEEDETSLVFDEW* 

>G1925 (1..945) 

ATGGAAGAAAATCTTCCTCCGGGGTTCAGATTTCATCCTACAGACGAGGAGCTCATAACG 
CATTATCTATGTCGGAAAGTCTCCGATATAGGATTCACCGGTAAAGCTGTCGTCGACGTT 
GATCTCAACAAGTGTGAACCTTGGGATTTGCCAGCCAAGGCTTCAATGGGAGAGAAAGAG 
TGGTATTTCTTCAGCCAAAGGGATCGGAAATATCCAACCGGTTTAAGAACAAACCGGGCA 
ACAGAAGCTGGTTACTGGAAAACCACCGGGAAAGATAAAGAAATATACCGAAGTGGAGTG 
TTGGTTGGGATGAAGAAAACCCTAGTTTTCTACAAAGGAAGAGCTCCCAAAGGTGAGAAA 
AGCAATTGGGTTATGCATGAGTACAGGCTTGAGAGCAAACAACCTTTCAACCCCACGAAT 
AAGGAGGAATGGGTAGTGTGTAGGGTTTTCGAAAAGAGCACGGCAGCAAAGAAAGCACAA 
GAACAACAACCTCAATCTTCTCAACCATCTTTTGGATCTCCATGCGATGCAAACTCATCA 
ATGGCAAATGAGTTTGAAGATATTGATGAGCTTCCGAATCTGAATTCAAACTCATCAACC 
ATCGATTACAATAATCATATCCATCAATATTCGCAACGCAATGTTTACTCAGAAGACAAC 
ACAACAAGTACGGCTGGTCTCAACATGAACATGAACATGGCTAGTACTAATCTTCAGTCT 
TGGACAACAAGTCTCCTTGGTCCGCCTTTATCTCCAATCAACTCTTTGTTGCTCAAGGCT 
TTCCAAATCAGGAACTCTTATAGTTTCCCAAAAGAGATGATCCCCAGTTTCAATCATTCT 
TCTCTTCAACAAGGAGTCTCCAATATGATCCTU^ATGCTTCAAGTTCGTCTCAAGTGCAA 
CCCCAACCGCAAGAGGAAGCGTTTAATATGGACTCCATATGGTGA 

>G1925 Amino Acid Sequence (conserved domain in AA coordinates: 6-150) 

MEENLPPGFRFHPTDEELITHYLCRKVSDIGFTGKAVVDVDLNKCEPWDLPAKASMGEKE 

WYFFSQRDRKYPTGLRTNRATEAGYWKTTGKDKEIYRSGVLVGMKKTLVFYKGRAPKGEK 

SNWVMHEYRLESKQPFNPTNKEEWVVCRVFEKSTAAKKAQEQQPQSSQPSFGSPCDANSS 

MANEFEDIDELPNLNSNSSTIDYNNHIHQYSQRNVYSEDNTTSTAGLNMNMNMASTNLQS 

WTTSLLGPPLSPINSLLLKAFQIRNSYSFPKEMIPSFNHSSLQQGVSNMIQNASSSSQVQ 

PQPQEEAFNMDSIW* 

>G1929 (1..366) 

ATGTGTAGAGGCTTCAATAATGAAGAGAGCAGAAGAAGTGACGGAGGAGGTTGCCGGAGT 

CTCTGCACGAGACCGAGTGTTCCGGTAAGGTGTGAGCTTTGCGACGGAGACGCCTCCGTG 

TTCTGTGAAGCGGACTCGGCGTTCCTCTGTAGAAAATGTGACCGGTGGGTTCATGGAGCG 

AATTTTCTAGCTTGGAGACACGTAAGGCGCGTGCTATGCACTTCTTGTCAGAAACTCACG 

CGCCGGTGCCTCGTCGGAGATCATGACTTCCACGTTGTTTTACCGTCGGTGACGACGGTC 

GGAGAAACCACCGTGGAGAATAGAAGTGAACAAGATAATCATGAGGTTCCGTTTGT^ 

CTCTGA 

>G1929 Amino Acid Sequence (domain in AA coordinates: 31-53) 

MCRGFNNEESRRSDGGGCRSLCTRPSVPVRCELCDGDASVFCEADSAFLCRKCDRWVHGA 

NFLAWRHVRRVLCTSCQKLTRRCLTC 



>G1930 (76..1077) 

ATTCACATTACTAATCTCTCAAGATTTCACAATTTTCTTGTGATTTTCTCTCAGTTTCTT 

ATTTCGTTTCATAACATGGATGCCATGAGTAGCGTAGACGAGAGCTCTACAACTACAGAT 

TCCATTCCGGCGAGAAAGTCATCGTCTCCGGCGAGTTTACTATATAGAATGGGAAGCGGA 

ACAAGCGTGGTACTTGATTCAGAGAACGGTGTCGAAGTCGAAGTCGAAGCCGAATCAAGA 

AAGCTTCCTTCTTCAAGATTCAAAGGTGTTGTTCCTCAACCAAATGGAAGATGGGGAGCT 

CAGATTTACGAGAAACATCAACGCGTGTGGCTTGGTACTTTCAACGAGGAAGACGAAGCA 

GCTCGTGCTTACGACGTCGCGGCTCACCGTTTCCGTGGCCGCGATGCCGTTACTAATTTC 

AAAGACACGACGTTCGAAGAAGAGGTTGAGTTCTTAAACGCGCATTCGAAATCAGAGATC 

GTAGATATGTTGAGAAAACACACTTACAAAGAAGAGTTAGACCAAAGGAAACGTAACCGT 

GACGGTAACGGAAAAGAGACGACGGCGTTTGCTTTGGCTTCGATGGTGGTTATGACGGGG 

TTTAAAACGGCGGAGTTACTGTTTGAGAAAACGGTAACGCCAAGTGACGTCGGGAAACTA 

AACCGTTTAGTTATACCAAAACACCAAGCGGAGAAACATTTTCCGTTACCGTTAGGTAAT 

AATAACGTCTCCGTTAAAGGTATGCTGTTGAATTTCGAAGACGTTAACGGGAAAGTGTGG 

AGGTTCCGTTACTCTTATTGGAATAGTAGTCAAAGTTATGTGTTGACCAAAGGTTGGAGT 

AGATTCGTTAAAGAGAAGAGACTTTGTGCTGGTGATTTGATCAGTTTTAAAAGATCCAAC 

GATCAAGATCAAAAATTCTTTATCGGGTGGAAATCGAAATCCGGGTTGGATCTAGAGACG 

GGTCGGGTTATGAGATTGTTTGGGGTTGATATTTCTTTAAACGCCGTCGTTGTAGTGAAG 

GAAACAACGGAGGTGTTAATGTCGTCGTTAAGGTGTAAGAAGCAACGAGTTTTGTAATAA 

CAATTTAACAACTTGGGAAAGAAAAAAAAGCTTTTTGATTTTAATTTCTCTTCAACGTTA 

ATCTTGCTGAGATTA 

>G1930 Amino Acid Sequence (domain in AA coordinates: 59-124) 

MDAMSSVDESSTTTDSIPARKSSSPASLLYRMGSGTSVVLDSENGVEVEVEAESRKLPSS 

RFKGVVPQPNGRWGAQIYEKHQRVWLGTFNEEDEAARAYDVAAHRFRGRDAVTNFKDTTF 

EEEVEFLNAHSKSEIVDMLRKHTYKEELDQRKRNRDGNGKETTAFALASMVVMTGFKTAE 

LLFEKTVTPSDVGKLNRLVIPKHQAEKHFPLPLGNNNVSVKGMLLNFEDVNGKVWRFRYS 

YWNSSQSYVLTKGWSRFVKEKRLCAGDLISFKRSNDQDQKFFIGWKSKSGLDLETGRVMR 

LFGVDISLNAVVVVKETTEVLMSSLRCKKQRVL* 

>G195 (51..1031) 

TTTTCTTTTTCTTTCTTTTTGGTTTAAGTTTTTTCTCTTTGTTCTTCGTCATGTCTCATG 
AAATCAAAGATCTTAACAACTATCACTACACTTCATCGTATAATCATTACAATATCAACA 
ACCAAAATATGATTAATCTCCCTTACGTTTCTGGTCCATCTGCTTATAATGCAAACATGA 
TCTCATCATCACAAGTAGGTTTTGATCTACCCTCGAAGAACTTGAGTCCTCAAGGAGCCT 
TCGAGTTGGGTTTCGAGCTTTCTCCATCTTCTTCTGACTTTTTTAATCCTTCCCTCGATC 
AAGAGAACGGTTTGTATAATGCTTATAATTATAATAGTAGTCAAAAGAGTCATGAAGTTG 
TCGGTGATGGTTGTGCAACCATTAAGAGTGAAGTTAGGGTTTCAGCATCTCCTTCTTCAA 
GTGAGGCCGATCATCATCCAGGAGAAGATTCCGGCAAGATCCGGAAGAAAAGAGAAGTTC 
GCGATGGAGGAGAAGATGATCAACGCTCTCAGAAAGTAGTTAAAACAAAGAAGAAAGAGG 
AGAAGAAAAAAGAGCCACGAGTCTCGTTCATGACTAAGACCGAAGTTGATCATCTCGAAG 
ACGGCTATCGTTGGAGAAAGTATGGCCAAAAAGCAGTCAAAAACAGTCCTTATCCGAGGA 
GTTACTATAGATGCACGACTCAGAAGTGCAACGTGAAGAAGAGAGTGGAGAGATCTTACC 
AAGACCCAACGGTCGTCATCACAACCTACGAGAGTCAACACAACCATCCGATCCCGACCA 
ATCGTCGGACAGCAATGTTCTCTGGAACCACCGCATCTGATTATAACCCATCATCGTCTC 
CAATATTCTCCGATCTCATCATGAATACTCCAAGAAGCTTCTCAAATGATGATCTCTTCC 
GTGTGCCATACGCTAGTGTGAACGTGAACCCTAGTTATCATCAACAGCAACATGGATTTC 
ATCAACAGGAGAGTGAGTTCGAGCTCTTGAAGGAGATGTTTCCTTCGGTTTTCTTCAAAC 
AAGAGCCTTGATGA^ATAATATAATATAGAAACAATTTTTTTTCTGCTAAGAAATATAGA 
ACAAAACTTGGATGCATAATAAGTGATGATAGTGCT 

ATACATGTTTTGTTAACTAGCTATAGGATATACTGGTAGTAATTAAGCATAAATATGGAG 

CCCTTCGACTTATTACAATAATTTTTGGTATGGAAAAANTTNGNTACATGCCTC 

NNNTTNNGG 

>G195 Amino Acid Sequence (domain in AA coordinates: 183-239) 
MSHEIKDLNNYHYTSSYNHYNINNQNMINLPYVSGPSAYNANMISSSQVGFDLPSKNLSP 

QGAFELGFELSPSSSDFFNPSLDQENGLYNAYNYNSSQKSHEVVGDGCATIKSEVRVSAS 
PSSSEADHHPGEDSGKIRKKREVRDGGEDDQRSQKVVKTKKKEEKKKEPRVSFMTKTEVD 

HLEDGYRWRKYGQKAVKNSPYPRSYYRCTTQKCNVKKRVERSYQDPTVVITTYESQHNHP 
IPTNRRTAMFSGTTASDYNPSSSPIFSDLIMNTPRSFSNDDLFRVPYASVNVNPSYHQQQ 

HGFHQQESEFELLKEMFPSVFFKQEP* 

>G1954 (196..1440) 

ATTTATGACTTCTCAATACAAAAAGCTCCCCTCACTTTTTTAAGTTTTGTCTTCTCTAAT 
CCGTCTTCTTCTACTATCTTGCATGTCTTGCGTCTTTTATATACATCTCTCGTAAACCCT 
AGCAAATCATACAAGGTCTAGAAGCTTGACCTTCATTAGACTTAAGCAGTTTATAATCAA 
CTACCACGAATAGCAATGGATAAAGATTACTCGGCACCAAACTTCTTAGGTGAATCCTCA 
GGCGGTAACGATGATAACAGCTCTGGTATGATAGACTATATGTTCAATAGAAACCTTCAA 
CAACAACAAAAGCAATCGATGCCACAACAGCAGCAACATCAACTCTCTCCTTCCGGATTT 
GGAGCAACACCCTTTGATAAAATGAACTTCTCTGATGTGATGCAGTTTGCGGACTTCGGT 
TCGAAACTTGCGTTGAACCAGACCAGAAACCAAGACGATCAAGAAACCGGGATTGACCCC 
GTTTATTTCTTGAAGTTCCCTGTCTTGAACGACAAAATAGAGGACCATAACCAAACCCAA 
CATCTCATGCCTTCTCATCAGACGTCTCAAGAAGGAGGTGAGTGTGGAGGAAACATAGGC 
AATGTGTTTCTTGAAGAAAAAGAAGATCAAGACGATGACAACGACAACAACTCCGTGCAA 
CTACGTTTTATTGGAGGAGAAGAAGAAGATAGGGAGAACAAGAATGTTACGAAAAAGGAG 
GTGAAGAGCAAGAGGAAGAGAGCTAGAACGAGCAAGACCAGCGAAGAAGTGGAAAGCCAA 
CGGATGACTCATATCGCGGTCGAAAGAAACCGTAGGAAGCAAATGAATGAGCATCTTCGT 
GTCCTTAGATCTCTCATGCCTGGCTCCTACGTTCAAAGGGGAGACCAAGCGTCAATCATA 
GGAGGAGCAATAGAGTTTGTGAGAGAGCTCGAGCAACTCCTACAATGTCTTGAATCACAG 
AAGCGTCGAAGAATCTTAGGAGAAACCGGTAGGGACATGACAACGACAACGACTTCTTCT 
TCTTCTCCCATAACTACGGTAGCGAACCAAGCACAACCGCTCATTATTACGGGAAATGTA 
ACCGAGCTAGAGGGCGGAGGAGGGCTTCGGGAGGAGACTGCGGAGAACAAGTCGTGCTTG 
GCTGACGTGGAGGTGAAGCTGCTAGGGTTTGACGCCATGATCAAGATACTTTCAAGAAGA 
AGGCCGGGACAGCTGATTAAGACTATAGCTGCTTTGGAGGATCTTCATCTCTCTATTCTT 
CACACTAACATCACTACCATGGAACAAACCGTCCTCTACTCCTTTAATGTCAAGATAACA 
AGTGAAACGAGGTTTACGGCAGAAGACATAGCAAGTTCCATCCAACAGATATTTAGTTTC 
ATTCATGCAAATACCAACATATCTGGAAGCTCTAACCTGGGAAATATTGTGTTTACTTGA 
AAATCATCACACGGCGACAACTTTGTACACTGGTGAAGATTACAGTACGTAATAATCTCT 
ACATATTGGGTTTTATTCTCCAAGCATTTGGAAGAGTGTTTAAGTTAAAGGGAGTGCTTA 
CTTTATTTTTTTGGGGCTTTTTTCATGCAATTTAAATTTTAGTGATGATTGTGTCGCTTG 

GTTTAGAATGCTAAACCACTTATTTACTTGAAATAACTTTTTTCACAAAAAAAAAAAAAA 
AAGAAAAAAA 

>G1954 Amino Acid Sequence (domain in AA coordinates: 187-259) 
MDKDYSAPNFLGESSGGNDDNSSGMIDYMFNRNLQQQQKQSMPQQQQHQLSPSGFGATPF 
DKMNFSDVMQFADFGSKLALNQTRNQDDQETGIDPVYFLKFPVLNDKIEDHNQTQHLMPS 
HQTSQEGGECGGNIGNVFLEEKEDQDDDNDNNSVQLRFIGGEEEDRENKNVTKKEVKSKR 
KRARTSKTSEEVESQRMTHIAVERNRRKQMNEHLRVLRSLMPGSYVQRGDQASIIGGAIE 
FVRELEQLLQCLESQKRRRILGETGRDMTTTTTSSSSPITTVANQAQPLIITGNVTELEG 
GGGLREETAENKSCLADVEVKLLGFDAMIKILSRRRPGQLIKTIAALEDLHLSILHTNIT 
TMEQTVLYSFNVKITSETRFTAEDIASSIQQIFSFIHANTNISGSSNLGNIVFT* 
>G1958 (107..1336) 

GTACCGTCGACCGATTATCCCCAAGAGGAGAATCCTCATAATCATTTTCTCCGATTCGAT 
TCGTCTTCCTTGGTCCTGGATTGCTTCATGAATTTCTAGGACAACAATGGAGGCTCGTCC 
AGTTCATAGATCAGGTTCGAGAGACCTCACACGCACTTCTTGAATCCCATCTACACAAAA 
ACCTTCACCAGTAGAAGATAGTTTCATGAGATCAGATAACAACAGTCAGTTAATGTCTAG 
ACCATTAGGACAAACCTACCATTTACTTTCATCTAGTAACGGTGGAGCTGTTGGACATAT 
ATGTTCTTCTTCATCATCTGGTTTTGCAACCAATCTCCATTACTCAACTATGGTATCTCA 
TGAGAAACAACAACACTACACAGGAAGCAGCAGTAATAATGCTGTGCAGACACCAAGCAA 
CTU^CGATAGTGCTTGGTGTCATGATTCATTG^ 

CAACCCGGCGATTCAAAACAACTGTCAGATTGAGGATGGTGGCATTGCGGCTGCTTTTGA 
TGACATTCAAAAACGAAGTGATTGGCATGAATGGGCTGACCATTTGATCACTGATGATGA 
TCCTTTGATGTCTACTAACTGGAATGATCTCTTGCTTGAAACAAATTCCAATTCAGATTC 
AAAGGACCAGAAGACACTGCAAATTCCGCAACCTCAGATTGTTCAGCAGCAACCTTCTCC 
GTCTGTGGAATTGCGACCTGTTAGCACAACATCTTCAAACAGCAATAACGGAACGGGCAA 
GGCACGAATGCGTTGGACGCCAGAGCTTCACGAGGCTTTTGTTGAGGCTGTCAACAGTCT 
TGGCGGTAGTGAAAGAGCTACTCCTAAAGGGGTACTGAAGATTATGAAAGTTGAAGGCTT 
GACTATATATCATGTTAAAAGCCATTTACAGAAATATAGGACAGCTAGATATCGGCCAGA 
ACCATCAGAAACTGGTTCGCCAGAAAGGAAGTTGACACCGCTTGAACATATAACATCTCT 
TGATTTGAAAGGTGGGATAGGTATTACAGAGGCTCTACGACTTCAGATGGAAGTACAGAA 
GCAACTCCATGAGCAGCTCGAGATTCAAAGAAACCTGCAACTCCGAATAGAAGAACAAGG 
CAAGTACCTGCAAATGATGTTCGAGAAGCAAAACTCTGGTCTTACCAAAGGGACAGCCTC 
AACATCAGATTCCGCAGCCAAATCTGAACAAGAAGACAAGAAGACTGCTGATTCGAAGGA 
GGTTCCAGAAGAAGAAACCAGGAAATGTGAGGAACTAGAATCTCCACAGCCAAAGCGTCC 
CAAAATCGATAATTGAAAGTATTGGTCTTTTGCTGGATAATCTCGGAGTTTCAGAGTTAA 
CAGTGATAGAGAGAACGAGCTCTTATCTTGAGGTTCTTCAGGACTTCTCTCGCGGCCGCT 
CTAG 

>G1958 Amino Acid Sequence (domain in AA coordinates: 230-278) 

MEARPVHRSGSRDLTRTSSIPSTQKPSPVEDSFMRSDNNSQLMSRPLGQTYHLLSSSNGG 

AVGHICSSSSSGFATNLHYSTMVSHEKQQHYTGSSSNNAVQTPSNNDSAWCHDSLPGGFL 

DFHETNPAIQNNCQIEDGGIAAAFDDIQKRSDWHEWADHLITDDDPLMSTNWNDLLLETN 

SNSDSKDQKTLQIPQPQIVQQQPSPSVELRPVSTTSSNSNNGTGKARMRWTPELHEAFVE 

AVNSLGGSERATPKGVLKIMKVEGLTIYHVKSHLQKYRTARYRPEPSETGSPERKLTPLE 

HITSLDLKGGIGITEALRLQMEVQKQLHEQLEIQRNLQLRIEEQGKYLQMMFEKQNSGLT 

KGTASTSDSAAKSEQEDKKTADSKEVPEEETRKCEELESPQPKRPKIDN* 

>G196 (111..1421) 

TCGACATCAGATTTCTCTCACGGATTCCTAATCATTTTTATTATATTTGGATATTTGCTA 

ATTCTTCCCGTGTATATVATCTCATATAAACACGCATCATACATATATATTATGTGCAGCG 

TCTTTGAGTTTCAAGACATGGACAACTTCCAAGGAGATCTAACAGACGTCGTACGAGGAA 

TAGGATCAGGCCACGTGTCACCATCTCCTGGACCACCGGAAGGTCCATCTCCGAGCAGCA 

TGTCTCCGCCGCCAACATCAGATCTCCACGTGGAATTCCCCTCCGCCGCTACTTCTGCCA 

GCTGTCTCGCAAATCCCTTCGGAGACCCGTTCGTAAGCATGAAGGATCCTCTCATCCACC 

TCCCGGCCAGCTACATCTCCGGCGCCGGTGATAATAAAAGCAACAAAAGTTTTGCAATCT 

TTCCAAAGATTTTTGAGGATGATCATATTAAGAGTCAATGCAGTGTCTTCCCAAGAATTA 

AGATCTCGCAAAGTAACAATATCCACGATGCCTCCACGTGTAATTCTCCGGCCATAACCG 

TCTCCTCTGCCGCCGTAGCAGCTTCGCCGTGGGGCATGATCAACGTTAATACCACTAACA 

GTCCAAGAAACTGTTTACTTGTCGATAATAATAACAACACGTCATCATGCTCACAGGTTC 

AGATCTCTTCTTCCCCTCGGAATCTCGGAATTAAGAGAAGGAAGAGCCAGGCAAAGAAAG 

TGGTGTGCATACCGGCTCCAGCCGCTATGAACAGCCGGTCCAGTGGAGAAGTTGTTCCGT 

CTGATCTATGGGCTTGGCGAAAGTACGGTCAAAAACCTATCAAAGGTTCTCCTTATCCAA 

GGGGTTACTACAGATGTAGCAGCTCAAAAGGTTGTTCAGCTAGGAAACAAGTCGAACGTA 

GCCGCACTGATCCAAACATGTTAGTCATTACTTACACCTCTGAGCATAACCACCCATGGC 

CTACTCAACGCAACGCTCTCGCAGGTTCCACTCGTTCCTCTTCCTCCTCCTCTTTAAACC 

CTTCTTCCAAATCCTCAACCGCAGCCGCCACTACTTCTCCCTCATCCAGAGTTTTCCAAA 

ACAACAGCAGCAAAGACGAACCCAATAACTCCAACTTGCCTTCCTCTTCCACTCATCCTC 

CTTTTGACGCCGCCGCAATTAAGGAGGAGAACGTGGAAGAGCGTCAGGAAAAGATGGAGT 

TCGATTATAATGACGTTGAAAATACCTATAGACCGGAGTTGTTGCAAGAGTTTCAACATC 

AGCCGGAGGATTTCTTTGCCGATCTCGACGAGCTTGAGGGAGATTCTTTGACTATGTTGC 

TCrCT^CAGTAGCGGCGGAGGCAACATGGAAAACAAAACGACGATTCCAGACGTT^ 

GTGATTTCTTTGACGACGACGAGTCCTCAAGGTCGTTATAAATATTGTTGTTAATGTATA 

CATAGAAATGAAATTATTCATGTAATTCGTTTTGTGTTAAATGACGGTATTTGCCTTTGC 

A 

>G196 Amino Acid Sequence (conserved domain in AA coordinates: 223-283) 
MCSVFEFQDMDNFQGDLTDVVRGIGSGHVSPSPGPPEGPSPSSMSPPPTSDLHVEFPSAA 
TSASCLANPFGDPFVSMKDPLIHLPASYISGAGDNKSNKSFAIFPKIFEDDHIKSQCSVF 
PRIKISQSNNIHDASTCNSPAITVSSAAVAASPWGMINVNTTNSPRNCLLVDNNNNTSSC 

SQVQISSSPRNLGIKRRKSQAKKVVCIPAPAAMNSRSSGEVVPSDLWAWRKYGQKPIKGS 

PYPRGYYRCSSSKGCSARKQVERSRTDPNMLVITYTSEHNHPWPTQRNALAGSTRSSSSS 

SLNPSSKSSTAAATTSPSSRVFQNNSSKDEPNNSNLPSSSTHPPFDAAAIKEENVEERQE 

KMEFDYNDVENTYRPELLQEFQHQPEDFFADLDELEGDSLTMLLSHSSGGGNMENKTTIP 

DVFSDFFDDDESSRSL* 

>G1965 (1..609) 

ATGGATAACTTCAATGTTGTTGCCAATGAAGACAATCAAGTGAATGATGTGAAGCCTCCA 
CCACCCCC^CCGCGAGTGTGTGCAAGATGTGATTCTGATAACACAAAATTTTGTTACTAC 
AACAATTATAGTGAGTTTCAACCGCGCTACTTCTGCAAGAACTGTCGAAGATACTGGACT 
CATGGTGGGGCTTTAAGAAACGTACCAATTGGTGGGAGTAGTCGTGCCAAGCGGACAAGG 
ATAAATCAACCTTCAGTTGCTCAGATGGTTTCTGTTGGAATCCAACCAGGGAACCGTTTT 
AGTTCTTTGTCTCATATTCATGGTGGTATGGTAACAAATGTGCATCCAACTCAAACTTTT 
CGACCAAATCATCGCCTAGCTTTCCATAATGGATCATTTGAGCAAGATTATTATGATGTT 
GGGTCTGATAATCTTTTGGTAAACCAACAAGTTGGTGGATATGTTGATAATCACAACGGT 
TATCACATGAATCAAGTGGATCAATACAACTGGAACCAGAGCTTCAATAACGCTATGAAC 
ATGAATTATAATAACGCTAGCACTAGCGGAAGGATGCATCCTAGTCATTTAGAGAAGGGT 
GGTCCTTGA 

>G1965 Amino Acid Sequence (domain in AA coordinates: 27-55) 

MDNFNVVANEDNQVNDVKPPPPPPRVCARCDSDNTKFCYYNNYSEFQPRYFCKNCRRYWT 

HGGALRNVPIGGSSRAKRTRINQPSVAQMVSVGIQPGNRFSSLSHIHGGMVTNVHPTQTF 

RPNHRLAFHNGSFEQDYYDVGSDNLLVNQQVGGYVDNHNGYHMNQVDQYNWNQSFNNAMN 

MNYNNASTSGRMHPSHLEKGGP* 

>G1976 (1..1152) 

ATGACTGATCCTTATTCCAATTTCTTCACAGACTGGTTCAAGTCTAATCCTTTTCACCAT 
TACCCTAATTCCTCCACTAACCCCTCTCCTCATCCTCTTCCTCCTGTTACTCCTCCCTCT 
TCCTTCTTCTTCTTCCCTCAATCCGGAGACCTCCGCCGTCCACCGCCGCCACCAACTCCT 
CCTCCTTCTCCTCCTCTCCGAGAAGCCCTCCCTCTCCTCAGCCTCAGCCCCGCCAACAAA 
CAACAAGACCACCATCACAACCATGACCACCTTATTCAAGAACCACCTTCAACCTCCATG 
GATGTCGACTACGATCATCACCATCAAGATGATCATCATAACCTCGATGACGATGACCAT 
GACGTCACCGTTGCTCTTCACATAGGCCTTCCAAGCCCTAGTGCTCAAGAGATGGCCTCT 
TTGCTCATGATGTCTTCTTCTTCCTCTTCCTCGAGGACCACTCATCATCACGAGGACATG 
AATCACAAGAAAGACCTCGACCATGAGTACAGCCACGGAGCTGTCGGAGGAGGAGAAGAT 
GACGATGAAGATTCAGTCGGCGGAGACGGCGGCTGTAGAATCAGCAGACTCAACAAGGGT 
CAATATTGGATCCCTACACCTTCTCAGATTCTCATTGGCCCTACTCAGTTCTCATGTCCT 
GTTTGCTTCAAAACCTTCAACAGATACAATAACATGCAGATGCATATGTGGGGACATGGA 
TCACAATACAGAAAAGGACCTGAATCTCTAAGGGGAACACAACCAACAGGAATGCTAAGG 
CTTCCGTGCTATTGCTGCGCCCCAGGCTGTCGCAACAACATTGACCATCCAAGGGCAAAG 
CCTCTCAAAGACTTCAGAACCCTTCAAACACATTACAAGAGAAAACATGGGATCAAACCT 
TTCATGTGTAGGAAATGTGGAAAGGCTTTCGCAGTCCGAGGGGACTGGAGAACACATGAG 
AAGAATTGTGGCAAACTTTGGTATTGCATATGTGGATCTGATTTCAAGCACAAGAGATCT 
CTCAAAGATCACATCAAGGCTTTTGGGAATGGTCATGGAGCCTACGGAATTGATGGGTTT 
GATGAAGAAGATGAGCCTGCCTCTGAGGTAGAACAATTAGACAATGATCATGAGTCAATG 
CAGTCTAAATAG 

>G1976 Amino Acid Sequence (domain in AA coordinates: 219-323) 

MTDPYSNFFTDWFKSNPFHHYPNSSTNPSPHPLPPVTPPSSFFFFPQSGDLRRPPPPPTP 

PPSPPLREALPLLSLSPANKQQDHHHNHDHLIQEPPSTSMDVDYDHHHQDDHHNLDDDDH 

DVTVALHIGLPSPSAQEMASLLMMSSSSSSSRTTHHHEDMNHKKDLDHEYSHGAVGGGED 

DDEDSVGGDGGCRISRLNKGQYWIPTPSQILIGPTQFSCPVCFKTFNRYNNMQMHMWGHG 

SQYRKGPESLRGTQPTGMLRLPCYCCAPGCRNNIDHPRAKPLKDFRTLQTHYKRKHGIKP 

FMCRKCGKAFAVRGDWRTHEKNCGKLWYCICGSDFKHKRSLKDHIKAFGNGHGAYGIDGF 

DEEDEPASEVEQLDNDHESMQSK* 

>G2057 (27..1289) 

GCCGTCTCGACGAATATGCTCTACCAATGTCTGACGACCAATTCCATCACCCGCCGCCTC 
CTTCTTCAATGAGGCACCGTTCTACGTCGGATGCGGCGGACGGCGGCTGCGGCGAGATTG 
TTGAGGTGCAAGGTGGTCACATTGTTCGGTCTACCGGAAGAAAAGACCGCCACAGCAAAG 
TCTGCACGGCTAAAGGGCCACGTGACCGGCGCGTGAGACTCTCTGCTCACACGGCGATTC 
AGTTTTACGATGTOCAAGACAGGCTTGGTTTCGACCGACCTAGCAAAGCCGTTGATTGGC 
TTATCAAAAAGGCTAAGACTTCCATTGACGAGCTCGCTGAGCTTCCTCCCTGGAATCCCG 
CCGATGCAATTCGCCTAGCCGCTGCTAACGCTAAACCCAGAAGAACCACCGCCAAAACCC 
AAATCTCTCCGTCTCCGCCACCGCCGC^CAGCAAC^ 

GTGTTGGCTTCAACGGAGGAGGAGCAGAGCATCCGAGTAACAACGAGTCGAGTTTTCTCC 

GCTCTTCAACGGAGGCTCCTTCGAATCATAACCTTATGCACAACTATCATCATCAGCATC 
CGCCGGATTTGCTTTCTCGAACTAATAGCCAAAACCAAGATCTCCGTCTCTCGCTGCAAT 
CGTTCCCGGATGGTCCACCGTCGCTTCTGCACCACCAACATCACCACCACACCTCTGCTT 
CCGCCTCCGAGCCTACTCTGTTCTACGGACAGAGCAATCCGTTAGGGTTTGACACATCGA 

GTTGGGAGCAGCAGTCGTCGGAATTCGGAAGGATTCAGAGACTAGTGGCTTGGAACAGCG 

GCGGTGGCGGCGGAGCAACCGATACAGGAAACGGAGGAGGGTTTCTGTTCGCTCCTCCTA 

CTCCTTCAACGACGTCGTTTCAGCCAGTTCTTGGCCAAAGCCAACAGCTTTATTCTCAGA 

GGGGTCCCCTTCAGTCCAGTTACAGTCCCATGATCCGTGCTTGGTTTGATCCTCACCATC 

ATCACCAATCCATCTCCACCGACGATCTCAACCACCACCATCACCTTCCTCCACCGGTTC 

ACCAATCAGCAATCCCCGGAATCGGATTCGCCTCAGGTGAATTCTCTTCGGGTTTTCGCA 

TACCAGCACGGTTTCAGGGCCAAGAAGAGGAGCAGCACGACGGTCTCACTCACAAGCCGT 

CCTCTGCTTCCTCTATTTCTCGCCATTGACAATCGAAACTAATCCTC 

>G2057 Amino Acid Sequence (domain in AA coordinates: TBD) 

MSDDQFHHPPPPSSMRHRSTSDAADGGCGEIVEVQGGHIVRSTGRKDRHSKVCTAKGPRD 

RRVRLSAHTAIQFYDVQDRLGFDRPSKAVDWLIKKAKTSIDELAELPPWNPADAIRLAAA 

NAKPRRTTAKTQISPSPPPPQQQQQQQQLQFGVGFNGGGAEHPSNNESSFLPPSMDSDSI 

ADTIKSFFPVIGSSTEAPSNHNLMHNYHHQHPPDLLSRTNSQNQDLRLSLQSFPDGPPSL 

LHHQHHHHTSASASEPTLFYGQSNPLGFDTSSWEQQSSEFGRIQRLVAWNSGGGGGATDT 

GNGGGFLFAPPTPSTTSFQPVLGQSQQLYSQRGPLQSSYSPMIRAWFDPHHHHQSISTDD 

LNHHHHLPPPVHQSAIPGIGFASGEFSSGFRIPARFQGQEEEQHDGLTHKPSSASSISRH 

* 

>G2107 (79..624)

ACCACAAAACAGAGCAACACACAACACAAAGCTTCATTTCAATTCTGTTTCGAGAACCCT 

TTGAGAACCAGATCGGAGATGGAAAACGACGATATCACCGTGGCGGAGATGAAGCCAAAG 

AAGCGTGCTGGACGGAGGATTTTCAAGGAGACACGTCACCCAATCTACAGAGGCGTGCGG 

CGTAGGGACGGCGACAAATGGGTATGCGAAGTCCGTGAACCGATTCATCAGCGTCGAGTC 

TGGCTCGGAACTTATCCGACGGCAGATATGGCCGCACGTGCTCACGACGTGGCGGTTCTT 

GCTCTGCGCGGGAGATCCGCGTGTTTGAATTTCTCCGATTCTGCTTGGAGGTTGCCGGTG 

CCGGCATCCACTGATCCGGACACGATCAGGCGCACGGCGGCCGAAGCAGCGGAGATGTTC 

AGGCCGCCGGAGTTTAGTACAGGAATTACGGTTTTACCCTCAGCCAGTGAGTTTGACACG 

TCGGATGAAGGAGTCGCTGGAATGATGATGAGGCTCGCGGAGGAGCCGTTGATGTCGCCG 

CCAAGATCGTACATTGATATGAATACGAGTGTGTACGTGGACGAAGAAATGTGTTACGAA 

GATTTGTCACTTTGGAGTTACTAAAATACGTATGTGTTAAAAAACCAAAGATCGTATGTG 

TATGTATGCATAATAAATGGGCTTAATGATGGGCATAGATATGATAGGTCCAGCCTATAT 

GTTAAATGTGTTTTATTTTTTGGTTTATCTAGTTTCCTAGGTATTTACCAAATTGTATTA

GTATAAGTTTTATTAAGAAATAATCAAAAATGTTGTTGCCAAAAAAAAAAAAAAAT^AAAA 

AAAAA 

>G2107 Amino Acid Sequence (domain in AA coordinates: TBD) 
MENDDITVAEMKPKKRAGRRIFKETRHPIYRGVRRRDGDKWVCEVREPIHQRRVWLGTYP
TADMAARAHDVAVLALRGRSACLNFSDSAWRLPVPASTDPDTIRRTAAEAAEMFRPPEFS
TGITVLPSASEFDTSDEGVAGMMMRLAEEPLMSPPRSYIDMNTSVYVDEEMCYEDLSLWS

Y* 

>G211 (1..750) 

ATGATGTCATGTGGTGGGAAGAAGCCAGTGTCTAAGAAAACAACGCCGTGTTGCACGAAG 
ATGGGGATGAAGAGAGGACCATGGACGGTGGAGGAAGACGAGATTCTTGTGAGCTTCATT 
AAGAAAGAAGGTGAAGGACGGTGGCGATCGCTTCCTAAGAGAGCTGGTTTACTCAGATGT 
GGAAAGAGCTGTCGTCTACGGTGGATGAACTATCTCCGACCCTCGGTTAAACGTGGAGGA 
ATTACGTCGGACGAGGAAGATCTCATCCTCCGTCTTCACCGCCTCCTCGGCAACAGGTGG 
TCATTGATCGCGGGAAGGATACCGGGAAGGACTGATAATGAAATTAAGAACTATTGGAAC 
ACTCATCTTCGTAAGAAACTTTTAAGGCAAGGAATTGATCCTCAAACCCACAAGCCTCTT 
GATGCAAACAACATCCATAAACCAGAAGAAGAAGTTTCCGGTGGACAAAAGTACCCTCTA
GAGCCTATTTCTAGTTCTCATACTGATGATACCACTGTTAATGGCGGGGATGGAGATAGC 
AAGAACAGTATCAATGTCTTTGGTGGTGAACACGGCTACGAAGACTTTGGTTTCTGCTAC 
GACGACAAGTTCTCATCGTTTCTTAATTCGCTCATCAACGATGTTGGTGATCCTTTTGGT 
AATATTATCCCAATATCTCAACCTTTGCAGATGGATGATTGTAAGGATGGGATTGTTGGA
GCGTCGTCTTCTAGCTTAGGACATGACTAG 

>G211 Amino Acid Sequence (conserved domain in AA coordinates: 24-137)
MMSCGGKKPVSKKTTPCCTKMGMKRGPWTVEEDEILVSFIKKEGEGRWRSLPKRAGLLRC
GKSCRLRWMNYLRPSVKRGGITSDEEDLILRLHRLLGNRWSLIAGRIPGRTDNEIKNYWN
THLRKKLLRQGIDPQTHKPLDANNIHKPEEEVSGGQKYPLEPISSSHTDDTTVNGGDGDS 
KNSINVFGGEHGYEDFGFCYDDKFSSFLNSLINDVGDPFGNIIPISQPLQMDDCKDGIVG

ASSSSLGHD* 

>G2133 (26.. 457) 

ATCTCATCTTCATCCACCCAAAAACATGGATTC^GAGACACCGGAGAAACTGACCAGAG 

CAAGTACAAAGGTATCCGTCGTCGGAAATGGGGAAAATGGGTATCAGAGATTCGTGTCCC 

GGGAACTCGTCAACGTCTCTGGTTAGGCTCTTTCTCCACCGCAGAAGGCGCTGCCGTAGC 

CCACGACGTCGCTTTTTACTGCTTGCACCGACCATCTTCCCTCGACGACGAATCTTTTAA 

CTTCCCTCACTTACTTACAACCTCCCTCGCCTCCAATATATCTCCTAAGTCCATCCAAAA 

AGCTGCTTCCGACGCCGGCATGGCCGTGGACGCCGGATTCCATGGTGCTGTGTCTGGGAG 

TGGTGGTTGTGAAGAGAGATCTTCCATGGCGAATATGGAGGAGGAGGACAAACTTAGTAT 

CTCCGTGTATGATTATCTTGAAGACGATCTCGTTTGATCTATACGAGTACGTTTTTAGCA 
GTTAA 

>G2133 Amino Acid Sequence (domain in AA coordinates: 11-83)
MDSRDTGETDQSKYKGIRRRKWGKWVSEIRVPGTRQRLWLGSFSTAEGAAVAHDVAFYCL
HRPSSLDDESFNFPHLLTTSLASNISPKSIQKAASDAGMAVDAGFHGAVSGSGGCEERSS
MANMEEEDKLSISVYDYLEDDLV*

>G2134 (36..644)

GAGCAAAAACTTTGTGTGCGTGTGTGTGTGTGTTCATGGCTGGTCTTAGGAATTCCGGTA 

ACAGCGACAAAGCGCAAAACGATGGCAAAGGTGTACCATCTGCCTACAGAGGAGTCCGGA 

AGAGAAAATGGGGGAAATGGGTGTCTGAAATCCGTGAACCGGGGACCAAGAACGGTATCT 

GGCTAGGCAGTTTCGAGACTCCTGAAATGGCTGCAACCGCATACGACGTGGCAGCATTTC 

ATTTCAGAGGGAGAGAAGCTCGTCTCAACTTCCCTGAGCTCGCCAGCAGCCTTCCACGTC 

CTGCAGACTCTAGCTCAGACAGCATTCGCATGGCAGTTCATGAGGCAACACTCTGCCGCA 

CCACCGAAGGAACAGAGTCAGCCATGCAAGTGGACAGCTCAAGCTCCTCCAATGTAGCTC 

CAACAATGGTCAGACTCTCGCCCAGGGAAATTCAAGCGATCAACGAGTCAACTTTGGGAT 

CTCCTACTACAATGATGCATTCAACATACGACCCTATGGAGTTTGCTAATGATGTGGAGA

TGAATGCTTGGGAAACATACCAGAGTGACTTTCTTTGGGACCCTTAACCCCAAAACCTAA 

CTCATGGAGAGCTTCTACAGCTCAATCTTACAATACCAGCATAAGTTACTGGCTTAGAAT 

ACTTAAATTTATTGAAGTTTAGTTTTCAGAGTCTACCACAAGGGTTGTTGATTCTGACGT 

TATAGCAAAGAATAAAGCTCATCAGATTTTGGAGGGAAAGACTCTATGAGCTTGATGGGT 

CCCTGAAAGGACCTCTTCACAAATATTTTTAAATTTTTTTGTTACTAGTAGAAAC^TA 

TTATGAGGTGTGACTTATTATTATTTTTTACAATTGTTTGTTACCTCATTGATGTATTTG 
ATTT 

>G2134 Amino Acid Sequence (domain in AA coordinates: TBD) 

MAGLRNSGNSDKAQNDGKGVPSAYRGVRKRKWGKWVSEIREPGTKNRIWLGSFETPEMAA

TAYDVAAFHFRGREARLNFPELASSLPRPADSSSDSIRMAVHEATLCRTTEGTESAMQVD

SSSSSNVAPTMVRLSPREIQAINESTLGSPTTMMHSTYDPMEFANDVEMNAWETYQSDFL

WDP* PQNLTHGELLQLNLTI PA* 

>G2151 (236.. 1321) 

TTTTTTTTTTAGGGTTCATAAGAACAAATTGGATTTTGAGCT 

ACTTTGATTACTGGGTAATTTTAAAACCGCCA1TGTTGTTCTCTTTACTACTTTTGGGAA 
TTAGGGTTTATGATTTCTGGGTATTAGATTAGAT^ 

AATTTAAAAATCTCTTATTTCTGTTAAAGACTTGTAATTTTGGAGTTTTTAATGC^ 

CGGAAGAGAAGCAATGGCATTTCCAGGCTCGCATTCTCAGTACTATCTTCAAAGAGGAGC 

CTTTACTAATCTCGCACCTTCCC^GTCGC^^ 

GGGATTGAGGCCAATGTCTAACCCTAACATTCATCACCCTCAGGCTAACAATCCAGGACC

TCCTTTCTCGGATTTTGGACACACCATTCACATGGGAGTGGTCTCCTCTGCTTCTGATGC

TGATGTGCAACCGCCACCGCCACCGC(^CCACCAGAGGAACCGATGGTTAAGAGGAAACG 

TGGACGGCCAAGAAAGTATGGAGAACCGATGGTTAGTAATAAGTCTAGGGACTCTTCTCC 

AATGTCTGATCCTAATGAACCTAAACGGGCCAGAGGTCGACCTCCTGGAACTGGAAGGAA 

GCAACGCTTGGCTAATCTTGGTGAGTGGATGAATACTTCAG 

TCATGTGATCAGCATTGGAGCAGGAGAAGACATTGCT 

ACAAAGACCTCGGGCTCTTTGTATAATGTCAGGCACTGGAA 

GTGCAAACCCGGTTCAACCGATCGTCACTTAACATACGAGGGACCTTTTGAGATTATAAG 
TTTTGGTGGATCTTATTTGGTGAATGAAGAAGGTGGATCCAGAAGTCGAACAGGCGGATT
GAGTGTCTCTCTTTCTCGTCCCGATGGTAGTATTATTGCCGGTGGAGTTGACATGCTTAT 
CGCAGCCAACCTTGTTCAGGTGGTGGCATGTAGTTTTGTA 
TCATAATAACAATAACAAGACCATCAGACAAGAAAAGGAACCAAATGAAGAGGACAACAA

TAGTGAAATGGAGACCACACCGGGTAGTGCAGCTGAACCAGCAGCATCTGCGGGTCAGCA 

GACGCCACAGAACTTCTCTTCTCAGGGAATAAGGGGGTGGCCCGGTTCAGGCTCAGGCTC

TGGCAGATCACTTGACATTTGCAGAAACCCACTCACTGATTTTGATTTGACTCGTGGATG 

ATATACACTATTAGTCTTTGAAGCAGCAGCATACAAAATGTGATTGCTGTACATATGTTA 

TTGTAGATTTCTCTCTGGGAATGTTGAAATCAGACATTTAAGGATTGATACTAGATCTCT 

CAGCTCCTTCTAACATTGTTAATGTAACAGAACCCTCCCACTTTCATGCTATTTGC 

>G2151 Amino Acid Sequence (domain in AA coordinates: 93-113, 124-144)

MDGREAMAFPGSHSQYYLQRGAFTNLAPSQVASGLHAPPPHTGLRPMSNPNIHHPQANNP

GPPFSDFGHTIHMGVVSSASDADVQPPPPPPPPEEPMVKRKRGRPRKYGEPMVSNKSRDS

SPMSDPNEPKRARGRPPGTGRKQRLANLGEWMNTSAGLAFAPHVISIGAGEDIAAKVLSF 

SQQRPRALCIMSGTGTISSVTLCKPGSTDRHLTYEGPFEIISFGGSYLVNEEGGSRSRTG 

GLSVSLSRPDGSIIAGGVDMLIAANLVQVVACSFVYGTU^KTHNNNNKTIRQEKEPNEED

NNSEMETTPGSAAEPAASAGQQTPQNFSSQGIRGWPGSGSGSGRSLDICRNPLTDFDLTR

G* 

>G2154 (82.. 1317) 

GCAAAAAGAAAAAATGAAAAAAAATCCCTAACTCTCTCTCTCTAGAAATTCTTATTTTTG 

TGCGTATCTCTCTAAAAAGGAATGGATCCTAACGAAAGCCACCATCACCACCAACAACAA 

CAGCTCCATCACCTCCACCAACAGCAACAGCAACAGCAGCAGCAGCAACGACTCACTTCT 

CCTTACTTCCACCACCAACTACAGCACCATCACCACCTTCCAACCACCGTAGCAACCACC 

GCTTCTACCGGAAACGCCGTTCCATCTTCCAACAATGGGCTTTTCCCTCCGCAGCCTCAG 

CCACAGCACCAGCCTAATGATGGGTCATCTTCTCTCGCGGTGTACCCTCATTCAGTTCCG 

TCCTCGGCTGTGACGGCGCCGATGGAGCCGGTAAAGAGGAAGAGGGGTCGACCAAGAAAG 

TATGTGACGCCGGAACAAGCCCTAGCGGCTAAGAAATTGGCGTCTTCTGCGAGTAGTTCG 

TCTGCTAAACAGAGGCGAGAGCTTGCTGCTGTTACCGGTGGTACGGTATCGACTAATTCC 

GGGTCATCCAAGAAATCTCAGCTTGGTTCTGTCGGGAAAACTGGACAATGTTTTACTCCG 

CATATTGTTAATATAGCTCCTGGCGAGGATGTGGTCCAGAAAATTATGATGTTCGCAAAC 

CAAAGCAAGCATGAACTATGCGTTCTTTCTGCATCAGGCACTATCTCTAATGCATCCTTG 

CGCCAACCGGCTCCATCAGGAGGCAACTTACCATATGAGGGTCAATACGAGATTCTCTCA 

CTATCTGGATCCTATATCCGAACTGAACAAGGTGGTAAATCCGGCGGCCTTAGCGTTTCT 

TTATCTGCTTCAGATGGTCAGATCATCGGTGGAGCGATTGGTAGCCATCTCACAGCTGCT 

GGCCCGGTTCAGGTGATTCTTGGTACGTTTCAGCTTGATAGAAAGAAGGATGCCGCCGGG 

AGTGGTGGGAAAGGGGATGCTTCAAACAGTGGAAGTCGGTTAACTTCTCCTGTAAGCTCT 

GGACAGTTGCTTGGCATGGGTTTCCCTCCTGGTATGGAATCTACGGGAAGAAATCCAATG 

AGGGGAAACGACGAGCAACATGATCATCATCATCATCAAGCCGGTTTGGGTGGACCTCAT 

CATTTCATGATGCAAGCGCCGCAGGGGATACACATGACACATTCCAGGCCATCTGAATGG 

CGCGGAGGAGGCAACAGCGGTCATGATGGCAGAGGCGGTGGCGGGTATGATTTGTCAGGA 

AGGATAGGACATGAGTCGTCGGAGAATGGAGATTACGAGCAGCAAATACCGGATTAGCAG 

AGCTTCCAGGAGAAGTGTGTAGAGTTTAGATCCCAAGTAGAGAAACAGAAGGCGAGCAAA 

GAATCTGTyvCTGAGAGAGGACTTATTAGAC^GAGACTCGTCTGAAGGGTCrTTAATCATA 

GAAAGAAGTTGCTGAGTGATTGCTTTTGTTCTTCTTCTTGGTACGGTGTATTATATTAAC 

TCCACAACCTTTTTTTTATACTTTCAGTAACGA^ 

TTTTTTTATACTCTTTTTCTTTTCTTATAATATTTTTTTTGGTTTTTCTTT 

CTAAATU^GGAAATGCTCTTTTTGTGAAATATATACACTTCGTTTG 

>G2154 Amino Acid Sequence (domain in AA coordinates: 97-119)

MDPNESHHHHQQQQLHHLHQQQQQQQQQQRLTSPYFHHQLQHHHHLPTTVATTASTGNAV

PSSNNGLFPPQPQPQHQPNDGSSSLAVYPHSVPSSAVTAPMEPVKRKRGRPRKYVTPEQA

LAAKKLASSASSSSAKQRRELAAVTGGTVSTNSGSSKKSQLGSVGKTGQCFTPHIVNIAP

GEDVVQKIMMFANQSKHELCVLSASGTISNASLRQPAPSGGNLPYEGQYEILSLSGSYIR

TEQGGKSGGLSVSLSASDGQIIGGAIGSHLTAAGPVQVILGTFQLDRKKDAAGSGGKGDA 

SNSGSRLTSPVSSGQLLGMGFPPGMESTGRNPMRGNDEQHDHHHHQAGLGGPHHFMMQAP 

QGIHMTHSRPSEWRGGGNSGHDGRGGGGYDLSGRIGHESSENGDYEQQIPD* 

>G2157 (306..1238)

TCTTTTGATTTTAACCTTTTTTCAGTAGCAAGCCAAAAAAAAAAAACAGACAAAGAAG^ 
CCTTTTATGATAAAGGTATGATGATAGCAAACAAATGATACCCCCATGTCTTGTGTGTCT 
GCTTCATGCAACATGTTGGTTTGGATTTGGTTAATCTAAAAGTTTAAGATAAGGTTTTCG 
GATTCTCTTCCTGTCTTGTAATAGTTTCTTGTC^ 
AAAAAAACAAG7y\AAAGAAAAAGATTCTCTTTCTCGTTTTATTTCCATTAGAGAAG7U^A

AAAGAATGGCGAATCCTTGGTGGGTAGGGAATGTTGCGATCGGTGGAGTTGAGAGTCCAG 

TGACGTCATCAGCTCCTTCTTTGCACCACAGAAACAGTAACAACAACAACCCACCGACTA 

TGACTCGTTCGGATCCAAGATTGGACCATGACTTCACCACCAACAACAGTGGAAGCCCTA 

ATACCCAGACTCAGAGCCAAGAAGAACAGAACAGCAGAGACGAGCAACCAGCTGTTGAAC 

CCGGATCCGGATCCGGGTCTACGGGTCGTCGTCCTAGAGGTAGACCTCCTGGTTCCAAGA 

ACAAACCAAAGAGTCCAGTTGTTGTTACCAAAGAAAGCCCTAACTCTCTCCAGAGCCATG 

TTCTTGAGATTGCTACGGGAGCTGACGTGGCGGAAAGCTTAAACGCCTTTGCTCGTAGAC 

GCGGCCGGGGCGTTTCGGTGCTGAGCGGTAGTGGTTTGGTTACTAATGTTACTCTGCGTC 

AGCCTGCTGCATCCGGTGGAGTTGTTAGTTTACGTGGTCAGTTTGAGATCTTGTCTATGT 

GTGGGGCTTTTCTTCCTACGTCTGGCTCTCCTGCTGCAGCCGCTGGTTTAACCATTTACT 

TAGCTGGAGCTCAAGGTCAAGTTGTGGGAGGTGGAGTTGCTGGCCCGCTTATTGCCTCTG 

GACCCGTTATTGTGATAGCTGCTACGTTTTGCAATGCCACTTATGAGAGGTTACCGATTG 

AGGAAGAACAACAGCAAGAGCAGCCGCTTCAACTAGAAGATGGGAAGAAGCAGAAAGAAG 

AGAATGATGATAACGAGAGTGGGAATAACGGAAACGAAGGATCGATGCAGCCGCCGATGT 

ATAATATGCCTCCTAATTTTATCCCAAATGGTCATCAAATGGCTCAACACGACGTGTATT 

GGGGTGGTCCTCCGCCTCGTGCTCCTCCTTCGTATTGATTAGTTAGATAGGCGGTGGTTG 

GTGCGTTCTTTTTACTGGAATGATTATATTTTCCATTAGGATGGTTAGGCTTTTGTTTAT 

TAAAGCTATCAAGTTTCTTTTTTTTTTACGGATAATTCGGATGACAATTAGCTAGTGTTT 

GTTTGTTTGTTTTGTGGCGGCTTTTCTGACTTGACTATTTTGATCGCGGATAGCTTTGTA 

TGAAAGTGAATTGATTGTAGAATCGTCTTTTGAATTTTGATGTTGGAAAAAACCAA 

>G2157 Amino Acid Sequence (domain in AA coordinates: 82-102, 164-107)

MANPWWVGNVAIGGVESPVTSSAPSLHHRNSNNNNPPTMTRSDPRLDHDFTTNNSGSPNT

QTQSQEEQNSRDEQPAVEPGSGSGSTGRRPRGRPPGSKNKPKSPVVVTKESPNSLQSHVL
EIATGADVAESLNAFARRRGRGVSVLSGSGLVTNVTLRQPAASGGVVSLRGQFEILSMCG
AFLPTSGSPAAAAGLTIYLAGAQGQVVGGGVAGPLIASGPVIVIAATFCNATYERLPIEE

EQQQEQPLQLEDGKKQKEENDDNESGNNGNEGSMQPPMYNMPPNFIPNGHQMAQHDVYWG

GPPPRAPPSY* 

>G2181 (1..1005) 

ATGATGCTTGCGGTGGAAGATGTGTTAAGCGAACTCGCCGGAGAAGAAAGGAACGAGAGA 
GGATTGCCACCTGGCTTCCGGTTTCACCCGACGGACGAAGAGCTCATTACCTTCTACTTA 
GCTTCCAAAATCTTCCATGGTGGTCTCTCCGGCATTCACATXTCCGAAGTTGATCTCAAC
CGCTGTGAACCTTGGGAGCTACCAGAAATGGCGAAGATGGGAGAGAGAGAGTGGTACTTT 
TATAGTCTAAGGGACAGGAAATATCCGACAGGTTTGAGGACTAACAGAGCAACTACTGCT 
GGATACTGGAAAGCTACCGGCAAAGATAAGGAAGTCTTCTCCGGCGGAGGAGGACAGCTT 
GTTGGGATGAAGAAGACGTTGGTGTTCTACAAAGGTAGGGCTCCACGTGGCCTCAAGACT 
AAGTGGGTCATGCATGAGTATCGCCTCGAAAACGACCATTCACACCGCCACACGTGTAAG 
GAGGAATGGGTGATTTGCAGAGTGTTCAATAAAACAGGAGACAGAAAAAATGTTGGATTA 
ATCCATAACCAAATCAGCTACCTTCATAACCATTCACTCTCAACAACACATCATCATCAT
CATGAAGCCTTACCTTTGCTTATAGAACCTTCCAACAAAACCCTAACCAACTTCCCATCA
CTACTCTACGATGATCCACACCAAAACTACAATAATAACAACTTCCTTCATGGATCATCA 
GGCCACAACATCGACGAGCTCAAAGCCTTAATCAACCCTGTCGTCTCTCAGCTCAACGGT 
ATCATCTTTCCTTCAGGGAACAACAACAACGACGAAGACGACTTCGACTTTAACCTCGGC 
GTGAAAACAGAGGAGTCTTCGAACGGTAACGAAATTGACGTACGAGATTACTTGGAGAAC 
CCTCTGTTTCAGGAAGCGAGTTATGGTCTGTTGGGTTTTTCGTCTTCTCCTGGACCTCTT 
CACATGCTACTAGATTCTCCATGTCCTTTAGGATTCCAGCTGTAG 

>G2181 Amino Acid Sequence (conserved domain in AA coordinates: 22-169)

MMLAVEDVLSELAGEERNERGLPPGFRFHPTDEELITFYLASKIFHGGLSGIHISEVDLN

RCEPWELPEMAKMGEREWYFYSLRDRKYPTGLRTNRATTAGYWKATGKDKEVFSGGGGQL

VGMKKTLVFYKGRAPRGLKTKWVMHEYRLENDHSHRHTCKEEWVICRVFNKTGDRKNVGL

IHNQISYLHNHSLSTTHHHHHEALPLLIEPSNKTLTNFPSLLYDDPHQNYNNNNFLHGSS

GHNIDELKALINPVVSQLNGIIFPSGNNNNDEDDFDFNLGVKTEQSSNGNEIDVRDYLEN

PLFQEASYGLLGFSSSPGPLHMLLDSPCPLGFQL*

>G221 (115.. 795) 

CTCTCTTATTCTCTCACTCTTTTTTTTTTATATTCCTCTCTCT 

ATTTAAAAACTTGATCGTATATAATAAAGTAAATAAAGAATAATAACAAAAAAAATGGAG 
AAAAGAGGAGGAGGAAGTAGTGGAGGTTCGGGATCATCAGCAGAAGCAGAAGTGAGAAAA 
GGACCATGGACGATGGAAGAAGATCTTATTCTTATCAACTATATCGCCAACCACGGCGAT
GGTGTTTGGAATTCTCTCGCCAAATCTGCAGGTCTAAAACGAACCGGGAAAAGTTGCCGG 
CTCCGGTGGCTGAACTATCTCCGCCCCGACGTACGACGGGGAAACATCACTCCAGAAGAG 
CAACTTATCATCATGGAACTTCATGCTAAGTGGGGAAACAGGTGGTCGAAAATCGCCAAA 
CATCTTCCAGGAAGAACGGACAACGAGATCAAAAATTTCTGTAGGACAAGAATTCAAAAA 
TACATGAAGCAATCGGATGTAACAACAACATCGTCCGTTGGATCTCATCATAGCTCAGAG 
ATCAACGATCAAGCTGCAAGCACGTCGAGCCATAATGTCTTTTGTACACAAGATCAAGCG 
ATGGAGACTTATTCTCCTACACCGACATCATATCAACATACCAATATGGAATTCAACTAT 
GGTAACTATTCGGCCGCGGCAGTGACGGCAACCGTGGATTATCCAGTACCGATGACCGTT 
GATGATCAAACCGGTGAAAACTATTGGGGCATGGATGATATTTGGTCATCAATGCATTTA
TTGAATGGTAATTGATTGATCGGTGGACAAAACATGGAATATTAATTGAGTATTATATAT 
GATTTTTAGGAGTACTATTATTAGTACGTGACATGTATATGTTTTTGCCTCGTTGTAGAG 
GTTTGGGGTTATAATTAATATATAATGTTATCTAATATGCAACCTTGATACATATTTGGA 
TCTTTATTGAACCCATGTTATACATAAATAAAATTGTTGAAGGGGTCATAAAAAAAT^ATUV 
AAAAAAAAAAAAA 

>G221 Amino Acid Sequence (domain in AA coordinates: 21-125) 

MEKRGGGSSGGSGSSAEAEVRKGPWTMEEDLILINYIANHGDGVWNSLAKSAGLKRTGKS

CRLRWLNYLRPDVRRGNITPEEQLIIMELHAKWGNRWSKIAKHLPGRTDNEIKNFCRTRI

QKYIKQSDVTTTSSVGSHHSSEINDQAASTSSHNVFCTQDQAMETYSPTPTSYQHTNMEF

NYGNYSAAAVTATVDYPVPMTVDDQTGENYWGMDDIWSSMHLLNGN*

>G2290 (119.. 982) 

TTCTTTCTTTCTTTCTTTCTCTTCCAATCAAGAACAAACCCTAGCTCCTCTCTTTTTCTC 
TCTCTACCTCTCTTTCTCTATCTTCTCTTATCACTACTTCTCTCGCCGATCAATCATCAT 
GAACGATCCTGATAATCCCGATCTGAGCAACGACGACTCTGCTTGGAGAGAACTGACACT 
CACAGCTCAAGATTCTGACTTCTTCGACCGAGACACTTCCAATATCCTCTCTGACTTCGG 
TTGGAACCTCCACCACTCCTCCGATCATCCTCACAGTCTCAGATTCGACTCCGATTTAAC 
ACAAACCACCGGAGTCAAACCTACCACCGTCACTTCTTCTTGTTCCTCATCCGCCGCCGT 
TTCCGTTGCCGTTACCTCTACTAATAATAATCCCTCAGCTACCTCAAGTTCAAGTGAAGA 
TCCGGCCGAGAACTCAACCGCCTCCGCCGAGAAAACACCACCACCGGAGACACCAGTGAA 
GGAGAAGAAGAAGGCTCAAAAGCGAATTCGGCAACCAAGATTCGCATTCATGACCAAGAG 
TGATGTGGATAATCTTGAAGATGGATATCGATGGCGTAAATATGGACAAAAAGCCGTCAA 
GAATAGCCCATTCCCAAGGAGCTACTATAGATGCACAAACAGCAGATGCACGGTGAAGAA 
GAGAGTAGAACGTTCATCAGATGATCCATCGATAGTGATCACAACATACGAAGGACAACA 
TTGCCATCAAACCATTGGATTCCCTCGTGGTGGAATCCTCACTGCACACGACCCACATAG 
CTTCACTTCTCATCATCATCTCCCTCCTCCATTACCAAATCCTTATTATTACCAAGAACT 
CCTTCATCAACTTCACAGAGACAATAATGCTCCTTCACCGCGGTTACCCCGACCTACTAC 
TGAAGATACACCTGCCGTGTCTACTCCATCAGAGGAAGGCTTACTTGGTGATATTGTACC 
TCAAACTATGCGCAACCCTTGAGGTAAGCTTGGTACGTAGCAATAGCTAAGGAGdTGCTA 
ACTCATTATATATAGAAGATATTGCAGACCAGAATATGCGCAGGGAGGGTATAACAATAT 
GGCGTTGTAACAATGGATCTATATATTACCTCATTGTTGATCAATAGCACACCACCGGTA 
CGTTTGCAATTTCTTCATGTATATTTCTTGTTATATATGTAGTTATATATCCAGGTATAA 
TTTTGATGTAACACAACATTAATCTTAATCGTGGATCCATCCCACATTTGATGCATGTAT 
GTGCACTTAAGAAAAAGAACATGGAGGAAATAACGTTATTTTTTATTATTCT 

>G2290 Amino Acid Sequence (conserved domain in AA coordinates: 147-205)

MNDPDNPDLSNDDSAWRELTLTAQDSDFFDRDTSNILSDFGWNLHHSSDHPHSLRFDSDL

TQTTGVKPTTVTSSCSSSAAVSVAVTSTNNNPSATSSSSEDPAENSTASAEKTPPPETPV

KEKKKAQKRIRQPRFAFMTKSDVDNLEDGYRWRKYGQKAVKNSPFPRSYYRCTNSRCTVK

KRVERSSDDPSIVITTYEGQHCHQTIGFPRGGILTAHDPHSFTSHHHLPPPLPNPYYYQE

LLHQLHRDNNAPSPRLPRPTTEDTPAVSTPSEEGLLGDIVPQTMRNP*

>G2299 (231.. 941) 

GCCAAAATTTTACCAACATTTITCTCTTCTCATATCA^ 
CACACTTGACTGCCCTGTTTTTTTTCCTCATO 

TCCCCCTGAAGCCTAGCTATTTCTTTTTATTTGCATTAATCTCGGGATCCGAATCGA 

AAGCAATC^GAATAATAGACTTGTACGATACTTGTGCCTAAGCTAACACAATGGCAGAGG 

AATACTACAGCCTCCGCTCGGAGAGAGTAACTCAGCTTCTTGTCCCTAACTCGGAGTCTG 

ACTCAGTGAGTGACAAAAGCAAAGCTGAGCAAAGCGAGAAGAAGACTAAACGTGGGAGAG 

ACTCCGGTAAACACCCTGTTTATCGCGGAGTAAGGATGAGGAACTGGGGAAAATGGGTGT 
CGGAGATTCGTGAGCCGAGGAAGAAATCACGTATTTGGCTGGGAACTTTCCCGACGCCGG
AGATGGCGGCGCGTGCACACGACGTGGCGGCTCTGAGCATTAAAGGAACGGCCGCTATAC 
TAAACTTCCCTGAACTCGCTGACTCATTCCCTCGACCCGTTTCATTAAGCCCTCGAGACA 
TTCAGACAGCAGCTCTTAAAGCAGCTCACATGGAACCGACGACGTCGTTTTCATCTTCCA 
CGTCTTCGTCGTCGTCTTTGTCTTCTACGTCTTCGCTCGAGTCTCTTGTGTTGGTGATGG 
ACCTCTCGAGGACTGAGTCGGAGGAGCTCGGTGAGATTGTGGAGCTTCCAAGTCTCGGGG 
CGAGTTACGACGTCGACTCGGCTAACCTTGGGAACGAGTTTGTCTTCTATGACTCAGTTG 
ACTACTGTTTATATCCGCCGCCGTGGGGACAGTCGTCCGAAGATAACTATGGTCACGGAA 
TTAGCCCTAATTTTGGCCATGGCTTGTCATGGGATCTCTAACAGTTTATTTTGTATCATT 
ACCATAATGTTTTGTTTAAAACAGTTTATTTTGTATCATTGCCATAATGTTTTGTTTAAT 
CACGTTTTTAAAACCCTTTGCTGTTTTTGTTTTTTTTTTGAGTTTTT 

>G2299 Amino Acid Sequence (conserved domain in AA coordinates : 48-115) 

MAEEYYSLRSERVTQLLVPNSESDSVSDKSKAEQSEKKTKRGRDSGKHPVYRGVRMRNWG 

KWVSEIREPRKKSRIWLGTFPTPEMAARAHDVAALSIKGTAAILNFPELADSFPRPVSLS

PRDIQTAALKAAHMEPTTSFSSSTSSSSSLSSTSSLESLVLVMDLSRTESEELGEIVELP 

SLGASYDVDSANLGNEFVFYDSVDYCLYPPPWGQSSEDNYGHGISPNFGHGLSWDL* 

>G2340 (274..1275)

ATACAAAACTCCCTCTTCTCTATCTTCTTCATCTTAAAGAAAAAATAAGAGATATTCGTA 

AAGAGAGAACACAAAATTTCAGTTTACGAAAAGCTAGCAAAGTCGAGTATCGAGGAATAA

CAGAATAAGACGTATCTATCCTTGCCTTAATGTTCTTACCAAAAGATCTAGTCCTTTCTT 

TGTATGATCGATCCATCACAAGCCCACAACAACAACAACTACATCTCTTTCTCTATCTCT 

AGCTTCTATTTTTAATACATTCAAGAATCAAGAATGGTACGGACGCCGTGTTGTAGAGCA 

GAAGGGTTGAAGAAAGGAGCATGGACTCAAGAAGAAGACCAAAAGCTTATCGCCTATGTT 

CAACGACATGGTGAAGGCGGTTGGCGAACCCTTCCGGACAAAGCTGGACTCAAAAGATGT 

GGCAAAAGCTGCAGATTGAGATGGGCGAATTACTTAAGACCTGACATTAAACGTGGAGAG 

TTTAGCCAAGACGAGGAAGATTCCATCATCAACCTCCACGCCATTCATGGCAACAAATGG 

TCGGCCATAGCTCGTAAAATACCAAGAAGAACAGACAATGAGATCAAGAACCATTGGAAC 

ACTCACATCAAGAAATGTCTGGTCAAGAAAGGTATTGATCCGTTGACCCACAAATCCCTT 

CTCGATGGAGCCGGTAAATCATCTGACCATTCCGCGCATCCCGAGAAAAGCAGCGTTCAT 

GACGACAAAGATGATCAGAATTCAAATAACAAAAAGTTGTCAGGATCATCATCAGCTCGG 

TTTTTGAACAGAGTAGCAAACAGATTCGGTCATAGAATCAACCACAATGTTCTGTCTGAT 

ATTATTGGAAGTAATGGCCTACTTACTAGTCACACTACTCCAACTACAAGTGTTTCAGAA 

GGTGAGAGGTCAACGAGTTCTTCCTCCACACATACCTCTTCGAATCTCCCCATCAACCGT 

AGCATAACCGTTGATGCAACATCTCTATCCTCATCCACGTTCTCTGACTCCCCCGACCCG 

TGTTTATACGAGGAAATAGTCGGTGACATTGAAGATATGACGAGATTTTCATCAAGATGT 

TTGAGTCATGTTTTATCTCATGAAGATTTATTGATGTCCGTTGAGTCTTGTTTGGAGAAT 

ACTTCATTCATGAGGGAAATTACAATGATCTTTCAAGAGGATAAAATCGAGACGACGTCG 

TTTAATGATAGCTACGTGACGCCGATCAATGAAGTTGATGACTCCTGTGAAGGGATTGAC 
AATTATTTTGGATGAGTTATATTGA 

TTAGAGTTTGATTTGCTATGGTGTTTTTAGTTTGTGTGTGTAGTGTGTTO 
AAAAAAAAAAAAAAAAAAAAAAAAAA 

>G2340 Amino Acid Sequence (domain in AA coordinates: 14-120)

MVRTPCCRAEGLKKGAWTQEEDQKLIAYVQRHGEGGWRTLPDKAGLKRCGKSCRLRWANY

LRPDIKRGEFSQDEEDSIINLHAIHGNKWSAIARKIPRRTDNEIKNHWNTHIKKCLVKKG

IDPLTHKSLLDGAGKSSDHSAHPEKSSVHDDKDDQNSNNKKLSGSSSARFLNRVANRFGH
RINHNVLSDIIGSNGLLTSHTTPTTSVSEGERSTSSSSTHTSSNLPINRSITVDATSLSS

STFSDSPDPCLYEEIVGDIEDMTRFSSRCLSHVLSHEDLLMSVESCLENTSFMREITMIF

QEDKIETTSFNDSYVTPINEVDDSCEGIDNYFG*

>G2346 (1..1011)

ATGGAGTTGTTAATGTGTTCGGGTCAGGCCGAGTCAGGTGGTTCTTCTTCCACCGAGTCT 
TCTTCACT(^GTGGTGGA^ 

TCCAGAAGCAAGAACCGGGTCAATACCGTTCGTAAGTCGTCTACCACGGCGAGGTGCCAA 

GTGGAAGGTTGTAGAATGGATCTAAGCAATGTTAAAGCTTATTACTCGAGACACAAAGTT 
TGTTGCTVTTCACTCTAAATCATCTAAAGTC^ 

CAACAATGTAGCAGGTTTCACCAGCTTTCTGAGTTTGACTTGGAGAAAAGAAGT^ 

AGAAGACTCGCTTGTCATAACGAACGACGAAGAAAACCACAACCCACAACGGCTCTTTTC 

ACTTCTCATTACTCTCGAATCGCTCCATCTCTTTACGGAAACCCCAATGCTGCAATGATC
AAAAGCGTTTTGGGAGATCCTACTGCGTGGTCAACCGCAAGATCAGTGATGCAGCGGCCT

GGACCGTGGCAGATTAATCCAGTTAGGGAAACCCATCCACACATGAATGTTTTATCACAT 

GGAAGCTCAAGCTTTACTACATGTCCAGAGATGATAAACAACAATAGCACAGATTCAAGC

TGTGCTCTCTCTCTTCTGTCAAACTCATACCCAATTCATCAGCAGCAACTTCAGACACCA 

ACAAATACATGGCGACCATCTTCTGGTTTCGACTCGATGATCTCATTCTCCGATAAGGTT

ACAATGGCTCAGCCACCGCCCATTTCAACCCATCAGCCGCCCATCTCAACACATCAGCAG 

TACCTCAGCCAAACTTGGGAAGTCATCGCGGGCGAAAAGAGCAATTCACATTATATGTCT 

CCTGTGAGTCAAATCTCGGAGCCAGCAGATTTCCAGATAAGCAATGGCAGTGTGTCGCCC 

TATTCTCCTCCGTCCTTACTATCTCTTGTGTGCTACTTGCGGCCGCTATAG 

>G2346 Amino Acid Sequence (domain in AA coordinates: 59-135) 

MELLMCSGQAESGGSSSTESSSLSGGLRFGQKIYFEDGSGSRSKNRVNTVRKSSTTARCQ 

VEGCRMDLSNVKAYYSRHKVCCIHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCR 

RRLACHNERRRKPQPTTALFTSHYSRIAPSLYGNPNAAMIKSVLGDPTAWSTARSVMQRP

GPWQINPVRETHPHMNVLSHGSSSFTTCPEMINNNSTDSSCALSLLSNSYPIHQQQLQTP

TNTWRPSSGFDSMISFSDKVTMAQPPPISTHQPPISTHQQYLSQTWEVIAGEKSNSHYMS 

PVSQISEPADFQISNGSVSPYSPPSLLSLVCYLRPL* 

>G237 (1..852) 

ATGGCGAAGACGAAATATGGAGAGAGACATAGGAAAGGGTTATGGTCACCTGAAGAAGAC
GAGAAGCTAAGGAGCTTCATCCTCTCTTATGGCCATTCTTGCTGGACCACTGTTCCCATC 
AAAGCTGGGTTACAAAGGAATGGGAAGAGCTGCAGATTAAGATGGATTAATTACCTAAGA 
CCAGGGTTAAAGAGGGATATGATTAGTGCAGAAGAAGAAGAGACTATCTTGACGTTTCAT 
TCTCCCTTGGGTAACAAGTGGTCGCAAATAGCTAAATTCTTACCGGGAAGAACAGACAAT 
GAGATAAAGAACTATTGGCACTCTCATTTGAAAAAGAAATGGCTCAAGTCTCAGAGCTTA 
CAAGATGCAAAATCTATTTCCCCTCCTTCGTCTTCATCATCATCACTTGTTGCTTGTGGA 
GAAAGAAATCCGGAAACCTTGATCTCGAATCACGTGTTCTCCCTCCAGAGACTTCTAGAG
AACAAATCTTCATCTCCCTCACAAGAAAGCAACGGAAATAACAGCCATCAATGTTCTTCT
GCTCCTGAGATTCCAAGGCTTTTCTTCTCTGAATGGCTTTCTTCTTCATATCCCCACACC 
GATTATTCCTCTGAGTTTACCGACTCTAAGCACAGTCAAGCTCCAAATGTCGAAGAGACT 
CTCTCAGCTTATGAAGAAATGGGTGATGTTGATCAGTTCCATTACAACGAAATGATGATC 
AACAACAGCAACTGGACTCTTAACGACATTGTGTTTGGTTCCAAATGTAAGAAGCAGGAG
CATCATATTTATAGAGAGGCTTCAGATTGTAATTCTTCTGCTGAATTCTTTTCTCCACCA 
ACAACGACGTAAATTGCGTTTATTGTAATGTAAATCAAATTTCTAAGGCAAAACCGGAAA
AAAAAAAAAAAAAAAAAAAA 

>G237 Amino Acid Sequence (domain in AA coordinates: 11-113) 

MAKTKYGERHRKGLWSPEEDEKLRSFILSYGHSCWTTVPIKAGLQRNGKSCRLRWINYLR

PGLKRDMISAEEEETILTFHSPLGNKWSQIAKFLPGRTDNEIKNYWHSHLKKKWLKSQSL

QDAKSISPPSSSSSSLVACGERNPETLISNHVFSLQRLLENKSSSPSQESNGNNSHQCSS 

APEIPRLFFSEWLSSSYPHTDYSSEFTDSKHSQAPNVEETLSAYEEMGDVDQFHYNEMMI 

NNSNWTLNDIVFGSKCKKQEHHIYREASDCNSSAEFFSPPTTT* 

>G2373 (48.. 1199) 

GCAAAATCCTCAGATCGTCTTACCTTCTCCGAATCGATCGATTTTTCATGGAGGACGACG 
ACGAGATTCAGTCAATTCCATCTCCGGGAGATTCTTCCCTTTCACCACAAGCTCCTCCTT 
CTCCGCCGATTTTGCCAACAAACGACGTGACGGTGGCCGTCGTGAAGAAACCACAACCGG 
GGCTTTCTTCTCAATCTCCGTCCATGAACGCTTTAGCGTTAGTGGTTCATACTCCTTCTG 
TAACCGGTGGTGGTGGTAGCGGAAACAGAAACGGACGAGGAGGAGGAGGAGGAAGCGGTG 
GTGGTGGAGGAGGAAGAGATGATTGTTGGAGCGAAGAAGCTACAAAGGTTCTAATCGAAG 
CTTGGGGAGATCGATTCTCTGAACCAGGTAAAGGAACTTTGAAGCAACAACATTGGAAAG 
AAGTAGCTGAGATTGTGAACAAGAGTCGTCAATGCAAATACCCTAAAACTGATATTCAGT 
GTAAGAACAGAATTGATACGGTGAAGAAGAAGTATAAGCAAGAGAAAGCTAAGATTGCTT 
CTGGTGATGGACCTAGTAAATGGGTTTTCTTCAAGAAGCTTGAGAGTTTGATTGGTGGTA 
CTACAACATTCATTGCTTCTTCAAAAGCTTCAGAGAAGGCTCCTATGGGAGGAGCTCTTG 
GGAATAGCCGTTCGAGTATGTTTAAACGGCAAACTAAAGGTAATCAGATTGTGCAGCAAC 
AACAAGAGAAGAGAGGCTCTGATTCGATGCGGTGGCATTTTAGGAAACGTAGTGCTTCTG 
AGACTGAGTCTGAGTCTGATCCTGAACCTGAGGCTTCTCCTGAGGAATCTGCTGAGAGTC 
TCCCACCTTTGCAACCGATTCAACCGCTTTCGTTTCATATGCCAAAGCGGTTGAAGGTGG 
ATAAGAGTGGAGGTGGAGGGAGTGGAGTTGGAGATGTGGCGAGGGCGATACTTGGATTTA 
CGGAAGCTTATGAGAAGGCGGAAACTGCTAAGCTTAAGTTAATGGCGGAACTGGAAAAGG
AGAGGATGAAATTTGCTAAAGAGATGGAGTTGCAGAGAATGCAGTTCTTGAAAACTCAAT

TGGAGATAACACAGAACAATCAAGAAGAGGAAGAGAGGAGCAGGCAGCGAGGAGAAAGGA 

GGATCGTTGATGATGATGATGATCGCAATGGCAAGAATAACGGCAATGTAAGTAGCTGAC 

AATTGAACACACAAATGTTCCTATGATATTTGCTATGATAAGCTGGATTTTAGGTTTTGA 
TGG 

>G2373 Amino Acid Sequence (domain in AA coordinates: 290-350)
MEDDDEIQSIPSPGDSSLSPQAPPSPPILPTNDVTVAVVKKPQPGLSSQSPSMNALALVV

HTPSVTGGGGSGNRNGRGGGGGSGGGGGGRDDCWSEEATKVLIEAWGDRFSEPGKGTLKQ
QHWKEVAEIVNKSRQCKYPKTDIQCKNRIDTVKKKYKQEKAKIASGDGPSKWVFFKKLES

LIGGTTTFIASSKASEKAPMGGALGNSRSSMFKRQTKGNQIVQQQQEKRGSDSMRWHFRK
RSASETESESDPEPEASPEESAESLPPLQPIQPLSFHMPKRLKVDKSGGGGSGVGDVARA

ILGFTEAYEKAETAKLKLMAELEKERMKFAKEMELQRMQFLKTQLEITQNNQEEEERSRQ
RGERRIVDDDDDRNGKNNGNVSS*

>G2376 (39..1370)

CACGAGCTTCTGACTCAGATCCGGCGATATCGAATTCCATGGAGGACGATGAAGACATCC 
GATCTCAGGGTTCCGATTCACCTGATCCGTCTTCCTCCCCGCCGGCGGGACGAATCACGG 
TTACGGTGGCTTCGGCAGGTCCGCCTTCTTATTCTCTGACTCCTCCGGGTAATTCGTCGC 
AGAAGGATCCGGATGCGTTGGCTCTGGCGCTGCTTCCGATTCAGGCCAGCGGTGGAGGGA 
ATAACAGCAGTGGGAGACCAACCGGCGGCGGCGGGAGGGAGGATTGTTGGAGCGAAGCAG
CTACGGCTGTGTTGATTGATGCGTGGGGTGAGAGATACTTGGAGCTTAGCAGAGGGAATC 
TGAAGCAGAAGCACTGGAAAGAGGTGGCTGAGATTGTGAGCAGCAGAGAGGATTACGGTA 
AAATTCCCAAAACTGATATACAGTGTAAGAATAGGATCGATACGGTGAAGAAGAAGTATA 
AACAAGAGAAGGTGAGAATCGCTAACGGCGGTGGCCGTAGCAGATGGGTGTTCTTCGACA 
AGCTTGACCGTCTGATTGGATCAACGGCGAAGATCCCGACGGCAACTTCTGGAGTCAGCG 
GTCCTGTCGGAGGATTGCATAAGATTCCTATGGGTATTCCAATGGGAAGTCGTTCGAATC 
TGTACCATCAGCAAGCTAAGGCTGCAACACCGCCTTTCAATAATCTTGACCGGTTAATTG
GAGCTACGGCTAGAGTCTCAGCTGCTTCTTTCGGTGGCAGTGGTGGAGGAGGCGGAGGAG 
GATCTGTCAATGTACCTATGGGAATTCCGATGAGTAGCCGTTCAGCTCCGTTTGGACAGC 
AAGGGAGGACTCTGCCACAGCAAGGTAGGACACTGCCACAGCAACAGCAGCAAGGGATGA 
TGGTGAAAAGGTGTAGTGAGTCGAAACGCTGGCGTTTCAGGAAGAGGAACGCTTCTGATT 
CAGACTCGGAATCTGAAGCAGCAATGTCAGATGATTCCGGTGACAGTTTACCACCTCCTC 
CTCTGTCGAAGAGGATGAAGACGGAGGAGAAGAAGAAGCAAGATGGTGATGGAGTGGGGA 
ACAAATGGAGGGAGCTGACTCGGGCAATCATGAGATTCGGTGAAGCTTATGAGCAAACAG 
AGAATGCGAAACTGCAACAGGTGGTTGAGATGGAGAAAGAGAGGATGAAGTTCTTGAAGG 
AGCTTGAGTTGCAGAGAATGCAGTTCTTTGTGAAGACTCAATTGGAGATATCACAACTTA 
AGCAGCAACATGGGAGGAGAATGGGAAACACCAGTAATGATCATCATCACAGCCGCAAGA 

ACAACATCAATGCGATTGTCAACAACAACAACGATTTGGGTAATAACTAGAATTTAGTGA 
TGCAGTGTCGTAATTGATATATTTTAGATTTGAG 

>G2376 Amino Acid Sequence (domain in AA coordinates: 79-178, 336-408)

MEDDEDIRSQGSDSPDPSSSPPAGRITVTVASAGPPSYSLTPPGNSSQKDPDALALALLP

IQASGGGNNSSGRPTGGGGREDCWSEAATAVLIDAWGERYLELSRGNLKQKHWKEVAEIV

SSREDYGKIPKTDIQCKNRIDTVKKKYKQEKVRIANGGGRSRWVFFDKLDRLIGSTAKIP

TATSGVSGPVGGLHKIPMGIPMGSRSNLYHQQAKAATPPFNNLDRLIGATARVSAASFGG

SGGGGGGGSVNVPMGIPMSSRSAPFGQQGRTLPQQGRTLPQQQQQGMMVKRCSESKRWRF

RKRNASDSDSESEAAMSDDSGDSLPPPPLSKRMKTEEKKKQDGDGVGNKWRELTRAIMRF

GEAYEQTENAKLQQVVEMEKERMKFLKELELQRMQFFVKTQLEISQLKQQHGRRMGNTSN 

DHHHSRKNNINAIVNNNNDLGNN* 

>G24 (194..724)

CGGACGCGTGGGCAAATATTAAAATAAAAAGTGTCGGTGAATTCTCAATCTTTGTCTTCT 
TTCGTCGTCTCTTTAAAACTCCTCCGTCCCTCCTTATTATGTAACCGTCTCGCCGTCAAA
TTTTCAAAATCTCTCCCTCCGTTCATAAACCCAGATCGAAATTTATGGTTTTGTAATTTT 
TTTACCGGCGGTTATGGAGACGGAAGCGGCGGTGACAGCGACGGTTACGGCGGCGACGAT 
GGGGATTGGGACGAGGAAGAGAGATCTGAAACCGTATAAAGGAATACGAATGAGGAAATG 
GGGGAAATGGGTGGCGGAGATACGGGAACCGAATAAGAGATCAAGGATCTGGTTAGGTTC 
TTATGCGACGCCTGAAGCGGCGGCGAGAGCTTACGACACTGCTGTTTTTTACCTCCGTGG 
TCCTTCAGCGAGGCTTAATTTTCCGGAGCTTTTGGCTGGACTTACTGTTTCTAACGGCGG 
AGGAAGAGGTGGTGATTTATCGGCGGCGTATATTAGGAGAAAAGCGGCGGAGGTTGGTGC 
TCAGGTTGATGCGCTTGGAGCGACGGTGGTTGTGAATACCGGCGGCGAGAATCGCGGTGA

TTACGAGAAGATTGAGAATTGTCGTAAGAGCGGTAACGGGTCATTGGAACGGGTCGATTT 

GAATAAATTACCCGACCCGGAAAATTCGGATGGTGATGATGACGAATGTGTGAAAAGAAG 

ATAGAAAAAATAAAAAGTAGTTGTAGAAGGAGAGACGAGAATGTTTGTCTTTAAGATGCG 

CTGTTGCCGCTAACATGCGCTTTCGATTTTAGTGTTAAACATGCGCCTCCATTGTTTTTG 

GGTTTTGTTTTCGTCGTCGATAATCAAAGATTTTAAAACACAATTCTCAAATTTTTCACT 

TGTTACAAACTAGATTTGCATGATCTTTGTATTAACGAATAACGATTAAGTCCTAAA

>G24 Amino Acid Sequence (domain in AA coordinates: 25-93) 

METEAAVTATVTAATMGIGTRKRDLKPYKGIRMRKWGKWVAEIREPNKRSRIWLGSYATP 

EAAARAYDTAVFYLRGPSARLNFPELLAGLTVSNGGGRGGDLSAAYIRRKAAEVGAQVDA 

LGATVVVNTGGENRGDYEKIENCRKSGNGSLERVDLNKLPDPENSDGDDDECVKRR* 

>G2424 (1..999)

ATGAGGATGGAGATGGTGCATGCTGACGTGGCGTCTCTCTCCATAACACCTTGCTTCCCG 
TCTTCTTTGTCTTCGTCCTCACATCATCACTATAACCAACAACAACATTGTATCATGTCG 
GAAGATCAACACCATTCGATGGATCAGACCACTTCATCGGACTACTTCTCTTTAAATATC
GACAATGCTCAACATCTCCGTAGCTACTACACAAGTCATAGAGAAGAAGACATGAACCCT 
AATCTAAGTGATTACAGTAATTGCAACAAGAAAGACACAACAGTCTATAGAAGCTGTGGA
CACTCGTCAAAAGCTTCGGTGTCTAGAGGACATTGGAGACCAGCTGAAGATACTAAGCTC 
AAAGAACTAGTCGCCGTCTACGGTCCACAAAACTGGAACCTCATAGCTGAGAAGCTCCAA 
GGAAGATCCGGGAAAAGCTGTAGGCTTCGATGGTTTAACCAACTAGACCCAAGGATAAAT 
AGAAGAGCCTTCACTGAGGAAGAAGAAGAGAGGCTAATGCAAGCTCATAGGCTTTATGGT 
AACAAATGGGCGATGATAGCGAGGCTTTTCCCTGGTAGGACTGATAATTCTGTGAAGAAC
CATTGGCATGTTATAATGGCTCGCAAGTTTAGGGAACAATCTTCTTCTTACCGTAGGAGG 
AAGACGATGGTTTCTCTTAAGCCACTCATTAACCCTAATCCTCACATTTTCAATGATTTT 
GACCCTACCCGGTTAGCTTTGACCCACCTTGCTAGTAGTGACCATAAGCAGCTTATGTTA 
CCAGTTCCTTGCTTCCCAGGTTATGATCATGAAAATGAGAGTCCATTAATGGTGGATATG 
TTCGAAACCCAAATGATGGTTGGCGATTACATTGCATGGACACAAGAGGCAACTACATTC 
GATTTCTTAAACCAAACCGGGAAGAGTGAGATATTTGAAAGAATCAATGAGGAGAAGAAA 
CCACCATTTTTCGATTTTCTTGGGTTGGGGACGGTGTGA 

>G2424 Amino Acid Sequence (conserved domain in AA coordinates: 107-219)

MRMEMVHADVASLSITPCFPSSLSSSSHHHYNQQQHCIMSEDQHHSMDQTTSSDYFSLNI

DNAQHLRSYYTSHREEDMNPNLSDYSNCNKKDTTVYRSCGHSSKASVSRGHWRPAEDTKL

KELVAVYGPQNWNLIAEKLQGRSGKSCRLRWFNQLDPRINRRAFTEEEEERLMQAHRLYG

NKWAMIARLFPGRTDNSVKNHWHVIMARKFREQSSSYRRRKTMVSLKPLINPNPHIFNDF

DPTRLALTHLASSDHKQLMLPVPCFPGYDHENESPLMVDMFETQMMVGDYIAWTQEATTF

DFLNQTGKSEIFERINEEKKPPFFDFLGLGTV* 

>G2505 (1..1026) 

ATGGGTTCTTCGTCGAACGGAGGAGTGCCACCTGGTTTCCGGTTTCATCCGACGGACGAA 

GAGCTTCTCCATTACTACTTGAAGAAGAAAATCTCTTACCAAAAGTTTGAGATGGAAGTC 

ATCAGAGAGGTTGACTTAAACAAGCTTGAGCCTTGGGATTTGCAAGAGAGATGCAAGATA 

GGATCMCACCAaW^CGAATGGTACTTCTTCAGCCACAAGGACAGGAAATATCCG 

GGGTCAAGGACCAACCGTGCTACTCATGCAGGGTTCTGGAAGGCGACGGGACGTGACAAG 

TGCATAAGGAACTCTTACAAAAAGATAGGAATGAGGAAGACACTTGTGTTCTACAAAGGT 

AGAGCTCCTCATGGCCAAAAGACTGATTGGATCATGGATGAGTACCGTCTTGAAGACGCT

GATGATCCTCAAGCCAACCCTAGTGAAGATGGATGGGTGGTATGTAGAGTGTTTATGAAG 

AAAAATTTGTTCAAGGTAGTAAATGAAGGTAGCTCAAGCATTAACTCATTGGACCAACAC 

AACCATGACGCATCTAACAACAACCATGCACTTCAAGCTCGTAGCTTTATGCACGGAGAC 

AGTCCATACCAGCTAGTACGTAACCACGGAGCCATGACATTCGAACTTAACAAGCCTGAC 

CTTGCTCTTCATCAATACCCACCAATCTTCCACAAGCCACCTTCACTTGGATTTGACTAC

TCTTCAGGACTTGCAAGGGACAGTGAGAGTGCGGCTAGTGAAGGGTTACAATACCAGCAA 

GCGTGTGAGCCGGGTTTAGACGTTGGTACATGTGAGACAGTGGCTAGTCATAATCATCAA 

CAAGGTCTAGGTGAATGGGCAATGATGGATAGACTTGTGACTTGTCACATGGGAAATGAA 

GATTCCTCTAGAGGGATTACGTATGAGGATGGTAACAACAATTCGTCCTCTGTGGTTCAG 

CCAGTTCCCGCGACGAACCAGCTAACATTGCGTAGTGAGATGGATTTCTGGGGTTATTCT 

AAATAG 

>G2505 Amino Acid Sequence (domain in AA coordinates: 10-159) 
MGSSSNGGVPPGFRFHPTDEELLHYYLKKKISYQKFEMEVIREVDLNKLEPWDLQERCKI
GSTPQNEWYFFSHKDRKYPTGSRTNRATHAGFWKATGRDKCIRNSYKKIGMRKTLVFYKG
RAPHGQKTDWIMHEYRLEDADDPQANPSEDGWVVCRVFMKKNLFKVVNEGSSSINSLDQH
NHDASNNNHALQARSFMHRDSPYQLVRNHGAMTFELNKPDLALHQYPPIFHKPPSLGFDY

SSGLARDSESAASEGLQYQQACEPGLDVGTCETVASHNHQQGLGEWAMMDRLVTCHMGNE

DSSRGITYEDGNNNSSSVVQPVPATNQLTLRSEMDFWGYSK*

>G2512 (64..798)

AACTTAGTGCCACTTAGACACAATAAGAAAACCGTTAACAAGAAGAAAAAAAAAAGATCG 
AAAATGGAATATCAAACTAACTTCTTAAGTGGAGAGTTTTCCCCGGAGAACTCTTCTTCA 
AGCTCATGGAGCTCACAAGAATCATTCTTGTGGGAAGAGAGTTTCTTACATCAATCATTT 
GACCAATCCTTCCTTTTATCTAGCCCTACTGATAACTACTGTGATGACTTCTTTGCATTT 
GAATCATCAATCATAAAAGAAGAAGGAAAAGAAGCCACCGTGGCGGCCGAGGAGGAGGAG 
AAGTCATACAGAGGAGTGAGGAAACGGCCGTGGGGGAAATTCGCGGCCGAGATAAGAGAC 
TCAACGAGGAAAGGGATAAGAGTGTGGCTTGGGACATTCGACACCGCGGAGGCGGCGGCT 
CTCGCTTATGATCAGGCGGCTTTCGCTTTGAAAGGCAGCCTCGCAGTACTCAATTTCCCC 
GCGGATGTCGTTGAAGAATCTCTCCGGAAGATGGAGAATGTGAATCTCAATGATGGAGAG 
TCTCCGGTGATAGCCTTGAAGAGAAAACACTCCATGAGAAACCGTCCTAGAGGAAAGAAG 
AAATCTTCTTCTTCTTCGACGTTGACATCTTCTCCTTCTTCCTCCTCCTCCTATTCATCT 
TCTTCGTCTTCTTCTTCTTTGTCGTCAAGAAGTAGAAAACAGAGTGTTGTTATGACGCAA 
GAAAGTAATACAACACTTGTGGTTCTTGAGGATTTAGGTGCTGAATACTTAGAAGAGCTT 
ATGAGATCATGTTCTTGATAATCTCTGCTTCTACAATTTTTATGTAATTTGA 

>G2512 Amino Acid Sequence (conserved domain in AA coordinates: 79-139)
MEYQTNFLSGEFSPENSSSSSWSSQESFLWEESFLHQSFDQSFLLSSPTDNYCDDFFAFE
SSIIKEEGKEATVAAEEEEKSYRGVRKRPWGKFAAEIRDSTRKGIRVWLGTFDTAEAAAL
AYDQAAFALKGSLAVLNFPADVVEESLRKMENVNLNDGESPVIALKRKHSMRNRPRGKKK

SSSSSTLTSSPSSSSSYSSSSSSSSLSSRSRKQSVVMTQESNTTLVVLEDLGAEYLEELM
RSCS* 

>G2513 (69.. 698) 

TTTCAACAGTAATTTAAGTTAACCGGAGTCTCTTTTTGTTTTCCGGCGAATTTTTGGTAC 
TTTGAGTTATGAATAATGATGATATTATTCTGGCGGAGATGAGGCCTAAGAAGCGTGCGG
GAAGGAGAGTGTTTAAGGAGACACGTCACCCAGTTTACAGAGGCATAAGGCGGAGGAACG 
GTGACAAATGGGTCTGCGAAGTCAGAGAACCGACGCACCAACGCCGCATTTGGCTCGGGA 
CTTATCCCACAGCAGATATGGCAGCGCGTGCACACGACGTGGCGGTTTTAGCTCTGCGTG 
GGAGATCCGCATGTTTGAATTTCGCCGACTCCGCTTGGCGGCTTCCGGTGCCGGAATCCA 
ATGATCCGGATGTGATAAGAAGAGTTGCGGCGGAAGCTGCGGAGATGTTTAGGCCGGTGG 
ATTTAGAAAGTGGAATTACGGTTTTGCCTTGTGCGGGAGATGATGTGGATTTGGGTTTTG 
GTTCGGGTTCCGGCTCTGGTTCGGGATCGGAGGAGAGGAATTCTTCTTCGTATGGATTTG 
GAGACTACGAAGAAGTCTCAACGACGATGATGAGACTCGCGGAGGGGCCACTAATGTCGC 
CGCCGCGATCGTATATGGAAGACATGACTCCTACTAATGTTTACACGGAAGAAGAGATGT 
GTTATGAAGATATGTCATTGTGGAGTTACAGATATTAAGTGGGACTCACATATCTACTAT 
ACATAATATTTAGCTTTTATGTAAGAGGTATTTATGTGAGTTTTAAGATTGTAGATGTGT 
CCCAGGCGTTAGAAGTTTCCTTGATGGTATGGAATCTTTGTACCTATAAAATTATAAAAT 

>G2513 Amino Acid Sequence (domain in AA coordinates: TBD) 

MNNDDIILAEMRPKKRAGRRVFKETRHPVYRGIRRRNGDKWVCEVREPTHQRRIWLGTYP 

TADMAARAHDVAVLALRGRSACLNFADSAWRLPVPESNDPDVIRRVAAEAAEMFRPVDLE

SGITVLPCAGDDVDLGFGSGSGSGSGSEERNSSSYGFGDYEEVSTTMMRLAEGPLMSPPR 

SYMEDMTPTNVYTEEEMCYEDMSLWSYRY* 

>G2519 (83..691)

CSiAAGTGAAAACATAAGATCATCTTCTTCGTTGATAGATCAATATAGGAACTCCAGAAGA 
GAATCTTGATCAATTAAGTATCATGTCTCACATCGCTGTTGAAAGGAATCGAAGAAGGCA 
AATGAACGAGCATCTTAAATCCCTTCGTTCTTTGACTCCTTGTTTCTACATCAAAAGGGG 
AGATCAAGCTTCGATCATCGGAGGAGTGATAGAGTTCATCAAAGAGTTGCAGCAATTGGT 
TCAAGTTCTTGAGTCCAAGAAACGTCGAAAGACCCTAAACCGACCATCnTCCCTTATGA 
TCACCAGACAATCGAGCCATCCAGTTTAGGAGCCGCCACTACCCGAGTACCGTTTAGTCG 
AATCGAAAATGTGATGACCACAAGTACTTTCAAGGAAGTAGGAGCATGCTGTAACTCCCC 
TCATGCTAACGTAGAAGCAAAGATTTCAGGTTCTAATGTTGTATTGAGAGTTGTCTCTAG 
GCGAATCGTGGGGCAGCTCGTAAAGATCATCTCTGTCTTAGAGAAGCTATCTTTTCAAGT 
TCTTCACCTCAATATTAGTAGCATGGAGGAGACTGTCTTATACTTTTTCGTTGTTAAGAT
AGGATTGGAGTGTCACTTAAGCTTGGAGGAGCTAACTCTTGAAGTTCAGAAAAGCTTTGT 
GTCTGATGAAGTGATCGTCTCTACCAATTAAAAACAAAATTCTACATGTACTAGAGCGTG 
TATCGTTTTTTGGGATTAATAATCATATAATCGTTACATGAGCCTTGATACTTTGCTAGA 
AATAAGCTCCTCTAAACAAAACCTTCTTTTTAAAAAAACACACTTATGTTTTACTTAGTT 
TGTTGTTGTATCCGAAGTTGATCAACGTTGTAATTTCCCACAATAAATCATGACATTTTA 
TATGCTCT 

>G2519 Amino Acid Sequence (domain in AA coordinates: 1-65)

MSHIAVERNRRRQMNEHLKSLRSLTPCFYIKRGDQASIIGGVIEFIKELQQLVQVLESKK 

RRKTLNRPSFPYDHQTIEPSSLGAATTRVPFSRIENVMTTSTFKEVGACCNSPHANVEAK 

ISGSNVVLRVVSRRIVGQLVKIISVLEKLSFQVLHLNISSMEETVLYFFVVKIGLECHLS

LEELTLEVQKSFVSDEVIVSTN* 

>G2520 (133..1197)

AAGGAGTTTTGCATACTCACCAAGCCACAATCATTTCTCTCTTCTCTATCTCTCTGGTTT 

TGAATCGGCGACGACTGAGTCAACTCGGTGTTGTTACTGGTTTCGTCGTATGTGTTGTAA 

CTGATTAAGTTGATGGATCCGAGTGGGATGATGAACGAAGGAGGACCGTTTAATCTAGCG 

GAGATCTGGCAGTTTCCGTTGAACGGAGTTTCAACCGCCGGAGATTCTTCTAGAAGAAGC 

TTCGTTGGACCGAATCAGTTCGGTGATGCTGATCTAACCACAGCTGCTAACGGTGATCCA 

GCGCGTATGAGTCACGCGTTGTCTCAGGCGGTTATTGAAGGTATCTCCGGCGCTTGGAAA 

CGGAGGGAAGATGAGTCTAAGTCGGCGAAGATCGTCTCCACCATTGGCGCTAGTGAAGGT

GAGAACAAAAGACAGAAGATAGATGAAGTGTGTGATGGGAAAGCAGAAGCAGAATCGCTA 

GGAACAGAGACGGAACAAAAGAAGCAACAGATGGAACCAACGAAAGATTATATTCATGTT 

CGAGCTAGAAGAGGTCAAGCTACTGATAGTCACAGTTTAGCTGAAAGAGCGAGAAGAGAG 

AAAATAAGTGAGCGGATGAAAATCTTGCAAGATCTTGTTCCGGGATGTAACAAGGTTATT 

GGAAAAGCACTTGTTCTAGATGAGATAATTAACTATATACAATCATTGCAACGTCAAGTT 

GAGTTCTTATCGATGAAGCTTGAAGCAGTCAACTCAAGAATGAACCCTGGTATCGAGGTT 

TTTCCACCCAAAGAGGTGATGA1^CTCATGATCATC71ACTCAATCTTCTCCATTTTTTTC 

ACAAAACAATACATGTTTCTATCGAGGTATTCTCGGGGTAGGAGTCTCGATGTTTATGCG 

GTTCGGTCATTTAAGCATTGCAATAAACGGAGTGACCTCTGTTTTTGCTCCTGCTCCCCA 

AAAACAGAACTTAAGACAACTATATTTTCACAAAACATGACATGTTTCTGTCGATATTCT 

CGAGTAGGAGTCGCTATTAGTTCATCTAAGCATTGCAATGAACCGTTTGGTCAGCAAGCG

TTTGAGAATCCGGAGATACAGTTCGGGTCGCAGTCTACGAGGGAATACAGTAGAGGAGCA 

TCACCAGAGTGGTTGCACATGCAGATAGGATCAGGTGGTTTCGAAAGAACGTCTTGA 

>G2520 Amino Acid Sequence (domain in AA coordinates: 135-206) 

MDPSGMMNEGGPFNLAEIWQFPLNGVSTAGDSSRRSFVGPNQFGDADLTTAANGDPARMS

HALSQAVIEGISGAWKRREDESKSAKIVSTIGASEGENKRQKIDEVCDGKAEAESLGTET

EQKKQQMEPTKDYIHVRARRGQATDSHSLAERARREKISERMKILQDLVPGCNKVIGKAL 

VLDEIINYIQSLQRQVEFLSMKLEAVNSRMNPGIEVFPPKEVMILMIINSIFSIFFTKQY 

MFLSRYSRGRSLDVYAVRSFKHCNKRSDLCFCSCSPKTELKTTIFSQNMTCFCRYSRVGV 

AISSSKHCNEPFGQQAFENPEIQFGSQSTREYSRGASPEWLHMQIGSGGFERTS*

>G2533 (1..1080) 

ATGATAAGCAAGGATCCAATATCGAGTTTACCTCCAGGGTTTCGATTTCATCCAACAGAT

GAAGAACTCATTCTCCATTACCTAAGGAAGAAAGTTTCCTCTTCCCCAGTCCCGCTTTCG 

ATTATCGCCGATGTCGATATCTACAAATCCGATCCATGGGATTTACCAGCTAAGGCTCCA 

TTTGGGGAGAAAGAGTGGTATTTTTTCAGTCCGAGGGATAGGAAATATCCAAACGGAGCA 

AGACCAAACAGAGCAGCTGCGTCTGGATATTGGAAAGCAACCGGAACAGATAAATTGATT 

GCGGTACCAAATGGTGAAGGGTTTCATGAAAACATTGGTATAAAAAAAGCTCTTGTGTTT 

TATAGAGGAAAGCG5CC7VAAAGGTGTTAAAACCAATTGGATCATGCATGAATATCGTCTT 

GCCGATTCATTATCTCCCAAAAGAATTAACTCTTCTAGGAGCGGTGGTAGCGAAGTTAAT 

AATAATTTTGGAGATAGGAATTCTAAAGAATATTCGATGAGACTGGATGATTGGGTTCTT 

TGCCGGATTTACAAGAAATCACACGCTTCATTGTCATCACCTGATGTTGCTTTGGTCACA 

AGCAATCAAGAGCATGAGGAAAATGACAACGAACCATTCGTA 

CCAAATTTGCAAAATGATCAACCCCTTAAACGCCA 

TTACTAGACGCTACAGATTTGACGTTTCTCGCAAATTTTCTAAACGAAACCCCG 
CGTTCTGAATCAGATTTTTCTTTCATGATTGGCAATTTCTCTAATCCTGACATTTACGGA 
AACCATTACTTGGATCAGAAGTTACCGCAGTTGAGCTCTCCCACTTCAGAGACAAGCGGC 
ATCGGAAGCAAAAGAGAGAGAGTGGATTTTGCGGAAGAAACGATAAACGCTTCGAAGAAG 
ATGATGAACACATATAGTTACAATAATAGTATAGATCAAATGGATCATAGTATGATGCAA

CAACCTAGTTTCCTGAACCAGGAACTCATGATGAGTTCTCACCTTCAATATCAAGGCTAG 
>G2533 Amino Acid Sequence (conserved domain in AA coordinates: 11-186)

MISKDPISSLPPGFRFHPTDEELILHYLRKKVSSSPVPLSIIADVDIYKSDPWDLPAKAP

FGEKEWYFFSPRDRKYPNGARPNRAAASGYWKATGTDKLIAVPNGEGFHENIGIKKALVF

YRGKPPKGVKTNWIMHEYRLADSLSPKRINSSRSGGSEVNNNFGDRNSKEYSMRLDDWVL 

CRIYKKSHASLSSPDVALVTSNQEHEENDNEPFVDRGTFLPNLQNDQPLKRQKSSCSFSN 

LLDATDLTFLANFLNETPENRSESDFSFMIGNFSNPDIYGNHYLDQKLPQLSSPTSETSG

IGSKRERVDFAEETINASKKMMNTYSYNNSIDQMDHSMMQQPSFLNQELMMSSHLQYQG* 
>G2534 (1..975)

ATGGATAATATAATGCAATCGTCAATGCCACCGGGATTCCGATTTCATCCGACAGAGGAA 

GAGCTTGTGGGTTATTACCTAGATAGGAAGATCAATTCAATGAAGAGTGCTTTAGATGTC 

ATTGTAGAGATTGATCTCTACAAAATGGAGCCATGGGATATACAAGCGAGGTGTAAACTA 

GGGTATGAAGAGCAAAACGAGTGGTACTTCTTTAGTCATAAGGACAGGAAGTACCCTACC 

™r AGGACC ^ CCGAGCCACTGCGGCTGGGTOCTGG ^ G CCACGGGTAGAGA 

GCGGTACTATCAAAAAACAGTGTCATCGGAATGCGGAAGACACTTGTCTACTACAAGGGT 

CGAGCTCCTAATGGAAGAAAGTCCGATTGGATCATGCACGAATACCGTCTCCAAAACTCC 

GAGCTTGCCCCGGTTCAGGAGGAAGGCTGGGTGGTGTGTCGAGCATTTAGGAAGCCAATT 

CCAAACCAGAGGCCATTAGGGTACGAGCCATGGCAGAACCAGCTCTACCACGTCGAAAGT 

AGTAACAACTACTCATCTTCAGTGACAATGAACACGAGTCATCATATCGGTGCATCTTCA 

TCAAGTCATAACCTTAATCAAATGCTCATGAGCAATAACCACTACAATCCTAATAATACA 

TCCTCATCGATGCATCAATATGGCAACATTGAGCTCCCGCAGTTGGACAGCCCGAGCTTG 

TCGCCTAGTTTAGGGACGAATAAAGATCAGAACGAGAGTTTCGAGOiAG^GMGAGAAG 

AGCTTTAACTGTGTGGATTGGAGAACACTAGATACCTTGCTTGAGACACAAGTCATACAT 

CCGCATAACCCTAATATTCTTATGTTCGAAACGCAGTCGTATAATCCGGCGCCAAGCTTC 

CCTTCCATGCATCAAAGCTATAATGAGGTCGAAGCTAATATTCATCATTCTCTTGGATGC 
TTCCCTGACTCGTAA 



>G2534 Amino Acid Sequence (conserved domain in AA coordinates -in 
MDNIMQSSMPPGFRFHPTEEELVGYYLDRKINSMKSALDVIVEIDLYKMEPWDIQARCKL
GYEEQNEWYFFSHKDRKYPTGTRTNRATAAGFWKATGRDKAVLSKNSVIGMRKTLVYYKG
RAPNGRKSDWIMHEYRLQNSELAPVQEEGWVVCRAFRKPIPNQRPLGYEPWQNQLYHVES
SNNYSSSVTMNTSHHIGASSSSHNLNQMLMSNNHYNPNNTSSSMHQYGNIELPQLDSPSL
SPSLGTNKDQNESFEQEEEKSFNCVDWRTLDTLLETQVIHPHNPNILMFETQSYNPAPSF
PSMHQSYNEVEANIHHSLGCFPDS*
>G2573 (34..957)

CCAGATTTAATTTGAGACTCTCAAAGAAACACCATGGAAGAAGAGCAACCTCCGGCCAAG 
AAACGAAACATGGGGAGATCTAGAAAAGGTTGCATGAAAGGTAAAGGCGGTCCAGAGAAC 

rS ACG r TACTOTCCGTGGAGTOAGGCAACGG ^^ 

CGTGAGCCTAACCGTGGGACTCGTCTCTGGCTCGGCACGTTTAATACCTCGGTCGAGGCr 
GCCATGGCTTACGATGAAGCCGCTAAGAAACTCTATGGAC^GAGGOTA^CTCAACTTG 
GTGCACCCACAACAACAACAACAAGTAGTAGTGAACAGAAACTTGTCTTTTTCTGGCCAC 

CTCGGCCAGGCAAGTTGTTCACGAGGTTCTTGCTCAGAGAGATCGAGTTTTCTACAAGAA 
GA ! GA I GATCATAGTCAT ^ TCGATCTOCGTC ^ 

ttacctaaac^g T gattcacaagatcaaga<»ccgttaatgctacgactagSa^c 

gg * ga : ggcg ^^ 

A ™^ GAATTA T A ^ ATACAATGGAGC ^ GTCT ^ 

AA ^ A ^? A ^^^^^^^^^^^^^^^atcgtcggacaacaaggagagtatg 

TATTTGGAAATGGATGATCTTTTGGAGATTGATGATTTAGGTTTGTTGATTGGCAAAAAT 

TTOTTATTTATTACTATTATTTATCATACATATTTCTTATATTTGACTTAGG 
>G2573 Amino Acid Sequence (domain in AA coordinates: TBD)

TWATTSYGGEGGGGSTLTFSTNLKPKNI^SQNYGLYNGAWSRFLVGQF^™"--- 



»E 

>EKKTEHDVSSSC 
GSSDNKESMLVPSCGGERMHRPELEERTGYLEMDDLLEIDDLGLLIGKNGDFKNWCCEEF
QHPWNWF* 

>G2589 (23..1354)

AAAGAAAAGAAAAATAAAGATAATGAGGACGAAGACTAAGTTAGTACTCATACCTGATAG

ACACTTTCGGAGAGCCACATTCAGGAAGAGGAATGCAGGGATAAGGAAGAAACTCCACGA 

GCTGACAACTCTCTGTGACATCAAAGCATGTGCGGTAATCTACAGTCCGTTCGAGAATCC 

AACGGTGTGGCCGTCAACCGAAGGTGTTCAAGAGGTGATTTCGGAGTTCATGGAGAAGCC 

GGCGACAGAACGGTCCAAGACGATGATGAGTCATGAGACTTTCTTGCGGGACCAAATCAC 

CAAAGAACAAAACAAACTAGAGAGTCTACGTCGTGAAAACCGAGAAACTCAGCTTAAGCA 

TTTTATGTTTGATTGCGTTGGAGGCAAGATGAGTGAGCAACAGTATGGTGCAAGGGACCT 

TCAAGATTTAAGTCTTTTTACTGATCAATATCTTAATCAGCTTAATGCCAGGAAGAAGTT

CCTTACAGAATATGGTGAGTCTTCTTCTTCTGTTCCTCCTCTGTTTGATGTTGCGGGTGC 

CAATCCTCCTGTTGTTGCAGATCAAGCTGCGGTAACTGTTCCTCCTTTGTTTGCTGTTGC 

GGGTGCCAATCTTCCTGTTGTTGCTGATCAAGCTGCGGTAACTGTTCCTCCTCTGTTTGC 

TGTTGCGGGTGCCAATCTTCCTGTTGTTGCAGATCAAGCTGCGGTTAATGTTCCTACTGG 

ATTTCATAACATGAATGTGAACCAGAATCAGTATGAGCCGGTTCAGCCCTATGTCCCTAC 

TGGTTTTAGTGATCATATTCAATATCAGAATATGAACTTCAATCAAAACCAACAAGAGCC 

GGTTCATTACCAGGCTCTTGCTGTTGCGGGTGCCGGTCTTCCTATGACTCAGAATCAGTA 

TGAGCCCGTTCACTACCAGAGTCTTGCTGTCGCGGGTGGCGGTCTTCCTATGAGTCAGTT 

GCAGTATGAGCCGGTTCAGCCTTATATCCCTACTGTTTTTAGTGATAATGTTCAATATCA 

GCATATGAATTTGTATCAAAATCAACAAGAGCCGGTTCACTACCAAGCTCTTGGTGTTGC 

AGGTGCCGGTCTTCCTATGAATCAGAATCAGTATGAGCCGGTTCAGCCCTATGTCCCTAC 

TGGTTTTAGTGATCATTTTCAGTTTGAGAATATGAATTTGAATCAAAATCAACAGGAGCC 

GGTTCAATACCAAGCTCCTGTTGATTTTAATCATCAGATTCAACAAGGAAACTATGATAT 

GAATTTGAACCAGAATATGAGTTTGGATCCAAATCAGTATCCGTTTCAAAATGATCCATT 

CATGAATATGTTGACAGAATATCCTTATGAATAAGCGGGTTATGTTGGAGAGCATGCAC 

>G2589 Amino Acid Sequence (domain in AA coordinates: TBD) 

MRTKTKLVLIPDRHFRRATFRKRNAGIRKKLHELTTLCDIKACAVIYSPFENPTVWPSTE
GVQEVISEFMEKPATERSKTMMSHETFLRDQITKEQNKLESLRRENRETQLKHFMFDCVG
GKMSEQQYGARDLQDLSLFTDQYLNQLNARKKFLTEYGESSSSVPPLFDVAGANPPVVAD
QAAVTVPPLFAVAGANLPVVADQAAVTVPPLFAVAGANLPVVADQAAVNVPTGFHNMNVN
QNQYEPVQPYVPTGFSDHIQYQNMNFNQNQQEPVHYQALAVAGAGLPMTQNQYEPVHYQS
LAVAGGGLPMSQLQYEPVQPYIPTVFSDNVQYQHMNLYQNQQEPVHYQALGVAGAGLPMN
QNQYEPVQPYVPTGFSDHFQFENMNLNQNQQEPVQYQAPVDFNHQIQQGNYDMNLNQNMS
LDPNQYPFQNDPFMNMLTEYPYE*
>G2687 (45..1139)

CTCTGTCTCTCGTATCTTTCTACTACTCTGTTTCTTGAATTCTAATGAACAACATCGACG 

ACGCAAAGACGGAGACTTCAGTGTCTTCAGGTTCAAGCGACTCTTTCTTGCCTCTCAAGA 

AACGCATGAGACTTGATGACGAACCAGAAAACGCCCTAGTGGTTTCGTCTTCACCAAAGA 

CGGTTGTGGCTTCTGGCAATGTCAAGTACAAAGGAGTCGTTCAGCAACAGAACGGTCATT 

GGGGTGCCCAGATTTACGCAGACCACAAAAGGATTTGGCTTGGAACTTTCAAATCCGCTG 

ATGAAGCCGCCACGGCTTACGATAGTGCATCTATCAAACTCCGAAGCTTTGACGCTAACT 

CGC^CCGGAACTTCCCTTGGTCTACAATCACTCTCAACGAACCAGACTT^ 

ACACAACAGAGACTGTGTTGAACATGATCAGAGACGGTTCGTACCAACACAAATTCAGAG 

ATTTTCTCAGAATCAGATCTCAGATTGTTGCGAGTATCAACATCGGGGGACCAAAACAAG 

CCCGAGGAGAAGTGAATCAAGAATC^GACAAGTGTTTTTCTTGCACACAGCTTTTTCAGA 

AGGAATTGACACCGAGCGATGTAGGGAAACTAAATAGGCTTGTGATACCTAAAAAGTATG 

CAGTGAAGTATATGGCTTTCATAAGCGCTGATCAAAGCGAGAAAGAAGAGGGTGAAATAG 

TAGGATCTGTGGAAGATGTGGAGGTTGTGTTTTACGACAGAGCAATGAGACAATGGAAGT 

TTAGGTATTGTTACTGGAAAAGTAGCCAGAGCTTTGTCTTCACCAGAGGATGGAATAGTT 

TCGTGAAGGAGAAGAATCTCAAGGAGAAGGATGTTATTGCCTTCTACACTTGCGATGTCC 

CGAACAATGTGAAGACATTAGAAGGTCAAAGAAAGAACTTCTTGATGATCGATGTTCATT 

GCTTTTCAGACAACGGTTCCGTGGTAGCTGAGGAAGTAAGTATGACGGTTCATGACAGTT 

CAGTG(^GTAAAGAAAAC^GAAAACTTGGTTAGCTCCATGTTAGAAGATAAAGAAACCA 

AATCAGAGGAGAACAAAGGAGGGTTTATGCTGTTTGGTGTAAGGATCGAATGTCCTTAGG 

GAATTTTTCTTTAAAAGTTTCTTACTTCAACTAGAACTTGTTTTACTTGTACCT 

>G2687 Amino Acid Sequence (domain in AA coordinates: TBD) 
MNNIDDAKTETSVSSGSSDSFLPLKKRMRLDDEPENALVVSSSPKTVVASGNVKYKGVVQ
QQNGHWGAQIYADHKRIWLGTFKSADEAATAYDSASIKLRSFDANSHRNFPWSTITLNEP

TQLFQKELTPSDVGKLNRLVIPKKYAVKYMAFISADQSEKEEGEIVGSVEDVEVVFYDRA

MRQWKFRYCYWKSSQSFVFTRGWNSFVKEKNLKEKDVIAFYTCDVPNNVKTLEGQRKNFL

MIDVHCFSDNGSVVAEEVSMTVHDSSVQVKKTENLVSSMLEDKETKSEENKGGFMLFGVR
IECP*

>G27 (83..622)

CAAAATACCAAAAACAAAACATTTTTTTTAATCTTCCCACCAATTTTTTTCTCTTTCTCT 
CGTTACATTAAATTATCTTTAGATGCAAGACTCTTCCTCTCACGAATCGCAACGTAACCT 
CCGGTCACCGGTGCCGGAGAAAACCGGAAAGAGTTCTAAGACTAAAAATG^CAAAAAGG 

tgaaattagagaaccaagaaagaaatcaagaatatggctcggtactttctctacgcSga 

gatggcggcgcgtgcacacgacgtggcggctttagccatcaaaggtSSSScS^^ 

taatttcccggagctagcttaccatttgccgagaccggctagcgcggaccct^ 

s^f CGCCGCCGCAGCAGCTGCCGTTGACTGGAAA ^^ 

CACCGTGACGTCATCTCCAGTCGCCGACGACGCTTTCTCCGATCTTCCTGATCTTTTGCT 
TGACGTGAATGATCACAACAAAAACGATGGATTCTGGGACTCGTTTCCGTACGAAGATCC 
TTTCTTCTTGGAAAATTACTAGAAGGCAAATTCTTGCCGGCGAACGGATTTTCTOGTGGT 

^cccggtaaataagaagacgatgtcgttttgtaccttttttgtctacgatgggaaI??? 

CTTTTTTTTTTACGTGTGAGTAAAAGTTTCCGAATGTGTGATGTGTAAGT^GTACAGGT 

ta^aatttcttttttttgtac™ 

GTGCTTTTATCTTCCAAATTCATTAAAAAAAAAAAAAAAAA 

>G27 Amino Acid Sequence (domain in AA coordinates: 37-104)
MQDSSSHESQRNLRSPVPEKTGKSSKTKNEQKGVSKQPNFRGVRMRQWGKWVSEIREPRK 

AA^W^ESPSSTVTSSPVADDAFSDLPDLLLDVNDHNKNBGFWDSFPYEDP^F^ 
>G2720 (1..894)

^ GG !^ GCGAAGAAG ^^ 

CTTCTTCAACGCACCGGCAAATCCTGTCGTCTTCGTTGGGTCAATAAACTCCGTCCCAAT 

C * CAAAAATGGATGC ^^ 

gagtttggtaacaaatoggcgagaatcgctacgtatctaccgggaagaactgataacgS 
gtgaagaatttctggagtagcagacaaaagagactcgctaggattcttcataactcSS 
gatgcatcgagttcgagtttcaatcccaaatcttcttcttctcatcgactca^ggg^ 
^caaaccaatccgtcaatcctctcagggttttggSSa 

GTTTCTTCTTCATGTTCCCAGAIGGTTCCTTATTCATCTGATCAAGTTGCTGATC^GTC 
^ AGG r GCCGGATraGGGTGTTAA ™ A ^ 

^™ CCAGAAAGCAGAGM 

CCTGGACCAGCTGATTOTTOTGAGCCATTGTTCGCTCTCCCTCAGCCGTTCOTCGAGCC^ 
p CG ^ GCCGAGAAG ^^^ 

GACGATTTCCCAGCTGACATGTTTGATCAGGTTGATCCAATCCCAAGTCCTTAG 
>G2720 Amino Acid Sequence (domain in AA coordinates: 10-114)



SSSJSS* ^^^^^Y^SQNDANQQAISPFSPESRELLARLDDPFYYDI 



SS^f E 14 FALP ° PF P PSPVP ^ C ^ SKDE EADVFLDDFPAD M FDQVDPIPSP* 
(-PTPf, ,.' „„„ 1 1 * rXTGTTCTTTTTTTTTTTTTTACTGTATCTTCTCTTCTCTTTG 

GCTGCGATPGCGGCGTTAAACGAACCGGATGGTTCGAGCAAGATGGCAATTTCGAGATAC 
ATCGAGAGATGTTACACCGGTTTAACTTCTGCTCATGCTGCTTTGTTGACTCACCATCTC
AAGACTTTGAAGACCAGTGGTGTTCTTTCTATGGTTAAGAAATCTTACAAAATTGCTGGT 
TCTTCTACTCCTCCTGCTAGTGTAGCTGTTGCTGCTGCTGCCGCCGCTCAAGGTCTCGAT 
GTTCCCAGATCTGAGATTCTCCATTCAAGTAACAACGATCCCATGGCTTCTGGCTCTGCT 
TCTCAGCCTCTGAAACGAGGTCGTGGTCGTCCTCCTAAGCCTAAACCTGAATCTCAACCA 
CAACCACTACAGCAACTTCCACCGACCAATCAAGTCCAGGCTAACGGACAGCCAATCTGG 
GAACAGCAGCAAGTTCAATCACCTGTTCCGGTTCCGACTCCGGTTACAGAGTCGGCGAAG 
AGAGGACCTGGTCGTCCAAGGAAGAACGGTTCTGCTGCTCCTGCTACTGCACCAATCGTT 
CAAGCTTCGGTTATGGCTGGAATTATGAAACGTAGAGGTAGACCACCGGGTCGTCGAGCT 
GCTGGGAGACAGAGGAAGCCCAAATCCGTTTCTTCTACTGCCTCTGTGTATCCTTATGTT
GCTAATGGTGCTAGACGCAGAGGAAGGCCTAGGAGAGTTGTTGACCCTAGCAGTATTGTT 
AGTGTTGCTCCAGTAGGTGGTGAAAATGTGGCAGCGGTTGCGCCAGGGATGAAGCGTGGA 
CGTGGACGACCACCTAAGATTGGTGGTGTTATCAGTAGGCTTATTATGAAGCCTAAGAGA 
GGACGAGGACGTCCTGTAGGTAGACCCAGAAAGATTGGAACATCAGTCACGACTGGGACA
CAAGATTCTGGAGAACTCAAGAAGAAGTTTGATATTTTTCAAGAGAAAGTGAAAGAAATT
GTGAAGGTGTTGAAGGATGGAGTTACAAGTGAGAATCAAGCAGTGGTGCAAGCCATAAAA 
GATCTGGAAGCACTAACAGTGACGGAGACCGTTGAGCCACAAGTTATGGAAGAAGTGCAG 
CCAGAGGAGACTGCAGCACCACAGACTGAAGCTCAACAAACTGAAGCTGCTGAGACACAA 
GGAGGACAAGAAGAAGGACAAGAAAGAGAAGGAGAAACACAGACCCAGACAGAAGCAGAG 
GCAATGCAAGAAGCTCTGTTCTGAAGAATAATAATGATCTAGAAAACAACCTAGACATAA 
TAGCCTTGGTGTTTGGCGTTAGGAGTGTTTTTTTTTAGTTGTTTTAGGTGTTGG71ATCGC 
ATCTTAAATTATATAAAAATCTATAAGGAATTTTAATTTTTCTAGGTTTTGTTGTCTGCA 
GAAGAAGAAATAGTAGACTCGTTAATGGTGTTGTTGTCGGTGTGTCTTTAACCAAACCAT 

AAGACGTGGCTGTAAATTAGCGATGTTTCTAGTCTTCCATCTTTAATAATCTCTTATTGC 
GTCTGTGCCTTTGTTTTT 

>G2787 Amino Acid Sequence (domain in AA coordinates: 172-192, 226-247, 256-276, 290-311, 245-366)

TOPSLGDPHHPPQFTPFPHFPTSNHHPLGPNPYWNHVVFQPQPQTQTQIPQPQMFQLSPH 
VSMPHPPYSEMICAAIAALNEPDGSSKMAISRYIERCYTGLTSAHAALLTHHLKTLKTSG
VLSMVKKSYKIAGSSTPPASVAVAAAAAAQGLDVPRSEILHSSNNDPMASGSASQPLKRG 
RGRPPKPKPESQPQPLQQLPPTNQVQANGQPIWEQQQVQSPVPVPTPVTESAKRGPGRPR 
KNGSAAPATAPIVQASVMAGIMKRRGRPPGRRAAGRQRKPKSVSSTASVYPYVANGARRR 
GRPRRVVDPSSIVSVAPVGGENVAAVAPGMKRGRGRPPKIGGVISRLIMKPKRGRGRPVG
RPRKIGTSVTTGTQDSGELKKKFDIFQEKVKEIVKVLKDGVTSENQAVVQAIKDLEALTV 
TETVEPQVMEEVQPEETAAPQTEAQQTEAAETQGGQEEGQEREGETQTQTEAEAMQEALF 

>G2789 (82..879)

CTTTAGGGACACCAAATCTATTCAACCTAAAAGCCTTCTTTTCCCCTATATTGACCAACT 

TTTTAGCGAATCAGAAGAGGAATGGATGAGGTATCTCGTTCTCATACACCGCAATTTCTA

TCAAGTGATCATCAGCACTATCACCATCAAAACGCTGGACGACAAAAACGCGGCAGAGAA 

GAAGAAGGAGTTGAACCCAACAATATAGGGGAAGACCTAGCCACCTTTCCTTCCGGAGAA

GAGAATATCAAGAAGAGAAGGCCACGTGGCAGACCTGCTGGTTCCAAGAACAAACCCAAA 

GCACCAATCATAGTCACTCGCGACTCCGCGAACGCCTTCAGATGTCACGTCATGGAGATA 

ACCAACGCCTGCGATGTAATGGAAAGCCTAGCCGTCTTCGCTAGACGCCGTCAGCGTGGC 

GTTTGCGTCTTGACCGGAAACGGGGCCGTTACAAACGTCACCGTTAGACAACCTGGCGGA 

GGCGTCGTCAGTTTACACGGACGGTTTGAGATTCTTTCTCTCTCGGGTTCGTTTCTTCCT 

CCACCGGCACCACCAGCTGCGTCTGGTTTAAAGGTTTACTTAGCCGGTGGTCAAGGTCAA 

GTGATCGGAGGCAGTGTGGTGGGACCGCTTACGGCATCAAGTCCGGTGGTCGTTATGGCA 

GCTTCATTTGGAAACGCATCTTACGAGAGGCTGCCACTAGAGGAGGAGGAGGAAACTGAA

AGAGAAATAGATGGAAACGCGGCTAGGGCGATTGGAACGCAAACGCAGAAACAGTTAATG 

CAAGATGCGACATCGTTTATTGGGTCGCCGTCGAATTTAATTAACTCTGTTTCGTTGCCA 

GGTGAAGCTTATTGGGGAACGCAACGACCGTCTTTCTAAGATAATATCATTGATAATATA 

AGTTTCGTCTTCTTATTCTTTTTCACTTTTTAC 

AACGTTTGATTAATACCTGAAGGTTTTTGGAAAATTTTCGATCGGATAAAAGGATTTATG 
TTGCGAGCCGAAACGCGGCC 

>G2789 Amino Acid Sequence (domain in AA coordinates: 53-73, 121-165) 
MDEVSRSHTPQFLSSDHQHYHHQNAGRQKRGREEEGVEPNNIGEDLATFPSGEENIKKRR 
PRGRPAGSKNKPKAPIIVTRDSANAFRCHVMEITNACDVMESLAVFARRRQRGVCVLTGN

GAVTNVTVRQPGGGVVSLHGRFEILSLSGSFLPPPAPPAASGLKVYLAGGQGQVIGGSVV

GPLTASSPVVVMAASFGNASYERLPLEEEEETEREIDGNAARAIGTQTQKQLMQDATSFI

GSPSNLINSVSLPGEAYWGTQRPSF* 

>G31 (13..615)

CTTTTATAAGCAATGGCTCCAAGACAGGCGAACGGTAGAAGCATTGCCGTGAGTGAAGGC 

GGCGGAGGGAAGACGATGACGATGACGACGATGCGGAAGGAAGTGCACTTTAGAGGTGTG 

AGGAAGCGTCCATGGGGTAGATACGCGGCGGAGATCCGTGACCCGGGAAAGAAAACCCGG 

GTTTGGCTCGGGACATTCGACACGGCGGAGGAAGCTGCAAGAGCTTACGACACCGCCGCT 

AGAGAGTTTCGTGGCTCCAAAGCAAAGACTAATTTCCCTCTTCCCGGAGAGTCTACTACG 

GTTAACGACGGTGGCGAGAACGATTCTTACGTCAACCGTACGACGGTGACGACGGCGCGT 

GAGATGACGCGTCAGAGATTTCCGTTTGCATGTCACCGGGAGCGTAAAGTCGTCGGTGGT 

TATGCTTCTGCTGGTTTTTTCTTCGATCCGTCAAGAGCTGCTTCGTTAAGAGCAGAGCTT 

TCTCGGGTTTGTCCGGTTCGGTTTGATCCGGTTAATATCGAGTTGAGTATTGGTATTCGA 

GAAACCGTAAAAGTTGAACCGAGAAGAGAACTAAACCTGGATCTTAACCTAGCTCCACCG 

GTGGTGGACGTTTAGATTTTTTTCTTCTTTTCATAATTTGTATTTTACATTGCCGGAAAA 
TAATTAATGTTTTCTTTAG 

>G31 Amino Acid Sequence (domain in AA coordinates: TBD) 

MAPRQANGRSIAVSEGGGGKTMTMTTMRKEVHFRGVRKRPWGRYAAEIRDPGKKTRVWLG
TFDTAEEAARAYDTAAREFRGSKAKTNFPLPGESTTVNDGGENDSYVNRTTVTTAREMTR
QRFPFACHRERKVVGGYASAGFFFDPSRAASLRAELSRVCPVRFDPVNIELSIGIRETVK
VEPRRELNLDLNLAPPVVDV*

>G33 (20..757)

ATTCTCCCCCAACCAAAATATGACCACAGAAAAAGAGAATGTCACTACGGCCGTGGCCGT 

GAAAGACGGCGGAGAAAAGAGTAAGGAAGTGAGTGACAAGGGCGTAAAGAAGAGAAAGAA 

TGTAACTAAGGCCCTGGCCGTGAATGACGGCGGAGAAAAGAGTAAGGAAGTGCGTTACAG 

GGGTGTAAGGAGGAGACCATGGGGGAGATATGCTGCGGAGATCCGTGATCCGGTAAAGAA 

AAAACGGGTCTGGCTCGGGTCCTTCAACACGGGGGAGGAAGCCGCCAGAGCCTACGACTC 

CGCTGCCATAAGGTTTCGAGGATCGAAAGCTACTACTAACTTCCCTCTAATCGGATACTA 

TGGGATTTCTTCGGCGACGCCGGTGAACAACAACCTTTCCGAGACGGTGAGTGATGGAAA 

TGCCAACCTCCCTCTCGTTGGAGACGATGGGAATGCTTTGGCTTCTCCGGTGAACAACAC 

CCTTTCCGAAACGGCGCGTGATGGAACACTTCCATCGGATTGTCACGACATGTTATCTCC 

GGGGGTGGCTGAAGCGGTTGCTGGATTTTTCTTAGATCTGCCTGAAGTTATTGCGTTGAA

AGAGGAGCTTGATCGAGTTTGTCCTGACCAGTTTGAGTCCATTGATATGGGGTTGACTAT 

TGGTCCTCAAACCGCCGTGGAAGAGCCTGAGACTTCCTCCGCCGTGGATTGTAAGCTGCG 

AATGGAACCGGATCTTGACCTCAACGCAAGTCCCTAAAGATTGATCTGATGTTGTTGTCC 

TTGAATAAGTTTGTTATCTTGTCGCTCTTCTGATTGTCTGTACTTCTATTGGTTGATTCG 

TGCTTTTGGAGGACAAAACAAACATTTTTTTATGTATTAAAAAAAGGTAATTGAACTATT 

ATCGAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 

>G33 Amino Acid Sequence (domain in AA coordinates: 50-117) 
MTTEKENVTTAVAVKDGGEKSKEVSDKGVKKRKNVTKALAVNDGGEKSKEVRYRGVRRRP
WGRYAAEIRDPVKKKRVWLGSFNTGEEAARAYDSAAIRFRGSKATTNFPLIGYYGISSAT
PVNNNLSETVSDGNANLPLVGDDGNALASPVNNTLSETARDGTLPSDCHDMLSPGVAEAV

AGFFLDLPEVIALKEELDRVCPDQFESIDMGLTIGPQTAVEEPETSSAVDCKLRMEPDLD 
LNASP* 

>G342 (1..723) 

ATGGACGTCTACGGCATGTCTTCACCGGACTTGCTTCGTATCGACGACCTTCTCGATTTC 
TCCAACGACGAAATGTTCTCTTCCTCITCCACCGTCACTTCCT 

GCTTCTTCCGAAAACCCTTTCAGCTTTCCTTCTTCCACCTACACTTCTC 

ACCGACTTCACTCACGATCTCTGCGTTCCCAGTGACGACGCAGCTCATCTCGAATGGTTA 

TCGCGATTCGTTGACGATTCATTCTCCGATTTCCCAGCAAATCCTTTAAC(^TGACCGCT 

AGACCCGAGATTTCATTCACCGGAAAACCTAGAAGTCGCCGATCAAGAGCACCAGCACCT 

TCCGTAGCTGGAACTTGGGCTCCGATGTCTGAATCAGAGCTTTGTCACTCCGTCGCTAAA 

CCTAAACCGAAGAAAGTCTACAACGCTGAATCGGTTACGGCGGATGGAGCGAGGCX3 

ACGCACTGTGCCTCGGAGAAAACX3CCACAGTGGAGAACTGGACCGCTTGGACCTAAAACA 

CTTTGTAACGCTTGTGGAGTTCGTTACAAATCAGGGAGGCTTGTACCGGAATACAGACCG 

GCGTCGAGTCCGACGTTTGTATTGACTCAGCATTCGAACTCTCATCGGAAACT 
CTCCGGCGACAGAAGGAACAACAAGAATCTTGCGTTCGAATTCCGCCGTTTCAGCCGCAG
TAA 

>G342 Amino Acid Sequence (domain in AA coordinates: 155-190) 

MDVYGMSSPDLLRIDDLLDFSNDEIFSSSSTVTSSAASSAASSENPFSFPSSTYTSPTLL 

TDFTHDLCVPSDDAAHLEWLSRFVDDSFSDFPANPLTMTVRPEISFTGKPRSRRSRAPAP 

SVAGTWAPMSESELCHSVAKPKPKKVYNAESVTADGARRCTHCASEKTPQWRTGPLGPKT 

LCNACGVRYKSGRLVPEYRPASSPTFVLTQHSNSHRKVMELRRQKEQQESCVRIPPFQPQ 
* 

>G352 (80..817)

AATACACCACACACTTCACTCTTTCTTCATCTTCTTCTTCTTAAATAGCTCGAAATCACA 
TCTCACAGAATTAAATCTTATGGCTCTCGAGACTCTCAATTCTCCAACAGCTACCACCAC 
CGCTCGGCCTCTTCTCCGGTATCGTGAAGAAATGGAGCCTGAGAATCTCGAGCAATGGGC 
TAAAAGAAAACGAACA7^ACGTCAACGTTTTGATCACGGTCATCAGAATC/y\GAAACGAA 
CAAGAACCTTCCTTCTGAAGAAGAGTATCTCGCTCTTTGTCTCCTCATGCTCGCTCGTGG 
CTCCGCCGTACAATCTCCTCCTCTTCCTCCTCTACCGTCACGTGCGTCACCGTCCGATCA 
CCGAGATTACAAGTGTACGGTCTGTGGGAAGTCCTTTTCGTCATACCAAGCCTTAGGTGG 
ACACAAGACGAGTCACCGGAAACCGACGAACACTAGTATCACTTCCGGTAACCAAGAACT 
GTCTAATAACAGTCACAGTAACAGCGGTTCCGTTGTTATTAACGTTACCGTGAACACTGG 
TAACGGTGTTAGTCAAAGCGGAAAGATTCACACTTGCTCAATCTGTTTCAAGTCGTTTGC 
GTCTGGTCAAGCCTTAGGTGGACACAAACGGTGTCACTATGACGGTGGCAACAACGGTAA 
CGGTAACGGAAGTAGCAGCAACAGCGTAGAACTCGTCGCTGGTAGTGACGTCAGCGATGT 
TGATAATGAGAGATGGTCCGAAGAAAGTGCGATCGGTGGCCACCGTGGATTTGACCTAAA 
CTTACCGGCTGATCAAGTCTCAGTGACGACTTCTTAA 

>G352 Amino Acid Sequence (domain in AA coordinates: 99-119,166-186) 

MALETLNSPTATTTARPLLRYREEMEPENLEQWAKRKRTKRQRFDHGHQNQETNKNLPSE

EEYLALCLLMLARGSAVQSPPLPPLPSRASPSDHRDYKCTVCGKSFSSYQALGGHKTSHR

KPTNTSITSGNQELSNNSHSNSGSVVINVTVNTGNGVSQSGKIHTCSICFKSFASGQALG

GHKRCHYDGGNNGNGNGSSSNSVELVAGSDVSDVDNERWSEESAIGGHRGFDLNLPADQV

SVTTS*

>G357 (1..615) 

ATGCAGAACAAACACAAATGCAAGCTCTGTTCCAAGAGTTTCTGTAATGGCAGAGCACTT 
GGTGGTCACATGAAGTCTCACTTGGTCTCATCTCAGTCTTCAGCTCGGAAGAAACTAGGT 
GACTCGGTCTATTCTTCTTCTTCCTCTTCCTCCGATGGTAAAGCGCTCGCCTACGGGTTA 
CGAGAGAACCCGAGGAAGAGTTTCCGGGTCTTTAATCCGGATCCTGAGTCATCCACAATT 
TACAACAGTGAGACAGAGACCGAACCTGAATCCGGAGACCCGGTTAAGAAACGGGTCAGA 
GGAGATGTTTCAAAGAAGAAGAAGAAGAAGGCAAAGAGTAAGAGAGTGTTTGAGAACTCG 
AAGAAGCAAAAGACAATTCACGAGTCACCAGAACCAGCGAGTTCTGTCTCTGATGGTTCT 
CCTGAACAAGATTTAGCTATGTGCTTGATGATGCTGTCAAGAGATTCAAGGGAGCTCGAG 
ATTAAACTGAAAAAACCGGAGGAAGAGAGGAAGCCGGAAAAAAGACATTTCCCTGAGCTC 
CGTCGCTGTATGATAGATCTGAATCTTCCTCCGCCGCAAGAAGCTGAAGCTGTCACCGTC 
GTTTCAGCCATATAA 

>G357 Amino Acid Sequence (domain in AA coordinates: 7-29) 
MQNKHKCKLCSKSFCNGRALGGHMKSHLVSSQSSARKKLGDSVYSSSSSSSDGKALAYGL
RENPRKSFRVFNPDPESSTIYNSETETEPESGDPVKKRVRGDVSKKKKKKAKSKRVFENS
KKQKTIHESPEPASSVSDGSPEQDLAMCLMMLSRDSRELEIKLKKPEEERKPEKRHFPEL
RRCMIDLNLPPPQEAEAVTVVSAI*
>G358 (1..855)

ATGGGTCAAGATGAGGTTGGGAGTGATCAGACGCAAATCATAAAAGGGAAACGTACGAAG 
CGACAAAGATCGTCTTCGACGTTTGTGGTGACGGCGGCGACAA(^GTGACTTCAAC^G^ 
TCATCGGCCGGTGGTAGTGGAGGAGAAAGAGCTGTTTCAGATGAATACAACTCGGCGGTT
TCGTCTCCGGTGACTACTGATTGTACGCAAGAAGAAGAAGACATGGCGATTTGTCTCATC 
ATGTTAGCTCGTGGGACAGTTCTTCCATCGCCGGATCTCAAGAACTCGAGAAAAATTCAT
CAGAAGATTTCGTCGGAGAATTCTAGTTTCTATGTGTACGAGTGTAAAACGTGTAACCGG 
ACGTTTTCGTCGTTCCAAGCACTTGGTGGACACAGAGCGAGCCACAAGAAGCCGAGGACG 
TCGACTGAGGAAAAGACTAGACTACCCCTGACGCAACCCAAGTCTAGTGCATCAGAAGAA 
GGGCAAAACAGTCATTTCAAAGTTTCCGGCTCAGCCCTAGCTTCACAGGCAAGTAACATC 
ATCAACAAGGCAAACAAAGTACACGAGTGTTCCATCTGCGGTTCTGAGTTCACTTCCGGG 
CAAGCTCTCGGTGGTCACATGAGGCGGCACAGGACAGCCGTAACCACGATTAGCCCCGTT

GCAGCCACCGCAGAAGTAAGCAGAAACAGTACAGAGGAAGAGATTGAGATCAATATAGGC 

CGTTCGATGGAACAGCAGAGGAAATATCTACCGTTGGATCTTAATCTACCAGCACCAGAA 

GATGATCTAAGAGAGTCAAAGTTTCAAGGGATAGTATTCTCAGCAACACCAGCGTTAATA 
GATTGTCATTACTAG 

>G358 Amino Acid Sequence (domain in AA coordinates: 124-135, 188-210)

MGQDEVGSDQTQIIKGKRTKRQRSSSTFVVTAATTVTSTSSSAGGSGGERAVSDEYNSAV

SSPVTTDCTQEEEDMAICLIMLARGTVLPSPDLKNSRKIHQKISSENSSFYVYECKTCNR 

TFSSFQALGGHRASHKKPRTSTEEKTRLPLTQPKSSASEEGQNSHFKVSGSALASQASNI 

INKANKVHECSICGSEFTSGQALGGHMRRHRTAVTTISPVAATAEVSRNSTEEEIEINIG

RSMEQQRKYLPLDLNLPAPEDDLRESKFQGIVFSATPALIDCHY* 

>G360 (1..543) 

ATGTGGAACCCTAACAAAATTGAAGAATTGGAGGATGATGATGAATCTTGGGAAGTCAAA 

GCCTTTGAGCAAGACACTAAAGGCAACATCTCTGGTACCACTTGGCCTCCAAGATCTTAC 

ACTTGCAATTTCTGCCGCCGTGAGTTCCGTTCTGCTCAAGCCTTAGGCGGTCACATGAAT 

GTCCACCGCCGTGACCGCGCCTCATCTAGGGCTCATCAAGGTTCCACCGTTGCGGCTGCG 

GCTAGAAGCGGCCACGGGGGGATGTTACTCAATTCTTGTGCTCCGCCGTTGCCTACAACG 

ACACTTATAATACAATCCACGGCGAGTAACATTGAAGGTTTGTCCCATTTCTACCAACTG

CAAAACCCTAGTGGCATTTTTGGTAATTCTGGTGACATGGTGAATCTTTATGTAGAAGTT 

CCTCCTCGGCTTATTGAATATTCGACAGGAGATGATGAGAGCATTGGCTCGATGAAAGAA 

GCGACAGGAACATCAGTGGATGAGCTTGATCTTGAACTTCGGCTAGGGCACCATCCACCG 
TGA 

>G360 Amino Acid Sequence (domain in AA coordinates: 42-62)
MWNPNKIEELEDDDESWEVKAFEQDTKGNISGTTWPPRSYTCNFCRREFRSAQALGGHMN 
VHRRDRASSRAHQGSTVAAAARSGHGGMLLNSCAPPLPTTTLIIQSTASNIEGLSHFYQL 
QNPSGIFGNSGDMVNLYVEVPPRLIEYSTGDDESIGSMKEATGTSVDELDLELRLGHHPP 

>G362 (195..830)

ATAAAAAACCCTTCATACAATATAAAATTTCTTTAGACATACAATATATTATACTATTAC 

AGATGCAATGCATCATTAGTTACAAACTATTAAACTAAATATCCCCCGTCTCTCTCTTGC 

TATATAAAGAAGATCATTTACACATCTCCTTAAGCAAATTAAACCCATCGATAAACACAT 

ACGTTCACACATATATGTCTATAAATCCGACAATGTCTCGTACTGGCGAAAGTTCTTCAG 

GTTCGTCCTCCGACAAGACGATAAAGCTATTCGGCTTCGAACTCATCAGCGGCAGTCGTA 

CGCCGGAAATCACGACGGCGGAAAGCGTGAGCTCGTCCACAAACACGACGTCGTTAACAG 

TGATGAAAAGACACGAGTGCCAATACTGCGGTAAAGAGTTTGCAAATTCTCAAGCCTTAG 

GAGGTCACCAAAACGCTCACAAGAAGGAGAGGTTGAAGAAGAAGAGGCTTCAGCTTCAAG 

CTCGGCGAGCCAGCATCGGCTATTATCTCACCAACCACCAACAACCCATAACGACGTCAT 

TTCAGAGACAATACAAAACGCCGTCGTATTGTGCATTCTCCTCCATGCACGTGAATAATG 

ATCAGATGGGTGTGTACAACGAAGATTGGTCGTCGAGGTCGTCGCAGATTAACTTCGGTA 

ATAATGACACGTGCCAAGATCTTAATGAACAAAGCGGTGAGATGGGTAAGCTGTACGGTG 

TTCGACCGAACATGATTCAGTTCCAGAGAGATCTGAGTTCTCGTTCTGATCAGATGAGAA 

GTATTAACTCGCTGGATCTTCATCTAGGTTTTGCCGGAGATGCGGCATAACAAATTAAAG 

AGAGATATATGATTAAGATTATATGTACTATAGTGGCGTATTTCATTGGGATCATGAAGG 

GGAAAATU^CGAGACATATAGTATTCTTGATGCAATTTGAGTTTTGTAATTTATTTAGGTO 
TATGTATGTTTTCGAAG 

>G362 Amino Acid Sequence (domain in AA coordinates: 62-82) 

MSINPTMSRTGESSSGSSSDKTIKLFGFELISGSRTPEITTAESVSSSTNTTSLTVMKRH 

ECQYCGKEFANSQALGGHQNAHKKERLKKKRLQLQARRASIGYYLTNHQQPITTSFQRQY 

KTPSYCAFSSMHVNNDQMGVYNEDWSSRSSQINFGNNDTCQDLNEQSGEMGKLYGVRPNM

IQFQRDLSSRSDQMRSINSLDLHLGFAGDAA*

>G364 (64..516)

AAGCTTGATATCGCCTCTCTCTAATCTCTCTTTCTCTCTCTATCTOT 

GGTATGGACTACCAGCCAAACACATCCCTACGTCTAAGCCTACCAAGTTACAAGAACCAC 

CAACTAAACCTAGAACTTGTTCTCGA 

ACGAACTCATCATCATG^^ 

AAGTTTTAC^GCTCTCAAGCTCriTGGTGGTCATCAAAACG 
TTAGCCAAGAAGAGTCGAGAACTCTTTAGATCCT 



TACCCGTTCTCCGGTCGCTTTGAGCTTTACGGCCGTGGCTACCAAGGATTTCTCGAAAGT 
GGCGGCTCGAGGGACTTCTCCGCCCGCCGTGTGCCGGAGAGTGGTCTTGATCAGGATCAG 
GAGAAGAGTCACCTTGACTTATCCTTAAGGCTCTAAAAGAATCTTATATTTTGTTAGTCT 
ATATATTATCATATCAATTGTTAATCTTAAAATTGATTGTTTTACTTATTAGTCATTTCC 
TATTATCTGAAAGTTTTCTTTGTAAGTTGTAACTATGGTCCTAAATTCAAATCCAAATTT 
GATTTTGGAAGATGGTACCTAATGCAGTAGTTAAATAAGTTAAAAAAATGAAGGATCTAT
AATTCTCT 

>G364 Amino Acid Sequence (domain in AA coordinates: 54-76) 

MDYQPNTSLRLSLPSYKNHQLNLELVLEPSSMSSSSSSSTNSSSCLEQPRVFSCNYCQRK 

FYSSQALGGHQNAHKLERTLAKKSRELFRSSNTVDSDQPYPFSGRFELYGRGYQGFLESG

GSRDFSARRVPESGLDQDQEKSHLDLSLRL* 

>G365 (69..755)

CAATTCTTTTACTTTCATTCTCTTTATATATTCTCTCTACGCTATAATATATATTACACA 
GAATATACATGGAACCGTCCATCAAAGGAGATCAAGAAATGTTAAAAATCAAGAAACAAG 
GTCATCAAGATCTTGAGTTGGGGTTGACCCTTTTGTCACGTGGAACCGCGACCTCATCAG
AGCTCAATCTCATCGATTCTTTCAAAACCAGCTCATCATCGACTTCTCATCATCAGCACC 
AGCAAGAACAATTGGCAGATCCGAGAGTGTTCTCGTGTAATTATTGTCAAAGAAAGTTCT 
ATAGTTCACAAGCGCTAGGCGGTCACCAAAACGCTCATAAACGTGAGCGCACCTTAGCCA
AACGTGGACAGTATTACAAGATGACTCTCTCCTCCTTGCCTTCTTCAGCGTTTGCGTTTG 
GCCACGGTTCAGTCAGCAGATTCGCAAGCATGGCATCGTTACCATTACATGGCTCGGTGA 
ATAACAGGTCAACGTTAGGGATTCAAGCTCATTCAACGATCCATAAGCCCAGCTTCTTAG
GAAGACAAACGACGAGTTTAAGTCATGTTTTCAAACAGAGCATTCACCAGAAACCGACCA 
TAGGAAAGATGTTGCCGGAGAAATTTCACCTTGAAGTCGCCGGAAATAATAACAGTAACA 
TGGTTGCTGCTAAGTTGGAGAGAATTGGACATTTCAAGAGCAACCAAGAAGATCATAATC 
AGTTTAAGAAAATTGACTTGACTCTTAAGCTATGAGCTCTGCCATCTTCTTTTTAGTCTT 
CATTATAACTTTTTTTATTCTCATCTTTGTTTGATATAATGATTGACGGCAGGGTGTGTT 
AGAGTTTCACTAATGATCAAGTTGTACTTTTTATATATTTCATTGATACCTTGTTGATGT 
AATTCAATATTTTAGGTCTGTTTTT 

>G365 Amino Acid Sequence (domain in AA coordinates: 70-90)

MEPSIKGDQEMLKIKKQGHQDLELGLTLLSRGTATSSELNLIDSFKTSSSSTSHHQHQQE 

QLADPRVFSCNYCQRKFYSSQALGGHQNAHKRERTLAKRGQYYKMTLSSLPSSAFAFGHG 

SVSRFASMASLPLHGSVNNRSTLGIQAHSTIHKPSFLGRQTTSLSHVFKQSIHQKPTIGK 

MLPEKFHLEVAGNNNSNMVAAKLERIGHFKSNQEDHNQFKKIDLTLKL* 

>G367 (1..708) 

ATGGACGCTTCAATAGTTTCCTCATCCACTGCTTTTCCATATCAAGATTCTCTAAACCAG 
AGCATCGAAGACGAAGAAAGAGACGTTCATAATTCTAGTCACGAACTCAATCTCATCGAC 
TGCATAGACGACACAACGAGTATCGTTAACGAATCTACAACATCCACAGAACAAAAGCTT 
TTCTCATGCAACTATTGTCAAAGAACTTTCTATAGCTCACAAGCACTTGGTGGTCACCAA 
AACGCACACAAGAGAGAGAGAACGTTGGCGAAGAGAGGACAACGTATGGCAGCGTCAGCC 
TCAGCTTTTGGACATCCTTACGGTTTCTCTCCACTTCCTTTCCACGGACAGTACAACAAC 
CATAGGTCJTTTAGGGATCCAAGCGCATTCGATAAGCCACAAGCTAAGTTCTTATAACGGG 
TTTGGTGGTCACTATGGTCAGATCAACTGGTCAAGACTTCCATT^ 

ATAGGTAAATTTCCCTCAATGGATAATTTTCATCATCATCATCATCAGATGATGATGATG 

GCTCCTTCAGTAAATTCACGGTCCAATAACATCGATAGCCCAAGCAACACAGGACGGGTT 

CTAGAAGGGTCACCGACTCTTGAACAATGGCACGGAGACAAAGGATTGTTGTTAAGTACA 

AGTCATCATGAAGAGCAGCAGAAACTTGACTTGTCCCTCAAGCTTTGA 

>G367 Amino Acid Sequence (domain in AA coordinates: 63-84)

MDASIVSSSTAFPYQDSLNQSIEDEERDVHNSSHELNLIDCIDDTTSIVNESTTSTEQKL 

FSCNYCQRTFYSSQALGGHQNAHKRERTLAKRGQRMAASASAFGHPYGFSPLPFHGQYNN 

HRSLGIQAHSISHKLSSYNGFGGHYGQINWSRLPFDQQPAIGKFPSMDNFHHHHHQMMMM

APSVNSRSNNIDSPSNTGRVLEGSPTLEQWHGDKGLLLSTSHHEEQQKLDLSLKL*

>G373 (1..1854) 

ATGGCGATTGAAACTCAGCTTCCTTGCGACGGTGACGGTGTGTGTATGCGGTGTCAGGTG 
AATCCTCCGTCAGAAGAGACTCTCACTTGTGGCACGTGCGTCACTCCATGGCACGTGCCG 
TGTCTCCTCCCCGAATCACTCGCTTCTTCCACTGGAGAGTGGGAGTGTCCCGATTGCTCC 
GGCGTTGTCGTTCCCTCCGCCGCTCCGGGTACCGGAAACGCTCGACCTGAATCTTCCGGT 
TCAGTTCTCGTTGCTGCGATCCGTGCGATTCAGGCTGATGAGACTTTAACCGAAGCTGAG 
AAAGCCAAAAAAAGGCAGAAACTGATGAGTGGGGGTGGTGACGATGGTGTCGATGAAGAA

GAGAAGAAGAAGTTAGAAATCTTTTGTTCTATTTGCATTCAATTGCCAGAAAGACCTATC 

ACGACACCGTGTGGGCACAATTTCTGTTTGAAATGTTTCGAGAAATGGGCAGTAGGTCAA 

GGGAAGCTAACTTGTATGATATGCCGAAGCAAAATTCCGAGACATGTGGCAAAAAATCCT 

CGCATCAACTTAGCTCTAGTTTCTGCTATTCGTTTAGCAAATGTTACCAAATGTTCTGTT 

GAGGCAACTGCAGCCAAGGTTCATCATATTATCCGCAACCAAGACCGTCCTGAGAAAGCA

TTTACTACCGAGCGGGCAGTAAAAACTGGGAAAGCTAATGCTGCTAGCGGTAAGTTTTTT 

GTGACAATACCTCGTGATCATTTTGGTCCCATACCAGCTGAGAATGATGTCACTAGAAAG 

CAAGGTGTTTTGGTTGGAGAATCTTGGGAGGACAGGCAAGAGTGTAGGCAGTGGGGAGCT 

CATTTCCCGCATATTGCTGGCATTGCCGGGCAATCAGCGGTTGGAGCTCAGTCTGTGGCC 

CTCTCTGGAGGTTATGACGATGATGAGGATCATGGTGAATGGTTTCTCTACACAGGAAGT 

GGTGGAAGGGATCTCAGTGGAAACAAAAGAATTAACAAGAAACAGTCGTCTGACCAGGCG 

TTTAAAAACATGAATGAATCTCTAAGACTTAGTTGCAAAATGGGCTATCCTGTCCGAGTT 

GTCAGGTCTTGGAAGGAGAAGCGTTCTGCATATGCCCCTGCTGAAGGTGTGAGATATGAT 

GGGGTCTATCGAATTGAGAAGTGCTGGAGTAATGTTGGAGTACAGGGTTCTTTTAAGGTC 

TGTCGTTACCTGTTTGTTAGATGTGACAATGAGCCAGCTCCATGGACCAGTGATGAGCAT 

GGCGATCGTCCAAGACCGTTGCCTAATGTTCCGGAGCTTGAGACTGCTGCTGACCTGTTT 

GTGAGAAAGGAGAGTCCATCATGGGATTTCGATGAAGCTGAGGGTCGTTGGAAATGGATG 

AAGTCTCCTCCTGTTAGCAGAATGGCTTTGGATCCTGAGGAGAGGAAGAAGAATAAGAGA 

GCAAAAAATACTATGAAGGCCAGACTTCTGAAAGAATTTAGTTGCCAAATCTGTCGGGAA 

GTGCTGAGTCTTCCAGTGACGACGCCTTGTGCACACAACTTCTGCAAAGCATGCTTAGAA 

GCGAAGTTTGCTGGGATAACTCAACTGAGAGAGAGAAGCAATGGCGGACGTAAACTACGT 

GCAAAGAAGAACATCATGACCTGCCCTTGCTGCACGACGGATCTCTCCGAGTTTCTCCAA 

AACCCGCAGGTGAACAGAGAGATGATGGAGATAATAGAGAATTTTAAGAAGAGTGAGGAA 

GAGGCTGATGCATCCATTTCTGAAGAAGAAGAAGAAGAATCCGAACCTCCAACTAAGAAG 

ATTAAGATGGATAACAACTCTGTTGGTGGTAGTGGTACAAGTCTCTCAGCTTAA 

>G373 Amino Acid Sequence (domain in AA coordinates: 129-168) 

MAIETQLPCDGDGVCMRCQVNPPSEETLTCGTCVTPWHVPCLLPESLASSTGEWECPDCS
GVVVPSAAPGTGNARPESSGSVLVAAIRAIQADETLTEAEKAKKRQKLMSGGGDDGVDEE
EKKKLEIFCSICIQLPERPITTPCGHNFCLKCFEKWAVGQGKLTCMICRSKIPRHVAKNP
RINLALVSAIRLANVTKCSVEATAAKVHHIIRNQDRPEKAFTTERAVKTGKANAASGKFF
VTIPRDHFGPIPAENDVTRKQGVLVGESWEDRQECRQWGAHFPHIAGIAGQSAVGAQSVA

LSGGYDDDEDHGEWFLYTGSGGRDLSGNKRINKKQSSDQAFKNMNESLRLSCKMGYPVRV 

VRSWKEKRSAYAPAEGVRYDGVYRIEKCWSNVGVQGSFKVCRYLFVRCDNEPAPWTSDEH 

GDRPRPLPNVPELETAADLFVRKESPSWDFDEAEGRWKWMKSPPVSRMALDPEERKKNKR 

AKNTMKARLLKEFSCQICREVLSLPVTTPCAHNFCKACLEAKFAGITQLRERSNGGRKLR

AKKNIMTCPCCTTDLSEFLQNPQVNREMMEIIENFKKSEEEADASISEEEEEESEPPTKK 
IKMDNNSVGGSGTSLSA* 

>G396 (1..957) 

ATGGGGGAAAGAGATGATGGGTTGGGTTTGAGTCTAAGCTTGGGAAATAGTCAACAAAAA 

GAAC(^TCTCTGAGGTTGAATCTTATGCCGTTGAC7^CTTCTTCTTCTTCTTCTTCGTTT 

CAACACATGCACAATCAGAATAACAATAGCCATCCCCAGAAGATTCATAACATCTCTTGG 

ACTCATCTGTTTCAATCTTCTGGGATTAAACGTACAACTGCAGAGAGAAACTCCGACGCC 

GGGTCATTTCTAAGAGGTTTCAACGTGAACAGAGCTCAGTCTTCGGTGGCGGTAGTGGAC 

TTGGAAGAAGAAGCCGCCGTCGTCTCGTCTCCAAACAGCGCCGTTTCGAGTCTGAGTGGA 

AATAAAAGGGATCTTGCGGTGGCGAGAGGAGGAGATGAAAACGAGGCGGAGAGAGCTTCT 

TGCTCACGCGGAGGGGGAAGCGGTGGTAGCGACGATGAAGACGGCGGAAACGGCGACGGA 

TCAAGGAAGAAACTACGGTTATCGAAGGATCAAGCTCTTGTTCTCGAGGAGACTTTTAAA 

GAACATAGCACTCTTAATCCGAAGCAAAAGCTGGCTCTAGCAAAACAGTTGAATCTAAGG 

GCAAGACAAGTTGAAGTGTGGTTTCAGAACCGTAGGGCAAGGACGAAGCTGAAACAAACG 

GAGGTTGATTGTGAGTATTTAAAGAGATGTTGCGATAATCTGACCGAGGAGAATCGACGG 

CTGCAGAAAGAAGTGTCGGAGCTGAGGGCGTTGAAGTTGTCTCCACATCTCTACATGCAC 
ATGACTCCTCCTACTACTCT^^ 

GCCACTGTGACCGCTGCTCCTTCCACTACTACTACTCCTACGGTGGTGGGGCGGCCAAGT 
CCACAGCGATTAACTCCTTGGACTGCTATTTCTCTCCAGCAAAAATCAGGTCGCTAG 
>G396 Amino Acid Sequence (domain in AA coordinates: 159-220) 

MGERDDGLGLSLSLGNSQQKEPSLRLNLMPLTTSSSSSSFQHMHNQNNNSHPQKIHNISW
THLFQSSGIKRTTAERNSDAGSFLRGFNVNRAQSSVAVVDLEEEAAVVSSPNSAVSSLSG

NKRDLAVARGGDENEAERASCSRGGGSGGSDDEDGGNGDGSRKKLRLSKDQALVLEETFK

EHSTLNPKQKLALAKQLNLRARQVEVWFQNRRARTKLKQTEVDCEYLKRCCDNLTEENRR

LQKEVSELRALKLSPHLYMHMTPPTTLTMCPSCERVSSSAATVTAAPSTTTTPTVVGRPS

PQRLTPWTAISLQQKSGR*

>G431 (1..1149) 

ATGGAGAGTGGTTCCAACAGCACTTCTTGTCCAATGGCTTTTGCCGGGGATAATAGTGAT 
GGTCCGATGTGTCCTATGATGATGATGATGCCGCCCATCATGACATCACATCAACATCAT 
GGTCATGATCATCAACATCAACAACAAGAACATGATGGTTATGCATATCAGTCACACCAC 
CAACAAAGTAGTTCCCTTTTTCTTCAATCACTAGCTCCTCCCCAAGGAACTAAGAACAAA 
GTTGCTTCTTCTTCTTCTCCTTCCTCTTGTGCTCCTGCCTATTCTCTAATGGAGATCCAT 
CATAACGAAATCGTTGCAGGAGGAATCAACCCTTGCTCCTCTTTCTCTTCTTCAGCCTCT 
GTCAAGGCCAAGATCATGGCTCATCCTCACTACCACCGCCTCTTGGCCGCTTATGTCAAT 
TGTCAGAAGGTTGGAGCACCACCGGAGGTTGTGGCGAGGCTGGAGGAGGCATGCTCGTCT 
GCCGCAGCCGCAGCCGCATCTATGGGGCCAACAGGGTGTCTTGGTGAAGATCCAGGGCTT 
GATCAATTCATGGAAGCTTACTGTGAAATGCTCGTTAAGTATGAGCAAGAGCTCTCCAAA 
CCTTTCAAGGAAGCTATGGTCTTCCTTCAACGTGTCGAGTGTCAATTCAAATCCCTCTCT 
CTATCCTCACCTTCCTCTTTCTCCGGTTATGGAGAGACAGCAATTGATAGGAACAATAAT 
GGGTCATCCGAGGAAGAAGTCGATATGAACAATGAATTTGTAGATCCACAAGCTGAGGAT 
AGAGAGCTTAAAGGACAGCTCTTGCGCAAGTACAGTGGTTACTTAGGGAGCCTCAAGCAA 
GAGTTCATGAAGAAGAGGAAGAAAGGAAAGCTCCCTAAAGAAGCTCGTCAACAACTGCTT 
GATTGGTGGAGCCGTCACTACAAATGGCCTTACCCTTCGGAGCAACAAAAGCTCGCCCTT 
GCGGAATCAACGGGGCTGGACCAGAAACAGATAAACAATTGGTTCATAAACCAGAGGAAA 
CGGCATTGGAAGCCGTCGGAGGACATGCAGTTTGTAGTAATGGACGCAACACATCCTCAC 
CATTACTTCATGGATAATGTCTTGGACAATCCTTTCCCAATGGATCACATCTCCTCCACC 
ATGCTTTGA 

>G431 Amino Acid Sequence (domain in AA coordinates: 286-335) 

MESGSNSTSCPMAFAGDNSDGPMCPMMMMMPPIMTSHQHHGHDHQHQQQEHDGYAYQSHH 

QQSSSLFLQSLAPPQGTKNKVASSSSPSSCAPAYSLMEIHHNEIVAGGINPCSSFSSSAS 

VKAKIMAHPHYHRLLAAYVNCQKVGAPPEVVARLEEACSSAAAAAASMGPTGCLGEDPGL

DQFMEAYCEMLVKYEQELSKPFKEAMVFLQRVECQFKSLSLSSPSSFSGYGETAIDRNNN

GSSEEEVDMNNEFVDPQAEDRELKGQLLRKYSGYLGSLKQEFMKKRKKGKLPKEARQQLL

DWWSRHYKWPYPSEQQKLALAESTGLDQKQINNWFINQRKRHWKPSEDMQFVVMDATHPH 

HYFMDNVLDNPFPMDHISSTML* 

>G479 (1..1128) 

ATGGAGATGGGTTCCAACTCGGGTCCGGGTCATGGTCCGGGTCAGGCAGAGTCGGGTGGT 
TCCTCCACTGAGTCATCCTCTTTCAGTGGAGGGCTCATGTTTGGCCAGAAGATCTACTTC 
GAGGACGGTGGTGGTGGATCCGGGTCTTCTTCCTCAGGTGGTCGTTCAAACAGACGTGTC 
CGTGGAGGCGGGTCGGGTCAGTCGGGTCAGATACCAAGGTGCCAAGTGGAAGGTTGTGGG 
ATGGATCTAACCAATGCAAAAGGTTATTACTCGAGACACCGAGTTTGTGGAGTGCACTCT 
AAAACACCTAAAGTCACTGTGGCTGGTATCGAACAGAGGTTTTGTCAACAGTGCAGCAGG 
TTTCATCAGCTTCCGGAATTTGACCTAGAGAAAAGGAGTTGCCGCAGGAGACTCGCTGGT 
CATAATGAGCGACGAAGGAAGCCACAGCCTGCGTCTCTCTCTGTGTTAGCTTCTCGTTAC 
GGGAGGATCGCACCTTCGCTTTACGAAAATGGTGATGCTGGAATGAATGGAAGCTTTCTT 
GGGAACCAAGAGATAGGATGGCCAAGTTCAAGAACATTGGATACAAGAGTGATGAGGCGG 
CCAGTGTCGTCACCGTCATGGCAGATCAATCCAATGAATGTATTTAGTCAAGGTTCAGTT 
GGTGGAGGAGGGACAAGCTTCTCATCTCCAGAGATTATGGACACTAAACTAGAGAGCTAC 
AAGGGAATTGGCGACTGAAACTGTGCTCTCTCTCTTCTGT 

GACAACAACAACAACAACAACAACAACAGCAACAACAACAACAATACATGGCGAGCTTCT 
TCAGGTTTTGGCCCGATGACGGTTACAATGGCTCAACCACCACCTGCACCTAGCCAGCAT 
CAGTATCTGAACCCGCCTTGGGTATTCAAGGACAATGATAATGATATGTCTCCTGTTTTG 
AATTTAGGTCGATACACCGAGCCAGATAATTGTCAGATAAGTAGTGGCACGGCAATGGGT 
GAGTTCGAGTTATCTGATCACCATCATCAAAGTAGGAGACAGTACATGGAAGATGAGAAC 
ACAAGGGCTTATGACTCTTCTTCTCACCATACCAACTGGTCTCTCTGA 

>G479 Amino Acid Sequence (conserved domain in AA coordinates: 70-149)

MEMGSNSGPGHGPGQAESGGSSTESSSFSGGLMFGQKIYFEDGGGGSGSSSSGGRSNRRV 

RGGGSGQSGQIPRCQVEGCGMDLTNAKGYYSRHRVCGVHSKTPKVTVAGIEQRFCQQCSR
^^^^rriAC-TAAGAGACCAGGGTTTGATCAAGAAGCAAATCTCTGCCGGGACC



GCGTATGTTCAGCAGCTGGAAGATAGTCGATTAAAGCTGACTCAAGTTCAGCAGGAGCTG 
^CAATGGTGGGGCTTTGGCATTTCATG^^ 

TTATCCMTGGAACTCTO^^ 

atggccatggcaatgggcaagttaggcaccctcgaaggattcatacgccaS 

TTCAGGCrecAAAC^CTA(^^C^GATGCTTCGAGTATTAACyUVCACGTCAGTSGCTroT

>G578 Amino Acid Sequence (domain in AA coordinates: 36-96)

MHSLNETVIPDVDYMQSDRGHMHAAASDSSDRSKDKLDQKTLRRLAQNREAARKSRLRKK

AYVQQLEDSRLKLTQVEQELQRARQQGVFISSSGDQAHSTGGNGGALAFDAEHSRWLEEK 

NRQMNELRSALNAHAGDTELRIIVDGVMAHYEELFRIKSNASKNDVFHLLSGMWKTPAER 

CFLWLGGFPSSELLKLLANQLEPMTERQVMGINSLQQTSQQAEDALSQGMESLQQSLADT 

LSSGTLGSSSSDNVASYMGQMAMAMGKLGTLEGFIRQADNLRLQTLQQMLRVLTTRQSAR 

ALLAIHDYSSRLRALSSLWLARPRE* 

>G596 (168..1121)

TAATTTCTCTACTTCAGATTTTTTTCTCCTTAGATTAATTTAATTGAGTTATTGTACATC 
CCTCAAGCTAAGATTCTGGTTTTGTGAGTTGAGTGGATGAGAAGAGGAGAGATTAACTAA 
ATTAGGGTTTCAATTGTTTACTTTTTGTTTGCTTTTTATATCAAGTAATGGATCAGGTCT 
CTCGCTCTCTTCCTCCACCTTTTCTCTCAAGAGATCTCCATCTTCACCCACACCATCAAT 
TCCAGCATCAGCAGCAGCAGCAGCAACAGAATCACGGCCACGATATAGACCAGCACCGAA 
TCGGTGGGCTAAAACGTGACCGAGATGCTGATATCGATCCCAACGAGCACTCTTCAGCCG 
GAAAAGATCAAAGTACTCCTGGCTCCGGTGGAGAAAGCGGCGGCGGAGGAGGAGGAGATA 
ATCACATCACGAGAAGGCCACGTGGCAGACCAGCGGGATCTAAGAACAAACCAAAACCGC 
CAATCATCATCACTCGAGACAGCGCAAACGCTCTCAAATCTCATGTCATGGAAGTAGCAA 
ACGGATGTGACGTCATGGAAAGTGTCACCGTCTTCGCTCGCCGTCGCCAACGTGGCATCT 
GCGTTTTGAGCGGAAACGGCGCCGTTACCAACGTTACCATAAGACAACCAGCTTCAGTAC 
CTGGTGGTGGCTCATCTGTCGTTAACTTACACGGACGTTTCGAGATTCTTTCTCTCTCGG 
GATCATTCCTTCCTCCTCCGGCTCCACCAGCTGCGTCAGGTCTAACGATTTACTTAGCCG 
GTGGTCAGGGACAGGTTGTTGGAGGAAGCGTGGTTGGTCCACTCATGGCTTCAGGACCTG 
TAGTGATTATGGCAGCTTCGTTTGGAAACGCTGCGTATGAGAGACTGCCGTTGGAGGAAG 
ACGATCAAGAAGAGCAAACAGCTGGAGCGGTTGCTAATAATATCGATGGAAACGCAACAA 
TGGGTGGTGGAACGCAAACGCAAACTCAGACGCAGCAGCAACAGCAACAACAGTTGATGC
AAGATCCGACGTCGTTTATACAAGGGTTGCCTCCGAATCTTATGAATTCTGTTCAATTGC 
CAGCTGAAGCTTATTGGGGAACTCCGAGACCATCTTTCTAAATCGCGAAGATU^AAACAAG 
TTAGATACGTTCGTTGTTTTTAATTTATAATCTCTCTTCTGTCAAGTTTTAATTTTCTTT 
TTOTTCTTCTTTGTTTTCTAAAGATAATTGTAGTCTTTGACGAAGATTCGTGGTACGTAT 
GAATCGAAGAGAATCGTTTTGGTCATGGGATTGCTCGATCTATTAGGTTTGAGAGGGGGT 
TTGTGTTTTGCGTTGACTAGCAGATTATAAAATTGTTGATTTTCGAGTTTTTATTTTCAT 
GTGTTGGTGATAAA 

>G596 Amino Acid Sequence (domain in AA coordinates: 89-96) 

MDQVSRSLPPPFLSRDLHLHPHHQFQHQQQQQQQNHGHDIDQHRIGGLKRDRDADIDPNE

HSSAGKDQSTPGSGGESGGGGGGDNHITRRPRGRPAGSKNKPKPPIIITRDSANALKSHV

MEVANGCDVMESVTVFARRRQRGICVLSGNGAVTNVTIRQPASVPGGGSSVVNLHGRFEI

LSLSGSFLPPPAPPAASGLTIYLAGGQGQVVGGSVVGPLMASGPVVIMAASFGNAAYERL

PLEEDDQEEQTAGAVANNIDGNATMGGGTQTQTQTQQQQQQQLMQDPTSFIQGLPPNLMN

SVQLPAEAYWGTPRPSF* 

>G617 (59..1141)

CAGATCTGTTCTTTACACCAAATTGAGTACTGAAGATCTTGTTGAGTGAATTAAAGAGAT 
GAGATCAGGAGAATGTGATGAAGAGGAGATTCAAGCAAAGCAAGAAAGAGATCAAAATCA 
AAATCATCAAGTAAACTITAAACCACATGTTGCAACAACi^ 

TTC^GGCAATGGACTTCAGCTTTTAGGAATCCAAGAATCGTTCGAGTCTCAAGAACATT 
CGGTGGCAAAGACAGACACAGCAAAGTATGTACAGTCCGTGGTCTTCGAGACCGGAGGAT 
AAGGTTGTCCGTACCTACAGCTATTCAACTCTACGACCTTCAAGATCGATTAGGGCTGAG 
TCAGCCAAGCAAAGTCATTGATTGGTTACTCGAAGCAGCAAAAGATGACGTAGACAAGCT 
ACCTCCTCTACAAT-XCCCACATGGATTTAACCAGATGTATCCAAATCTCATCTTCGGAAA 
CTCCGGGTTTGGAGAATCTCCATCTTCAACTACATCAACAACGTTTCCAGGAACCAATCT 
CGGGTTCTTGGAAAATTGGGATCTTGGTGGTTCTTCAAGAACAAGAGCAAGATTAACCGA 
TACAACTACGACCCAAAGAGAAAGTTTTGATCTTGATAAAGGAAAATGGATCAAAAACGA 
CGAGAATAGTAATCAAGATCATCAAGGGTTTAACACCAATCATCAACAACAATTTCCTCT 
GACGAATCCGTACAACAACACTTCAGCTTATTACAACCTTGGACATCTTCAACAATCGTT 
AGACCAATCTGGTAATAACGTTACTGTCGCAATATCTAATGTTGCTGCTAATAATAACAA 
TAATCTCAATTTGCATCCTCCTTCCTCGTCTGCCGGAGATGGATCTCAGCTTTTTTTGGG 
TCCTACTCCTCCGGCAATGAGCTCTCTATTCCCGACATACCCTTCGTTTCTTGGAGCTTC 
TCATCATCATCATGTCGTCGATGGAGCCGGTCATCTTCAGCTCTTTAGCTCGAATTCAAA
TACCGCATCGCAGCAACACATGATGCCGGGTAATACGAGTTTGATTAGACC



:CATTTCATCA 



>G617 Amino Acid Sequence (domain in AA coordinates: 64-118)

••RNPRIV 
JLLEAAK 

^QSGNNVTVAISNVAANNNITOLNLHPPSSSAGDCSOtIpp^ 



>G620 (40..666)



GGCT 
CTTA 
CGAG 
GACC 
GGAC 
ACTF 



TGCGTAAAACCTTJ 
AATGTGTCTCCGAC 
AGCAACGTAAGACC 
ATAACTACGTGGAC 



OTGAOCMGACCAATACaTCKCMTCGCaAaCGTCSTMOAATCAraCGTAMACCTT. 

GCCAJ- 

aacg; 

^ ATGAC 

CCCCTCACCGTGTTCATTAACCGGTACCGTGAGATAGAGACCGATCGTGGTTCTGCAi 



TACATCAGCTTCGTGACCGGTGAAGCCAACGAGCGTTGCCAACGTGAGCAArr*Tnn^»n^ 
ATAACTGCT^AAGATATCCTTTGGGCTATGAGCAAGCTTGGCTTC^ 



™™ GGTGGTCGGT ^^ 

GGTGGTGGCTCTTCGTCTTCCATTMiCGGAATGCCGGCTTTTGACCATTATG 
ATTTTATTTTTATGTCTTATCAA' 



ATTTOATTTTTATGTCTTATCAATAACATTTCTATATAATGT^^ 

TGTTGTATGTCAATACTTTATGAGAAACTGATTTATATATGCAAA? ° 



>G620 Amino Acid Sequence (domain in AA coordinates: 20-118)

csDn 



TCCTCTTCTTCTTCTTCTTC1TCTTCATCTATGGACCCTTTAGCTTCCCAACATCAACAC 



TCCAAGTTCCGTTACCGTGGCGTTCGACAAAGAAGCTGGG^ > "" ,i " UU * 1 lJlJl:;UA< ~^^' 



3GCAAATGGGTCGCCGAGATC 



^CCTATACGGGTCACGTGCTCAGCTCAACTTJ 



'A 



s==iii= 

CAAC^CAAAATCAGATGGTCCAGATGGGACAATTCCAACACCAACAGTaTr^^^ 



^^^T^^^P^T^^^^^^^^^^^TCTAGCTGGTTCGGTGGGTTCGAGTCTA 

TCTATGGGTCTGGATCCGGGT 
GGAGGAGAAGAAGAATATAGT 
TTGGGGGAATTCTATTAATTT 
TAAGGTCGGTCAAGAGCATTG 



^^^^^^^rcCGGTATGrrCTAO^TCI^ATCCGGGT 

5CCTTTT( 

GTO GTGG ^ GATCAXATTATATA --— CATCCCTAAGGTCGGTCAAGAGCATTG 



GAGATTCATTGTTGAGAGGAATCAAAGAGATTGCATTCTATG 
" " n ■ rTACTACCTATAGAGATAAATAAGAG _ 

rCAAGGAATTCGTAAAAGAGATTACGGTTCCAATAAAGTATGTA 



ttgaaga™ att ^g T S?S^~ 

TATGTGGAAGAGAATCGGAGGAGATGGTGGAAAGriGTATGGGAATTTrATTGGTTCAAC
ACTTCCTTCACAGTGTGCCTACCTTAATATATAATTATTGATAGGATATGATAATTTCTG

>G625 Amino Acid Sequence (conserved domain in AA coordinates: 52-119)

MDPLASQHQHNHLEDNNQTLTHNNPQSDSTTDSSTSSAQRKRKGKGGPDNSKFRYRGVRQ 

RSWGKWVAEIREPRKRTRKWLGTFATAEDAARAYDRAAVYLYGSRAQLNLTPSSPSSVSS

SSSSVSAASSPSTSSSSTQTLRPLLPRPAAATVGGGANFGPYGIPFNNNIFLNGGTSMLC 

PSYGFFPQQQQQQNQMVQMGQFQHQQYQNLHSNTNNNKISDIELTDVPVTNSTSFHHEVA 

LGQEQGGSGCNNNSSMEDLNSLAGSVGSSLSITHPPPLVDPVCSMGLDPGYMVGDGSSTI 

WPFGGEEEYSHNWGSIWDFIDPILGEFY* 

>G658 (17..757)

CCACGCGTCCGCTCACATGAACAAAGGAGCTTGGACTAAAGAAGAAGATCAGCTTCTTGT 
TGATTACATCCGTAAACACGGTGAAGGTTGCTGGCGATCTCTCCCTCGCGCCGCTGGATT
ACAAAGATGTGGTAAGAGTTGTAGATTGAGATGGATGAATTATCTAAGACCAGATCTCAA 
AAGAGGCAATTTTACTGAAGAAGAAGATGAACTCATCATCAAGCTCCATAGCTTGCTCGG 
TAACAAATGGTCTTTAATAGCTGGGAGATTACCAGGAAGAACAGATAACGAGATCAAGAA 
CTATTGGAACACTCATATCAAGAGGAAGCTTCTCAGCCGTGGGATTGATCCAAACTCTCA 
CCGTCTGATCAACGAATCCGTCGTGTCTCCGTCGTCTCTTCAAAACGATGTCGTTGAGAC 
TATACATCTTGATTTCTCTGGACCGGTTAAACCGGAACCGGTGCGTGAAGAGATTGGTAT 
GGTTAATAATTGTGAGAGTAGTGGAACGACGTCGGAGAAGGATTATGGGAACGAGGAAGA 
TTGGGTGTTGAATTTGGAACTCTCTGTTGGACCGAGTTATCGGTACGAGTCGACTCGGAA
AGTGAGTGTTGTTGACTCGGCTGAGTCGACTCGACGGTGGGGTTCCGAGTTGTTTGGAGC 
TCATGAGAGTGATGCGGTGTGTTTGTGTTGTCGGATTGGGTTGTTTCGTAATGAGTCGTG
TCGGAATTGTCGGGTTTCTGATGTTAGAACTCATTAGAGAGTCAATCGAGAATTCTTTAG 

AACATCAAGTAAGAAACTAGCATAATTATTTGATGGCAAAGCCAAAAGATTGTGCTC 

>G658 Amino Acid Sequence (domain in AA coordinates: 2-105) 

MNKGAWTKEEDQLLVDYIRKHGEGCWRSLPRAAGLQRCGKSCRLRWMNYLRPDLKRGNFT

EEEDELIIKLHSLLGNKWSLIAGRLPGRTDNEIKNYWNTHIKRKLLSRGIDPNSHRLINE

SVVSPSSLQNDVVETIHLDFSGPVKPEPVREEIGMVNNCESSGTTSEKDYGNEEDWVLNL

ELSVGPSYRYESTRKVSVVDSAESTRRWGSELFGAHESDAVCLCCRIGLFRNESCRNCRV 

SDVRTH* 

>G716 (271..2079)

AAAAAAAAAGGGGAGAGATTTAGTTTTATCCNNCAGNGCCTGAANTACGTTCTGCAATCA

AWACGGACATAACCGNCCGTTGTGTCCTGTTTATAAAGTTTTGCTTTTTTTATTTTCTCC 

ANTGATGGGTCTTTTCTTTCTTCTCTCTCTNGTGTTTCTTTCATGGGGTTAAGACTAGTG 

TTTACCGCGTGAAGGTTTTTTTTTCTTTCTATTTTCTTTCATTTCCTCTCCTTCTACTTC 

TTCTTCTCCAGTTCTCATCTGGGTTCTTCAATGGCGAGTGTTGMGGTGATGATGATTTC 

GGAAGTTCTTCGTCAAGGTCTTATCAAGATCAACTATACACAGAGCTATGGAAAGTTTGT 

GCAGGTCCATTAGTGGAAGTTCCTCGTGCTCAAGAGAGAGTTTTCTACTTCCCTCAGGGT 

aCATGGAACAACTTGTGGCGTCAACTAATCAAGGAATCAATTCAGAAGAAATACCTGTT 

TTTGATCTTCCTCCAAAGATACTTTGTCGAGTTCTTGATGTCACTTTAAAGGCGGAGCAT 

GAAAC^GATGAGGTTTACGCTCAGATCACATTACAACCAGAGGAAGATCAAAGTGAACCA 

ACAAGTCTTGATCCACCTATTGTTGGACCAACTAAGCAAGAGTTTCATTCGTTTGTTAAG 

ATTTTAACGGCTTCAGATACAAGCACTCATGGTGGATTCTCTGTTCTTCGTAAACACGCC 

ACTGAATGCTTGCCTTCTTTGGATATGACACAAGCTACT 

AGAGATCTTCATGGCTTTGAATGGAGGTTTAAGCATATATTCAGAGGACAACCACGGAGG 

CATTTGCTTACTACGGGTTGGAGTACATTTGTATCCTCGAAAAGACTTGTAGCTGGAGAT 

GCTTTTGTGTTCTTGAGGGGTGAGAATGGGGATTTACGGGTTGGAGTGAGACGATTAGCT 

CGGCATCAAAGCACAATGCCTACTTCGGTTATTTCAAGTCAGAGCATGCATTTGGGAGTT 

CTTGCTAC^GCTTCTCATGCTGTGCGTACAACAACAATCTTTGTTGTCTTTTACAAGCCT 

AGGATAAGCCAATTCATAGTTGGGGTGAACAAGTATATGGAAGCTATAAAGCATGGATTT 

TCTCTCGGTACCCGATTCAGAATGAGGTTTGAAGGAGAAGAGTCTCCTGAGAGAATATTT 

ACTGGTACGATTGTGGGAAGTGGAGATCTATCTTCACAATGGCCAGCTTCTAAATGGAGG 

TCATTGCAGGTACAATGGGATGAGCCAACAACAGTTCAGAGACCAGATAAAGTCTCACCA 

TGGGAGATAGAGCCTTTCTTGGCAACTTCCCCAAxTTCAACTCCTGCTCAACAAC^ 

TCGAAATGCAAGCGGTCAAGACCCATCGAGCCATCAGTTAAAACACCAGCCCCACCTAGT 

TTCTTGTACAGCCTCCCTCAGAGCCAAGATTCCATTAATGCATCCCTTAAACTGTTTCAA 

GATCCATCACTTGAGAGAATTTCAGGTGGATACTCCTCAAACAACAGCTTCAAACCCGAG
ACTCCTCCTCCTCCAACGAATTGTAGCTATAGGTTGTTTGGATTTGATCTCACAAGCAAT

tctcctgctccaatccctcaagacaagcaaccgatggatacttgtggISgc^ 

CAAGAACCCATCAOTCCAACrc^ 

cgaactaaagtgcaaatgcaaggcattgcggttggtcgtgcggttgatttaacac^g 
aaatcttacgatgmctgattgatgagcttgaggagatgtttgagat^SSS 
cttgcccgagacaaatggatcgttgtcttcactgatgatgaaggS 

ggtgatgatccgtggaatgagttttgcaagatggcaaagaagatat^atSSJSS 
gatgaggttaagaaaatgacaacgaaactgaagatttcttc^^^ 

tatggtaatgaatcattcgaaaatcgtagtagggggtgagagttctagctgSSa^ 
gttaattcggcgacgtcgttttagtgcgtaagtgtctaaagactttttttt^ag^ctgtg 
tatataaagtcttgtcctctttttcatgtcaatttitcaagttggcgaSSScg 

>G716 Amino Acid Sequence (domain in AA coordinates: 24-355)

^svegdddfgssssrsyqdqlytelwkvcagplvevpraqervfyfp^meqlvaS 
o^^seeipvfdlppkilcrvldvtlka^etdevyaqitlqpeedqsSt^dpmvgp 
tcqefhsfvkiltasdtsthggfsvlrkhateclpsldmtqatptqelvSgfeto 

TPono^^ >RJaiL ' J, '* raWS TFVS S KRiVAGDAFVFLRGENGDLRVGVRRLARHQSMPTSV 

issqsmlgvlatashatotttifwfykprisqfivgvnkymeaikhgfsIgt™^ 

pistpaqqpqskckrsrpxepsvktpappsflyslpqsqdsiSSSpsleSgg 
yssnnsfkpetpppp^csyrlfgfdltsnspapipqdkqpmdtcSS™ 
TOqtsrsrtkvqmqgiavgravdltllksydelideleemfeiqXlla^S^I 
todegdmmiagddpwnefckmakkifiyssdevkkmttklkisssleneey^^es^e^rs 

>G725 (46..1122)

aaaccttcaagtatgaatggttcatatgagaacagagctatgtgcgttcaa^ 

GGCCTTGTCCTCACCACCGACCCTAAACCGCGTTTGCGITGGACCGTCGAACTC^^ 

cg^gtggacgccgtcgctcagctcggcggccccgac^gcgaccccaaaSgS? 

ATGAGAGTTATGGGTGTGAAGGGTCTTACTCTTTACCACCTAAAGAGCCATCTTCAG^AA 

aacatgaatgagatgcaaatggaagtgcagagaaggttccatgaacagctagI^ 
agacatctgc^ctgaggattgaagcacaaggaaagtacatgcaatotatcS^ 
gcttgccaaaccctagccggtgagaacatggcagccgccaccgS^ 
ggaggatacaagggtaatctgggaagttcgagtcittcagcagcggtggS^ 

CATCCTCTTAGTTTCCCGCCGT^^ 
rS^ C n A f ATCACAACTOCCATCATC ^ 

GCTGCAGACACCAACATTTACTTGGGGAAGAAGCGACCTAATCCTAATTTTGGTAAPP^t 
ATCGATGATGAGCATAGAArrCAGATACA(^ TC GCTACACATGTCTCCAC^™ 

GGG ^™ cTOGAAAGGceATcGrc ^^^ 

^GTGGATTAATACAAGGAAGAAACTCGrc^^ 

>G725 Amino Acid Sequence (domain in AA coordinates- iq mi 

F ;^^ AA ?^^^^^ PN ^ G ^ TOKGLL ^ SD Q D ®^SANQSIDDEHRIQIQMAT^ 
^ ^, GL SGDEGmGG ^ LERPS ^PLSP f ^GGLIQ< 3 ™sPFGr 

CT^CTCCTTCTCTGATCGTTCGITTTCTGGACGA^GAGATCOTAA^^ 
GGAAGAGGACCCGATTCGGGTACTGCTGCTGGTGGGTCAAACTCCGACCCGTTTCCTGCG
AATCTTCGAGTTCTTGTCGTTGATGATGATCCAACTTGTCTCATGATCTTAGAGAGGATG

CTTATGACTTGTCTCTACAGAGAGCAGAGAGCGCATTGTCTCTGCTTCGGAAGAACAAAG 

AATGGTTTTGATATTGTCATTAGTGATGTTCATATGCCTGACATGGATGGTTTCAAGCTC 

CTTGAACACGTTGGTTTAGAGATGGATTTACCTGTTATCAATCTGAATGTTTTGAAACCT 

TTGGTTATAGTGATGTCTGCGGATGATTCGAAGAGCGTTGTGTTGAAAGGAGTGACTCAC 

GGTGCAGTTGATTACCTCATCAAACCGGTACGTATTGAGGCTTTGAAGAATATATGGCAA 

CATGTGGTGCGGAAGAAGCGTAACGAGTGGAATGTTTCTGAACATTCTGGAGGAAGTATT 

GAAGATACTGGCGGTGACAGGGACAGGCAGCAGCAGCATAGGGAGGATGCTGATAACAAC 

TCGTCTTCAGTTAATGAAGGGAACGGGAGGAGCTCGAGGAAGCGGAAGGAAGAGGAAGTA 

GATGATCAAGGGGATGATAAGGAAGACTCATCGAGTTTAAAGAAACCACGCGTGGTTTGG 

TCTGTTGAATTGCATCAGCAGTTTGTTGCTGCTGTGAATCAGCTAGGCGTTGACAGTGAG 

TTAAAAACTTGCTTGCTTATGCATTTGTGTGTGTCGATTGGTAACATTGTGGAATTCCAG 

AAGTATCGGATATATCTGAGACGGCTTGGAGGAGTATCGCAACACCAAGGAAATATGAAC 

CATTCGTTTATGACTGGTCAAGATCAGAGTTTTGGACCTCTTTCTTCGTTGAATGGATTT 

GATCTTCAATCTTTAGCTGTTACTGGTCAGCTCCCTCCTCAGAGCCTTGCACAGCTTCAA 

GCAGCTGGTCTTGGCCGGCCTACACTCGCTAAACCAGGGATGTCGGTTTCTCCCCTTGTA 

GATCAGAGAAGCATCTTCAACTTTGAAAACCCAAAAATAAGATTTGGAGACGGACATGGT 

CAGACGATGAACAATGGAAATTTGCTTCATGGTGTCCCAACGGGTAGTCACATGCGTCTG 

CGTCCTGGACAGAATGTTCAGAGCAGCGGAATGATGTTGCCAGTAGCAGACCAGCTACCT

CGAGGAGGACCATCGATGCTACCATCCCTCGGGCAACAGCCGATATTGTCAAGCAGCGTT 

TCAAGAAGAAGCGATCTCACTGGTGCGCTGGCGGTTAGAAACAGTATCCCCGAGACCAAC 

AGCAGAGTGTTACCAACTACTCACTCGGTCTTCAATAACTTCCCCGCGGATCTACCTCGC 

AGCAGCTTCCCGTTGGCAAGTGCCCCAGGGATTTCAGTTCCAGTATCAGTTTCTTACCAA 

GAAGAGGTCAACAGCTCGGATGCAAAAGGAGGTTCATCAGCTGCTACTGCTGGATTTGGT

AACCCAAGCTACGACATATTTAACGATTTTCCGCAGCACCAACAGCACAACAAGAACATC 

AGCAATAAACTAAACGATTGGGATCTGCGGAATATGGGATTGGTCTTCAGTTCCAATCAG 

GACGCAGCAACTGCAACCGCAACCGCAGCATTTTCCACTTCGGAAGCATACTCTTCGTCT 

TCTACGCAGAGAAAAAGACGGGAAACGGACGCAACAGTTGTGGGTGAGCATGGGCAGAAC 

CTGCAGTCACCGAGCCGGAATCTGTATCATCTGAACCACGTTTTTATGGACGGTGGTTCA 

GTCAGAGTGAAGTCAGAAAGAGTGGCGGAGACAGTGACTTGTCCTCCAGCAAATACATTG 

TTTCACGAGCAGTATAATCAAGAAGATCTGATGAGCGCATTTCTCAAACAGGTTTGATTA 

TTACTCGAATACAGTGCACTCTAAAAC 

>G727 Amino Acid Sequence (domain in AA coordinates: 226-269) 

^T^PGHGRGPDSGTAAGGSNSDPFPANLRVLVVDDDPTCLMILERMLMTCLYREQRAHCL

CFGRTKNGFDIVISDVHMPDMDGFKLLEHVGLEMDLPVINLNVLKPLVIVMSADDSKSVV

LKGVTHGAVDYLIKPVRIEALKNIWQHVVRKKRNEWNVSEHSGGSIEDTGGDRDRQQQHR

EDADNNSSSVNEGNGRSSRKRKEEEVDDQGDDKEDSSSLKKPRVVWSVELHQQFVAAVNQ

LGVDSELKTCLLMHLCVSIGNIVEFQKYRIYLRRLGGVSQHQGNMNHSFMTGQDQSFGPL

SSLNGFDLQSLAVTGQLPPQSLAQLQAAGLGRPTLAKPGMSVSPLVDQRSIFNFENPKIR

FGDGHGQTMNNGNLLHGVPTGSHMRLRPGQNVQSSGMMLPVADQLPRGGPSMLPSLGQQP

ILSSSVSRRSDLTGALAVRNSIPETNSRVLPTTHSVFNNFPADLPRSSFPLASAPGISVP

VSVSYQEEVNSSDAKGGSSAATAGFGNPSYDIFNDFPQHQQHNKNISNKLNDWDLRKMGL

VFSSNQDAATATATAAFSTSEAYSSSSTQRKRRETDATVVGEHGQNLQSPSRNLYHLNHV

FMDGGSVRVKSERVAETVTCPPANTLFHEQYNQEDLMSAFLKQV* 

>G740 (25..924)

CTTCTTCAACTTTTTTTTTTAACGATGGCTTCAGAGGATCAATCGGCGGCGAGATCTACC 
GGGAAGGTGAACTGGTTCAACGCTTCTAAAGGCTATGGTTTCATTACTCCTGACGATGGC 
AGCGTAGAGCTTITeGTTCATCAATCTTCAATTGTCTCCGAAGGTTACCGGAGTTTAACC 
GTCGGCGATGCGGTTGAGTTCGCTATTACTCAGGGAAGCGACGGTAAGACTAAAGCCGTC 
AATGTTACTGCTCCTGGTGGTGGTTCTCTCAAGAAGGAGAATAACTCTCGTGGTAACGGT 
GCTAGGCGCGGCGGCGGTGGAAGCGGTTGCTACAATTGCGGTGAGTTAGGTCATATCTCT 
AT^AGATTGTGGTATTGGTGGCGGCGGCGGAGGTGGTGAACGTAGATCTAGAGGAGGAGAA 
GGTTGTTACAATTGTGGTGATACTGGTCACTTCGCTAGGGATTGTACTTCAGCTGGAAAC
GGTGACCAACGTGGAGCCACCAAAGGTGGAAACGATGGTTGCTACACTTGCGGTGATGTT 
GGTCATOTGGCTAGGGATTGTACTCAGAAATCAGTTGGAAACGGAGACCAACGTGGAGCG 
GTCAAAGGTGGAAACGATGGTTGCTACACTTGTGGTGATGTTGGTCACTTTGCTAGGGAT 
TGTACTCAGAAGGTTGCTGCCGGAAACGTCAGAAGCGGTGGTGGTGGTAGTGGAACTTGT
TA1TCATGCGGTGGAGTTGGTCACATTGCAAGAGATTGTGCGACTAAGAGACAGCCTT,
CGTGGGTGTTACCAGTGTGGTGGTTCTGGTCACTTGGCTCGTGATTGTGACCArapar! 
AGCGGTGGAGGAGGTAATGATAATGCGTGCTACAAGTGTGG^ 
AGGGAATGTTCTTCTGTAGCTTAATCGA1TTCCTAATC 



AGCGGTGGAGGAGGTAATGATAATGCGTGCTi 



CT 
.GGA 



GAAATOGAATCGAGTTATATAGTTTGGTATATATTACTCTTCGTTTTCATTTOTC^TTT 
^G^GrTGATGGGAATGAAATTGCCTGGTCCTTTTGGTGTGTTTSSGC™ 

awatacagagtgatcccttttttgttataactattacaagtt^tagcSa™^^ 

^cid Sequence (domain in AA cc 

3 I VS EG YR S LTVGDAVEFA 



itqgsdgktkavnvtapgggslkkennsrgSSgc™^ 



?™ GDQRGAVKGG ^ 



ggaacaaggagatcatcagcagcataagaaagaagaai ~ 
aaacttcaccgggaaagcaatcgctgacgttgatcttaa^^ 
gtacccgacgggagtgaggacgaatagggcgacgaatacaggataSggSS^^ 

===== 

TCTACAACCACACCCGAGCCTCAATAT 



-ACATGCC 



>G858 (99..869)



:gtcaagtcactttctctaagagacgaaacggtt 

^ GACACAAATCCTGCTCAACCAGA ^ GAGA TCCAGC3ATAC^^ 

TGGAAGAAAACCAAATCTTGCGCAAACAGGTTGAGATGTTGGGGAGAGGTTra^ 

AAGTGTTGAATGAAAGGCCTCAAGATTCTAGCCC^GAAGCCGATrc
CAGAAGAGGATGAGAATGACAACGAGGAGCACCATTCCGACACTTCCTTGCAGTTGGGGT

TGTCGTCGACGGGGTATTGCACAAAGAGAAAGAAGCCGAAGATCGAACTGGTCTGCGATA 

ACTCTGGGAGTCAAGTGGCTTCTGATTGATGGAATCGATTATTTTTCTAATTCTGGTTGT

TTAGGGGTCTCTATGTGTCTTCTTGTTTCTGGCTGTTCTTTTGCTTTATTTCATCTCAAG 

TAGAGTTTTCTTAATGTTTAGGTGGAACATTTTTCCATAATCAAGAAGGGATTTGATCAA 

TCAATAACATTAGATTTTCTTAGTTAAAGACTTAAAGTTGCCCACACACCACACCATATG 

TGATTATGATGAATTTACATTTTATAAAAAAAAAAAAAAAAAAAAAAAAA 

>G858 Amino Acid Sequence (domain in AA coordinates: 2-57) 

MGRGRIEIKKIENINSRQVTFSKRRNGLIKKAKELSILCDAEVALIIFSSTGKIYDFSSV 

CMEQILSRYGYTTASTEHKQQREHQLLICASHGNEAVLRNDDSMKGELERLQLAIERLKG 

KELEGMSFPDLISLENQLNESLHSVKDQKTQILLNQIERSRIQEKKALEENQILRKQVEM 

LGRGSGPKVLNERPQDSSPEADPESSSSEEDENDNEEHHSDTSLQLGLSSTGYCTKRKKP 

KIELVCDNSGSQVASD* 

>G865 (282..920)

ATCCCCACTTGTTGTTCATCACCAAGCCAAGCTCCATGTCCTAGTCACTCCACAGATTCC 

CTATCATCATCAATTCGTTTCAAACTTAGTTCCTTTCAAAGTCTTGTACATATATACACA 

CACACCTATTATTCTCTTGGTGTGTTTGTGTGTTACATATACGTGTGAGTACATACTTTG 

TTGTAAAAGTGGATCGGAGGTATGGAAAGGGACCGGTTCCACCGGAAACATCGGCGGCGG 

CGGATGATAATTCGTCTTGGAACGAGACTGATGTCACCGCCATGGTCTCCGCTCTCAGCC 

GTGTCATAGAGAATCCGACAGACCCGCCGGTCAAACAAGAGCTTGATAAATCGGATCAAC 

ATCAACCAGACCAAGATCAACCAAGAAGAAGACACTATAGAGGCGTAAGGCAGAGACCAT 

GGGGTAAATGGGCGGCAGAAATCCGCGATCCAAAGAAAGCAGCCCGTGTCTGGCTCGGGA 

CTTTCGAGACGGCAGAGGAAGCTGCTTTAGCCTATGACCGAGCTGCCCTCAAATTCAAAG

GCACCAAGGCTAAACTGAACTTCCCTGAACGGGTCCAAGGCCCTACTACCACCACAACCA 

TTTCTCATGCACCAAGAGGAGTTAGTGAATCCATGAACTCACCTCCTCCTCGACCTGGTC 

CACCTTCAACTACTACTACTTCGTGGCCAATGACTTATAACCAGGACATACTTCAATACG 

CTCAGTTGCTTACGAGTAACAATGAGGTTGATTTATCATACTACACGTCGACTCTCTTCA 

GTCAACCTTTTTCAACGCCTTCTTCATCTTCTTCTTCCTCCCAACAGACGCAGCAACAGC 

AGCTACAACAACAACAACAGCAGCGTGAAGAAGAAGAGAAGAATTATGGTTACAATTATT 

ATAACTACCCAAGAGAATAATCTAATTATTATTGTTGGTCGAATCAGTTTTATAAATAGC 

TATCATAGTTTCATTTTTGGTTTCCGTAACCTTTGTTGCATGGAAAATATGAATGT^ACGA 

GGGACATGTGTAACAATTTGTTTGTGTTTCGTAAATGTTAGTTGTATTTGGATTTGCTGA 

AGTTTGATTTTCTGAGCATAAATCATTTGACGGTCAAAAAAAAAAA 

>G865 Amino Acid Sequence (domain in AA coordinates: 36-103) 

MVSALSRVIENPTDPPVKQELDKSDQHQPDQDQPRRRHYRGVRQRPWGKWAAEIRDPKKA 

ARVWLGTFETAEEAALAYDRAALKFKGTKAKLNFPERVQGPTTTTTISHAPRGVSESMNS 

PPPRPGPPSTTTTSWPMTYNQDILQYAQLLTSNNEVDLSYYTSTLFSQPFSTPSSSSSSS 

QQTQQQQLQQQQQQREEEEKNYGYNYYNYPRE*

>G872 (59..646)

CCGGAAACAGAATCCAATTCAACCAAACCGAATCGAACCGAACCGGAGTTTTTATCCAAT 

GGTGAAGCAAGCGATGAAGGAAGAGGAGAAGAAGAGAAACACGGCGATGCAGTCAAAGTA 

CAAAGGAGTGAGGAAGAGGAAATGGGGAAAATGGGTATCGGAGATCAGACTTCCACACAG 

CAGAGAACGAATTTGGTTAGGCTCTTACGACACTCCCGAGAAGGCGGCGCGTGCTTTCGA 

CGCCGCTC^TTTTGTCTCCGCGGCGGCGATGCTAATTTCAATTTCCCTAATAATCCACC 

GTCGATCTCCGTAGAAAAGTCGTTGACGCCTCCGGAGATTCAGGAAGCTGCTGCTAGATT 

CGCTAACACATTCCAAGACATTGTCAAGGGAGAAGAAGAATCGGGTTTAGTACCCGGATC 

CGAGATCCGACCAGAGTCTCCTTCTACATCTGCATCTGTTGCTACATCGACGGTGGATTA 

TGATTTTTCGTTTTTGGATTTGCTTCCGATGAATTTCGGGTTTGATTCCTTCTCCGACGA 

CTTCTCTGGCTTCTCCGGTGGTGATCGATTTACAGAGATTTTACCCATCGAAGATTACGG 

AGGAGAGAGTTTATTAGATGAATCTTTGATTCTTTGGGATTTTTGAATTCCCAAACATAA 

TATTTTTTTAGAGCGAACTGTGAGATTTTCCTTGGAGTCATGGAGAAATCTGGAGATTTT 

TTGTAACACGGAGCTCCAATGACCCGGGAATTTCTTTCGTTTCGGATCCGAATTTGATGT 

GGATCATATTCACACCTATATTTTTTCATTTTTTTGTTGTAAAGAAAAATCGGATAAGAT 

TCTAGTAATAAATGTTAAAAGTCCATTTCATTAAAAAAAAAAAAAAAAAAA 

>G872 Amino Acid Sequence (domain in AA coordinates: 18-85) 

MVKQAMKEEEKKRNTAMQSKYKGVRKRKWGKWVSEIRLPHSRERIWLGSYDTPEKAARAF

DAAQFCLRGGDANFNFPNNPPSISVEKSLTPPEIQEAAARFANTFQDIVKGEEESGLVPG

>G904 (1..1005)

ctcgatagtctcaaaccaagcgtactagtcatcattctcattctcctcatgactcttctc 

ccccccctttcatcttcctcttccgtcgcaaccgtaacttccgattcccaacaafhr-^^ 

93acatcgagtctctcccgaaacagaacggtccLcgtgcttgattcgc?tccpattttn 
aaattctcctccgtcactcgccoatctaactc.^oL^. 9 !;:!?!_. CC9attttC 



LPLSSSSSVATVTSDSRRFSGHRVSPETERSSVLDSLPIFKFSSVTRRSSSMS^nrffJ 
>G910 (1..1071) 

AGAGCAGTGGTTTATTGTATAGC^ATACAGCAAATCTTrGTTTAACA^ 

aagttgtccaaagaaac^gctgggaggaagmgi^cgtgagacacgtcatccgatSa
CAGAGGAGTTCGTCAGAGGAATTCTGGTAAATGGGTTTGTGAAGTTAGAGAGCCTAATAA
GAAATCTAGGATTTGGTTAGGTACTTTTCCGACGGTTGAAATGGCTGCTCGTGCTCATGA 
TGTTGCTGCTTTAGCTCTTCGTGGTCGCTCTGCTTGTCTCAATTTCGCTGATTCTGCTTG 
GCGGCTTCGTATTCCTGAGACTACTTGTCCTAAGGAGATTCAGAAAGCTGCGTCTGAAGC 
TGCAATGGCGTTTCAGAATGAGACTACGACGGAGGGATCTAAAACTGCGGCGGAGGCAGA 
GGAGGCGGCAGGGGAGGGGGTGAGGGAGGGGGAGAGGAGGGCGGAGGAGCAGAATGGTGG 
TGTGTTTTATATGGATGATGAGGCGCTTTTGGGGATGCCCAACTTTTTTGAGAATATGGC 
GGAGGGGATGCTTTTGCCGCCGCCGGAAGTTGGCTGGAATCATAACGACTTTGACGGAGT 
GGGTGACGTGTCACTCTGGAGTTTTGACGAGTAATTTTTTGGCTCTTTTTCTGGATAATA 
AGTT 

>G912 Amino Acid Sequence (domain in AA coordinates: 51-118)

MNPFYSTFPDSFLSISDHRSPVSDSSECSPKLASSCPKKRAGRKKFRETRHPIYRGVRQR

NSGKWVCEVREPNKKSRIWLGTFPTVEMAARAHDVAALALRGRSACLNFADSAWRLRIPE

TTCPKEIQKAASEAAMAFQNETTTEGSKTAAEAEEAAGEGVREGERRAEEQNGGVFYMDD 

EALLGMPNFFENMAEGMLLPPPEVGWNHNDFDGVGDVSLWSFDE* 

>G920 (114..1154)

AAAAAATCTATTTTCTTCTCTTTCCACTATATTACAACATTTCTTCATTCTCAAATCATC 
ATACTAAAAACCTAAAAAAAGTTACATATTCATTGTATCTTTGTGAGAAAAAAATGGATT 
CGAATAGTAACAACACGAAATCCATAAAGAGAAAAGTTGTCGACCAACTTGTCGAAGGCT 
ATGAATTCGCTACTCAGCTTCAGCTTCTCCTTTCTCATCAACACTCTAACCAGTACCACA 
TCGATGAGACCCGTCTTGTTTCCGGGTCGGGTTCAGTTTCCGGTGGTCCAGATCCCGTTG 
ATGAGCTCATGTCTAAGATCTTGGGATCTTTCCATAAAACTATATCGGTTCTTGATTCTT
TTGATCCCGTCGCCGTCTCTGTCCCCATCGCCGTCGAGGGTTCATGGAATGCTTCATGTG 
GGGATGATTCGGCGACTCCGGTGAGTTGCAACGGTGGAGATTCCGGTGAGAGTAAGAAGA 
AGAGATTAGGGGTTGGTAAGGGTAAAAGAGGATGCTACACTAGAAAGACGAGATCACATA 
CAAGGATCGTGGAAGCTAAAAGTTCTGAAGACAGATATGCTTGGAGGAAATATGGACAAA 
AGGAGATTCTTAATACCACATTCCCAAGAAGTTACTTTAGATGCACACACAAGCCAACGC 
AAGGATGCAAAGCAACAAAGCAAGTTCAGAAACAGGATCAAGATTCTGAGATGTTCCAAA 
TCACATACATTGGCTACCACACATGCACTGCCAATGACCAAACGCACGCGAAGACCGAGC 
CTTTTGATCAAGAAATCATTATGGATTCGGAAAAGACATTGGCTGCTAGCACTGCTCAGA
ACCATGTCAATGCTATGGTGCAAGAGCAAGAGAACAACACCAGCAGTGTGACAGCAATAG 
ACGCAGGCATGGTTAAGGAGGAACAAAATAACAATGGTGATCAGAGTAAAGATTATTATG 
AGGGCTCTTCGACAGGTGAGGACTTGTCATTGGTTTGGCAAGAGACGATGATGTTTGATG 
ATCATCAAAATCACTACTATTGTGGTGAAACCAGTACTACTTCTCATCAATTTGGTTTCA 
TCGACAACGATGATCAGTTTTCCTCCTTCTTCGACTCATATTGTGCTGATTATGAAAGAA 
CAAGTGCTATGTGAACATCCAAATCTGGAATGATGAATCAGCACTAGGTCTTCTCTTTGA 
GTATGTCTAGTTTAATGTAATATTTTTGTTGTATGTTTGATAAAAACACCATATATACTT 
CTCTTTTTACACCAAAAAAAAAAAAAAAAAAAAAAA 

>G920 Amino Acid Sequence (domain in AA coordinates: 152-211) 

MDSNSNNTKSIKRKVVDQLVEGYEFATQLQLLLSHQHSNQYHIDETRLVSGSGSVSGGPD 

PVDELMSKILGSFHKTISVLDSFDPVAVSVPIAVEGSWNASCGDDSATPVSCNGGDSGES 

KKKRLGVGKGKRGCYTRKTRSHTRIVEAKSSEDRYAWRKYGQKEILNTTFPRSYFRCTHK

PTQGCKATKQVQKQDQDSEMFQITYIGYHTCTANDQTHAKTEPFDQEIIMDSEKTLAAST

AQNHVNAMVQEQENNTSSVTAIDAGMVKEEQNNNGDQSKDYYEGSSTGEDLSLVWQETMM

FDDHQNHYYCGETSTTSHQFGFIDNDDQFSSFFDSYCADYERTSAM* 

>G939 (9..1565)

CAGATTCTATGGATATGTATAACAACAATATAGGGATGTTCCGGAGTTTAGTTTGTAGCT 
CGGCGCCTCCATTTACAGAGGGACATATGTGTTCTGATTCGCATACGGCTTTGTGCGATG 
ATCTGAGTAGTGATGAGGAAATGGAAATAGAGGAGCTTGAGAAGAAGATCTGGAGAGACA 
AGCAGCGTTTAAAGCGGCTCAAGGAAATGGCGAAGAACGGTCTAGGAACAAGATTGTTGT 
TGAAGCAGCAACATGATGATTTTCCAGAGCACTCTAGTAAGAGAACCATGTACAAGGCAC 
AAGATGGGATCTTGAAGTACATGTCGAAGACAATGGAGCGATATAAAGCTCAAGGTTTTG 
TTTATGGGATTGTGTTAGAGAATGGGAAAACGGTAGCGGGATCTTCTGATAATCTCCGTG 
AATGGTGGAAAGACAAAGTGAGGTTTGATAGGAACGGCCCAGCTGCrATAATC^^GCACC 
AAAGGGATATCAATCTTTCTGATGGAAGTGATTCAGGGTCTGAGGTTGGGGATTCTACCG 
CACAGAAGTTGCTTGAGCTTCAAGATACTACTCTTGGAGCTCTGTTATCGGCTCTGTTTC 
CTCACTGCAACCCTCCTCAGAGGCGGTTTCCGTTGGAGAAAGGCGTGACACCGCCATGGT
GGCCAACGGGGAAAGAAGATTGGTGGGATCAACTGTCTTTACCCGTTGATTTTCGAGCTr

^cgccaccttac^gaagcctcatgatctcaagaagctgtggSJS?^ 

^ggtgtaatcagacatatggcttctgacattagcaacatacccaatctcgtSSSg? 

ctagaagtttgcaggagaaaatgacgtcaagagaaggcgctttatggctcgctgScS 

accgagaaaaggctattgttgatcaaatagccatgtcta^gaaaaS 

actttcttgttcctgcaaccggtggagacccagatgti^titcc^^aSS 

acaactac^ctgtgtttacaagagaaagtttgaagaagattttgggatgccaatcca^ 
caacactcctaacatgtgagaacagtctctgtccttatagccaaccacatatgggatttc 

accaaccaactaaaccctatggtatgacgggtttaatggttccitgtccggaSataIcg 

gaccaaaagctccacaaagaggcaacgatgacttggttgaggatttgaatccwctcS 

cgacgctgaatcagaatcttggtttagtcttacctactgacttcIatS^ 

^gtaggaacagagaacaatctgcataatcaagggcaagagt^^^ 

agtaaagaaagcttcagagttttctttttatgttttctagtctttatagctttc 

gcttattctctcattaaacacagtttttgatctctccatttcatagccca^ 

gagaagattaggtttcataataagttaataaccaaattcaaT 

>G939 Amino Acid Sequence (domain in AA coordinates: 97-10S)

lkrlkemaknglgtrlllkqqhddfpehssi^tmykaqdgilkymsktmerSgftog 

LLELQDTTLGALLSALFPHCNPPQRRFPLEKGVTPPWWPTGKEDWWDQLSLPTOFRGWP 

S™™™ KIGTOIG ™^ 

>G963 (1..897) 

ATCAGTTTGCCTCCAGGATTCAGGTTTCATCCCACTGATGAAGAACTGGTGGCTTACTAT 
OTGATAGGAAGGTCAACGGCCAAGCCATTGAGC^^ 

tataaatgcgagccatgggacttgcctgaaaagtcatttwgccggga^gSSgSI 
tggtacttttacagcacaagggataagaagtatccaaatggctSS^ 

acccgagcgggttactggaaggccacggggaaagatcgtacagtagaatcSJgSgI?? 

aagatgggaatgaagaagacactggtttattatagaggaagggS?S 

actaattgggtcatgcatgaatatcgtctcacgcac^ 

tcgtatgcattgtgccgagtgtttaagaagaacataca^ 

^^TGGAGAAAATGTGATGGTAATTATATTG^ 

gcggagacatcttcatcagagc™^^ 

AACTATTCTCATCAGCTTCCATATCATCCTCCTCTrCAACTCCAAGATTTCCCTC^^ 

™ G f CGAAGCAGA ^ 

====== six™ 

======== 

ss"™ ssr iTOiMDFc<Ms> °' GTi,DBipiissATpMsL * 

ATCTTTGGGACAAAAGCTCTTGGAATTCGATTCAGAACAAGAAAGGCAAACAAGTTTATC
TGGGAGCATATGACAGTGAAGAAGCAGCAGCACATACGTACGATCTGGCTGCTCTCAAGT
ACTGGGGACCCGACACCATCTTGAATTTTCCGGCAGAGACGTACACAAAGGAATTGGAAG 
AAATGCAGAGAGTGACAAAGGAAGAATATTTGGCTTCTCTCCGCCGCCAGAGCAGTGGTT 
TCTCCAGAGGCGTCTCTAAATATCGCGGCGTCGCTAGGCATCACCACAACGGAAGATGGG 
AGGCTCGGATCGGAAGAGTGTTTGGGAACAAGTACTTGTACCTCGGCACCTATAATACGC 
AGGAGGAAGCTGCTGCAGCATATGACATGGCTGCGATTGAGTATCGAGGCGCAAACGCGG 
TTACTAATTTCGACATTAGTAATTACATTGACCGGTTAAAGAAGAAAGGTGTTTTCCCGT
TCCCTGTGAACCAAGCTAACCATCAAGAGGGTATTCTTGTTGAAGCCAAACAAGAAGTTG
AAACGAGAGAAGCGAAGGAAGAGCCTAGAGAAGAAGTGAAACAACAGTACGTGGAAGAAC 
CACCGCAAGAAGAAGAAGAGAAGGAAGAAGAGAAAGCAGAGCAACAAGAAGCAGAGATTG 
TAGGATATTCAGAAGAAGCAGCAGTGGTCAATTGCTGCATAGACTCTTCAACCATAATGG
AAATGGATCGTTGTGGGGACAACAATGAGCTGGCTTGGAACTTCTGTATGATGGATACAG 
GGTTTTCTCCGTTTTTGACTGATCAGAATCTCGCGAATGAGAATCCCATAGAGTATCCGG 
AGCTATTCAATGAGTTAGCATTTGAGGACAACATCGACTTCATGTTCGATGATGGGAAGC 
ACGAGTGCTTGAACTTGGAAAATCTGGATTGTTGCGTGGTGGGAAGAGAGAGCCCACCCT
CTTCTTCTTCACCATTGTCTTGCTTATCTACTGACTCTGCTTCATCAACAACAACAACAA 
CAACCTCGGTTTCTTGTAACTATTTGGTCTGAGAGAGAGAGCTTTGCCTTCTAGTTTGAA 
TTTCTATTTCTTCCGCTTCTTCTTCTTTTTTTTCTTTTGTTGGGTTCTGCTTAGGGTTTG 
TATTTCAGTTTCAGGGCTTGTTCGTTGGTTCTGAATAATCAATGTCTTTGCCCCTTTTNN 
AANGNTNCAAGNTNAAANAAAAAAAAAAAA 

>G979 Amino Acid Sequence (domain in AA coordinates: 63-139,165-233) 

MKKRLTTSTCSSSPSSSVSSSTTTSSPIQSEAPRPKRAKRAKKSSPSGDKSHNPTSPAST 

RRSSIYRGVTRHRWTGRFEAHLWDKSSWNSIQNKKGKQVYLGAYDSEEAAAHTYDLAALK 

YWGPDTILNFPAETYTKELEEMQRVTKEEYLASLRRQSSGFSRGVSKYRGVARHHHNGRW

EARIGRVFGNKYLYLGTYNTQEEAAAAYDMAAIEYRGANAVTNFDISNYIDRLKKKGVFP

FPVNQANHQEGILVEAKQEVETREAKEEPREEVKQQYVEEPPQEEEEKEEEKAEQQEAEI

VGYSEEAAVVNCCIDSSTIMEMDRCGDNNELAWNFCMMDTGFSPFLTDQNLANENPIEYP

ELFNELAFEDNIDFMFDDGKHECLNLENLDCCVVGRESPPSSSSPLSCLSTDSASSTTTT

TTSVSCNYLV* 

>G987 (1..4011) 

ATGGGTTCTTACTCAGCTGGCTTCCCTGGATCCTTGGACTGGTTTGATTTTCCCGGTTTA 
GGAAACGGATCCTATCTAAATGATCAACCTTTGTTAGATATTGGATCTGTTCCTCCTCCT 
CTAGACCCATATCCTCAACAGAATCTTGCTTCTGCGGATGCTGATTTCTCTGATTCTGTT 
TTGAAGTACATAAGCCAAGTTCTTATGGAAGAGGACATGGAAGATAAGCCTTGTATGTTT 
CATGATGCTTTATCTCTTCAAGCAGCTGAGAAGTCTCTCTATGAAGCTCTCGGCGAGAAG 
TACCCGGTTGATGATTCTGATCAGCCTCTGACTACTACTACTAGCCTTGCTCAATTGGTT 
AGTAGTCCTGGTGGTTCTTCTTATGCTTCAAGCACCACAACCACTTCCTCTGATTCACAA 
TGGAGTTTTGATTGTTTGGAGAATAATAGGCCTTCTTCTTGGTTGCAGACACCGATCCCG 
AGTAACTTCATTTTTCAGTCTACATCTACTAGAGCCAGTAGCGGTAACGCGGTTTTCGGG 
TCAAGTTTTAGCGGTGATTTGGTTTCTAATATGTTTAATGATACTGACTTGGCGTTACAA 
TTCAAGATU^GGGATGGAGGAAGCTAGTAAATTCCTTCCTAAGAGCTCTCAGTTGGTTATA 
GATAACTCTGTTCCTAACAGATTAACCGGAAAGAAGAGCCATTGGCGCGAAGAAGAACAT 
TTGACTGAAGAAAGAAGTAAGAAAC^ATCTGCTATTTATGTTGATGAAACTGATGAGCTT 
ACTGATATGTTTGACAATATTCTGATATTTGGCGAGGCTAAGGAACAACCTGTATGCATT 
CTTAACGAGAGTTTCCCTAAGGAACCTGCGAAAGCTTCAACGTTTAGTAAGAGTCCTAAA 
GGCGAAAAACCGGAAGCTAGTGGTAACAGTTATACAAAAGAGACACCTGATTTGAGGACA
ATGCTGGTTTCTTGTGCTCAAGCTGTTTCGATTAACGATCGTAGAACTGCTGACGAGCTG 
TTAAGTCGGATAAGGCAACATTCTTCATCTTACGGCGATGGAACAGAGAGATTGGCTCAT 
TATTTTGCTAACAGTCTTGAAGCACGTTTGGCTGGGATAGGTACACAGGTTTATACTGCC 
TTGTCTTCC^GAAAACATCTACTTCTGACATGT0X3AAAGCTTAT(^GAC^TATATATCA 
GTCTGTCCGTTCAAGAAAATCGCAATCATATTCGCCAACCATAGTATTATGCGGTTGGCT 
TCAAGTGCTAATGCCAAAACCATCCACATCATAGATTTTGGAATATCTGATGGTTTCCAG 
TGGCCTTCTCTGATTCATCGACTTGCTTGGAGACGTGGTTCATCTTGTAAGCTTCGGATA 
ACCGGTATAGAGTTGCCTCAACGTGGTTTTAGACCAGCCGAGGGAGTTATTGAGACTGGT 
CGTCGCTTGGCTAAGTATTGTCAGAAGTTCAATATTCCGTTTGAGTACAATGCGATTGCG 
CAGAAATGGGAATCAATCAAGTTGGAGGACTTGAAGCTAAAAGAAGGCGAGTTTGTTGCG 
GTAAACTCTTTATTTCGGTTTAGGAATCTTCTAGATGAGACGGTGGCAGTGCATAGCCCG






AGAGATACGGTTTTGAAGCTGATAAGGAAGATAAAGCCAGACGTGTTCATCCCCGGGATC 
CTCAGCGGATCCTACAACGCGCCTTTCTTTGTCACGAGGTTTAGAGAAGTTCTGTTTCAT 



ATGTTTGAGAAAGAGTTCTATGGGCGGGAGATCATGAACGTGGTGGCGTGTGAGGGGACG 

GAGAGAGTGGAGAGGCCAGAGAGTTATAAGCAGTGGCAGGCGAGGGCGATGAGAGCCGGG 

TTTAGACAGATTCCGCTGGAGAAGGAACTAGTTCAGAAACTGAAGTTGATGGTGGAAAGT 

GGATACAAACCCAAAGAGTTTGATGTTGATCAAGATTGTCACTGGTTGCTTCAGGGCTGG 

AAAGGTAGAATTGTATACGGTTCATCTATTTGGGTTCCTTTCTTTTTCTATGTGGGCAGA 

GCAACTAGGGTTTTGATCATGGATCCAAACTTCTCTGAATCTCTAAACGGCTTTGAGTAT 

TTTGATGGTAACCCTAATTTGCTTACTGATCCAATGGAAGATCAGTATCCACCACCATCT 

GATACTCTGTTGAAATACGTGAGTGAGATTCTTATGGAAGAGAGTAATGGAGATTATAAG 

CAATCTATGTTCTATGATTCATTGGCTTTACGAAAAACTGAAGAAATGTTGCAGCAAGTC 

ATTACTGATTCTCAAAATCAGTCCTTTAGTCCTGCTGATTCATTGATTACTAATTCTTGG

GATGCAAGCGGAAGCATCGATGAATCGGCTTATTCGGCTGATCCGCAACCTGTGAATGAA 

ATTATGGTTAAGAGTATGTTTAGTGATGCAGAATCAGCTTTACAGTTTAAGAAAGGGGTT 

GAAGAAGCTAGTAAATTCCTTCCCAATAGTGATCAATGGGTTATCAATCTGGATATCGAG 

AGATCCGAAAGGCGCGATTCGGTTAAAGAAGAGATGGGATTGGATCAGTTGAGAGTTAAG 

AAGAATCATGAAAGGGATTTTGAGGAAGTTAGGAGTAGTAAGCAATTTGCTAGTAATGTA 

GAAGATAGTAAGGTTACAGATATGTTTGATAAGGTTTTGCTTCTTGACGGTGAATGCGAT 

CCGCAAACATTGTTAGACAGCGAGATTCAAGCGATTCGGAGTAGTAAGAACATAGGAGAG 

AAAGGGAAGAAGAAGAAGAAGAAGAAGAGTCAAGTGGTTGATTTTCGTACACTTCTCACT 

CATTGTGCACAAGCCATTTCCACAGGAGATAAAACCACGGCTCTTGAGTTTCTGTTACAG 

ATAAGGCAACAGTCTTCGCCTCTCGGTGACGCGGGGCAAAGACTAGCTCATTGTTTCGCT 

AACGCGCTTGAAGCTCGTCTACAGGGAAGTACCGGTCCTATGATCCAGACTTATTACAAT 

GCTTTAACCTCGTCGTTGAAGGATACTGCTGCGGATACAATTAGAGCGTATCGAGTTTAT 

CTTTCTTCGTCTCCGTTTGTTACCTTGATGTATTTCTTCTCCATCTGGATGATTCTTGAT 

GTGGCTAAAGATGCTCCTGTTCTTCATATAGTTGATTTTGGGATTCTATACGGGTTTCAA 

TGGCCGATGTTTATTCAGTCTATATCAGATCGAAAAGATGTACCGCGGAAGCTGCGGATT 

ACTGGTATCGAGCTTCCTCAGTGCGGGTTTCGGCCCGCGGAGCGAATAGAGGAGACAGGA 

CGGAGATTGGCTGAGTATTGTAAACGGTTTAATGTTCCGTTTGAGTACAAAGCCATTGCG 

TCTCAGAACTGGGAAACAATCCGGATAGAAGATCTCGATATACGACCAAACGAAGTCTTA 

GCGGTTAATGCTGGACTTAGACTCAAGAACCTTCAAGATGAAACAGGAAGCGAAGAGAAT 

TGCCCGAGAGATGCTGTCTTGAAGCTAATAAGAAACATGAACCCGGACGTTTTCATCCAC 

GCGATTGTCAACGGTTCATTCAACGCACCCTTCTTTATCTCGCGGTTTAAAGAAGCGGTT 

TACCATTACTCCGCTCTCTTCGACATGTTTGATTCGACGTTGCCTCGGGATAACAAAGAG 

AGGATTAGGTTCGAGAGGGAGTTTTACGGGAGAGAGGCTATGAACGTGATAGCGTGCGAG 

GAAGCTGATCGAGTGGAGAGGCCTGAGACTTACAGGCAATGGCAGGTTAGAATGGTTAGA

GCCGGGTTTAAGCAGAAAACGATTAAGCCTGAGCTGGTAGAGTTGTTTAGAGGAAAGCTG 

AAGAAATGGCGTTACCATAAAGACTTTGTGGTTGATGAAAATAGTAAATGGTTGTTACAA 

GGCTGGAAAGGTCGAACTCTCTATGCTTCTTCTTGTTGGGTTCCTGCCTAG 

>G987 Amino Acid Sequence (domain in AA coordinates: 428-432, 704-708)

MGSYSAGFPGSLDWFDFPGLGNGSYLNDQPLLDIGSVPPPLDPYPQQNLASADADFSDSV

LKYISQVLMEEDMEDKPCMFHDALSLQAAEKSLYEALGEKYPVDDSDQPLTTTTSLAQLV

SSPGGSSYASSTTTTSSDSQWSFDCLENNRPSSWLQTPIPSNFIFQSTSTRASSGNAVFG 

SSFSGDLVSNMFNDTDLALQFKKGMEEASKFLPKSSQLVIDNSVPNRLTGKKSHWREEEH

LTEERSKKQSAIYVDETDELTDMFDNILIFGEAKEQPVCILNESFPKEPAKASTFSKSPK 

GEKPEASGNSYTKETPDLRTMLVSCAQAVSINDRRTADELLSRIRQHSSSYGDGTERLAH 

YFANSLEARLAGIGTQVYTALSSKKTSTSDMLKAYQTYISVCPFKKIAIIFANHSIMRLA

SSANAKTIHIIDFGISDGFQWPSLIHRLAWRRGSSCKLRITGIELPQRGFRPAEGVIETG 

RRLAKYCQKFNIPFEYNAIAQKWESIKLEDLKLKEGEFVAVNSLFRFRNLLDETVAVHSP

RDTVLKLIRKIKPDVFIPGILSGSYNAPFFVTRFREVLFHYSSLFDMCDTNLTREDPMRV 
MFEKEFYGREIMNVVACEGTERVERPESYKQWQARAMRAGFRQIPLEKELVQKLKLMVES
GYKPKEFDVDQDCHWLLQGWKGRIVYGSSIWVPFFFYVGRATRVLIMDPNFSESLNGFEY
FDGNPNLLTDPMEDQYPPPSDTLLKYVSEILMEESNGDYKQSMFYDSLALRKTEEMLQQV
ITDSQNQSFSPADSLITNSWDASGSIDESAYSADPQPVNEIMVKSMFSDAESALQFKKGV
EEASKFLPNSDQWVINLDIERSERRDSVKEEMGLDQLRVKKNHERDFEEVRSSKQFASNV 
EDSKVTDMFDKVLLLDGECTPQTLLDSEIQAIRSSKNIGEKGKKKKKKKSQVVDFRTLLT 
HCAQAISTGDKTTALEFLLQIRQQSSPLGDAGQRLAHCFANALEARLQGSTGPMIQTYYN
ALTSSLKDTAADTIRAYRVYLSSSPFVTLMYFFSIWMILDVAKDAPVLHIVDFGILYGFQ 
WPMFIQSISDRKDVPRKLRITGIELPQCGFRPAERIEETGRRLAEYCKRFNVPFEYKAIA 
SQNWETIRIEDLDIRPNEVLAVNAGLRLKNLQDETGSEENCPRDAVLKLIRNMNPDVFIH
AIVNGSFNAPFFISRFKEAVYHYSALFDMFDSTLPRDNKERIRFEREFYGREAMNVIACE 
EADRVERPETYRQWQVRMVRAGFKQKTIKPELVELFRGKLKKWRYHKDFVVDENSKWLLQ

GWKGRTLYASSCWVPA*
>G993 (6..1091)

CAAATATGGAATACAGCTGTGTAGACGACAGTAGTACAACGTCAGAATCTCTCTCCATCT 

CTACTACTCCAAAGCCGACAACGACGACGGAGAAGAAACTCTCTTCTCCGCCGGCGACGT 

CGATGCGTCTCTACAGAATGGGAAGCGGCGGAAGCAGCGTCGTTTTGGATTCAGAGAACG 

GCGTCGAGACCGAGTCACGTAAGCTTCCTTCGTCGAAATATAAAGGCGTTGTGCCTCAGC 

CTAACGGAAGATGGGGAGCTCAGATTTACGAGAAGCATCAGCGAGTTTGGCTCGGTACTT 

TCAACGAGGAAGAAGAAGCTGCGTCTTCTTACGACATCGCCGTGAGGAGATTCCGCGGCC 

GCGACGCCGTCACTAACTTCAAATCTCAAGTTGATGGAAACGACGCCGAATCGGCTTTTC 

TTGACGCTCATTCTAAAGCTGAGATCGTGGATATGTTGAGGAAACACACTTACGCCGATG 

AGTTTGAGCAGAGTAGACGGAAGTTTGTTAACGGCGACGGAAAACGCTCTGGGTTGGAGA 

CGGCGACGTACGGAAACGACGCTGTTTTGAGAGCGCGTGAGGTTTTGTTCGAGAAGACTG 

TTACGCCGAGCGACGTCGGGAAGCTGAACCGTTTAGTGATACCGAAACAACACGCGGAGA 

AGCATTTTCCGTTACCGGCGATGACGACGGCGATGGGGATGAATCCGTCTCCGACGAAAG 

GCGTTTTGATTAACTTGGAAGATAGAACAGGGAAAGTGTGGCGGTTCCGTTACAGTTACT 

GGAACAGCAGTCAAAGTTACGTGTTGACCAAGGGCTGGAGCCGGTTCGTTAAAGAGAAGA 

ATCTTCGAGCCGGTGATGTGGTTTGTTTCGAGAGATCAACCGGACCAGACCGGCAATTGT 

ATATCCACTGGAAAGTCCGGTCTAGTCCGGTTCAGACTGTGGTTAGGCTATTCGGAGTCA 

ACATTTTCAATGTGAGTAACGAGAAACCAAACGACGTCGCAGTAGAGTGTGTTGGCAAGA 

AGAGATCTCGGGAAGATGATTTGTTTTCGTTAGGGTGTTCCAAGAAGCAGGCGATTATCA 

ACATCTTGTGACAAATTCTTTTTTTTTGGTTTTTTTCTTCAATTTGTTTCTCCTTTTTCA 

ATATTTTGTATTGAAATGACAAGTTGTAAATTAGGACAAGACAAGAAAAAATGACAACTA 

GACAAAATAGTTTTTGTTTAAAAAAAAAAAAAAAAAAAA 

>G993 Amino Acid Sequence (domain in AA coordinates: 69-134) 

MEYSCVDDSSTTSESLSISTTPKPTTTTEKKLSSPPATSMRLYRMGSGGSSVVLDSENGV 

ETESRKLPSSKYKGVVPQPNGRWGAQIYEKHQRVWLGTFNEEEEAASSYDIAVRRFRGRD

AVTNFKSQVDGNDAESAFLDAHSKAEIVDMLRKHTYADEFEQSRRKFVNGDGKRSGLETA

TYGNDAVLRAREVLFEKTVTPSDVGKLNRLVIPKQHAEKHFPLPAMTTAMGMNPSPTKGV 

LINLEDRTGKVWRFRYSYWNSSQSYVLTKGWSRFVKEKNLRAGDVVCFERSTGPDRQLYI

HWKVRSSPVQTVVRLFGVNIFNVSNEKPNDVAVECVGKKRSREDDLFSLGCSKKQAIINI

L*

>G681 (1..804) 

ATGGGGAGGACGACATGGTTCGACGTCGACGGGATGAAGAAAGGAGAGTGGACGGCAGAG 
GAAGACCAGAAGCTCGGCGCTTACATCAACGAGCATGGCGTTTGTGATTGGCGTTCCCTC 
CCCAAAAGAGCTGGTTTGCAGAGATGTGGAAAGAGCTGCAGATTAAGGTGGCTTAACTAT 
CTAAAGCCTGGGATTAGAAGAGGCAAATTCACTCCTCAAGAAGAAGAAGAAATCATCCAA 
CTTCATGCTGTTCTCGGAAACAGGTGGGCAGCCATGGCGAAGAAGATGCAGAATCGAACA 
GACAATGATATCAAGAACCATTGGAACTCTTGTCTCAAGAAAAGACTTTCGAGAAAGGGA 
ATCGACCCTATGACCCACGAGCCCATCATCAAACACCTCACCGTCAATACCACTAACGCA 
GATTGTGGTAACTCTTCCACCACGACGTCCCCGTCGACGACGGAAAGCTCTCCTTCCTCC 
GGCTCGTCTCGTCTTCTTAACAAACTCGCCGCAGGTATCTCATCTAGACAACATAGTCTC 
GATAGGATCAAGTACATCTTGTCGAATTCAATAATCGAAAGCAGTGATCAAGCAAAAGAG
GAAGAAGAAAAAGAAGAAGAAGAAGAAGAAAGAGATTCAATGATGGGTCAGAAGATTGAC 
GGTAGTGAAGGAGAAGATATTCAGATTTGGGGCGAGGAGGAAGTTAGGCGTTTAATGGAG 
ATTGATGCAATGGATATGTACGAGATGACTTCGTACGACGCTGTCATGTACGAGAGTAGT 
CACATACTTGATCATCTCTTTTGACTTAATATAGTGTGACTGTGTGAGTGCATGCATGTT 

>G681 Amino Acid Sequence (domain in AA coordinates: 14-120)
MGRTTWFDVDGMKKGEWTAEEDQKLGAYINEHGVCDWRSLPKRAGLQRCGKSCRLRWLNY
LKPGIRRGKFTPQEEEEIIQLHAVLGNRWAAMAKKMQNRTDNDIKNHWNSCLKKRLSRKG
IDPMTHEPIIKHLTVNTTNADCGNSSTTTSPSTTESSPSSGSSRLLNKLAAGISSRQHSL
DRIKYILSNSIIESSDQAKEEEEKEEEEEERDSMMGQKIDGSEGEDIQIWGEEEVRRLME
IDAMDMYEMTSYDAVMYESSHILDHLF*
>G1482 (1..996) 

ATGAAGATCAGGTGCGACGTCTGCGATAAAGAAGAAGCGTCGGTGTTTTGCACGGCCGAC 
GAAGCATCTCTCTGCGGCGGCTGCGACCACCAAGTCCACCACGCTAACAAACTCGCCTCT 
AAACATCTCCGTTTCTCTCTCCTTTATCCTTCTTCTTCCAACACCTCCTCTCCTCTCTGC 
GACATCTGTCAGGATAAAAAAGCTCTGTTGTTCTGTCAACAAGATAGAGCTATTTTATGC 
AAAGATTGCGATTCATCGATCCACGCTGCGAACGAACACACAAAGAAACACGATAGGTTT 
CTTCTTACAGGGGTTAAGCTCTCTGCAACATCGTCTGTTTACAAACCTACTTCGAAATCT 
TCTTCTTCTTCTTCAAGCAACCAAGATTTCTCTGTCCCTGGATCATCAATCTCTAATCCT 
CCTCCTCTCAAGAAACCTCTCTCAGCTCCTCCTCAGAGCAACAAGATCCAACCCTTTTCG 
AAGATCAACGGCGGTGATGCGTCGGTGAATCAGTGGGGATCCACAAGCACGATTTCTGAG 
TATTTGATGGATACGTTACCTGGTTGGCACGTTGAGGATTTCCTCGATTCCTCTCTTCCT 
ACTTATGGTTTCTCTAAGAGTGGTGATGATGATGGAGTGTTACCATATATGGAACCAGAA 
GATGACAACAACACTAAGAGAAACAACAACAACAACAACAACAACAACAACAATACAGTG 
TCACTTCCATCTAAGAATTTAGGGATTTGGGTCCCTCAGATTCCACAAACTCTTCCTTCT 
TCATACCCAAATCAATACTTTTCTCAAGACAACAACATACAGTTTGGGATGTACAACAAA 
GAAACATCACCAGAAGTAGTGTCTTTTGCTCCAATACAAAACATGAAACAACAAGGACAG 
AACAACAAGAGATGGTATGATGATGGTGGCTTCACTGTCCCACAGATCACTCCTCCTCCT 
CTTTCTTCTAATAAAAAGTTTAGATCTTTCTGGTAA 

>G1482 Amino Acid Sequence (domain in AA coordinates: 5-63)

MKIRCDVCDKEEASVFCTADEASLCGGCDHQVHHANKLASKHLRFSLLYPSSSNTSSPLC 

DICQDKKALLFCQQDRAILCKDCDSSIHAANEHTKKHDRFLLTGVKLSATSSVYKPTSKS 

SSSSSSNQDFSVPGSSISNPPPLKKPLSAPPQSNKIQPFSKINGGDASVNQWGSTSTISE 

YLMDTLPGWHVEDFLDSSLPTYGFSKSGDDDGVLPYMEPEDDNNTKRNNNNNNNNNNNTV

SLPSKNLGIWVPQIPQTLPSSYPNQYFSQDNNIQFGMYNKETSPEVVSFAPIQNMKQQGQ
NNKRWYDDGGFTVPQITPPPLSSNKKFRSFW*
>G225 (157..441)

CTCTCTCTCTCACTCTTTTCTTTTCCGAGAACCCAACAAAAAAAAAGCTACTATTAATCC 

TTCCCCTCGTGAGGAAATCATTTCTTCTTGTTTCTCGAGATTTATTCTCTTTCTCTCTCT 

CTTTCTCTGTGTGTTTCGTGTCTTCAGATTAGTTCGATGTTTCGTTCAGACAAGGCGGAA

AAAATGGATAAACGACGACGGAGACAGAGCAAAGCCAAGGCTTCTTGTTCCGAAGAGGTG

AGTAGTATCGAATGGGAAGCTGTGAAGATGTCAGAAGAAGAAGAAGATCTCATTTCTCGG 

ATGTATAAACTCGTTGGCGACAGGTGGGAGTTGATCGCCGGAAGGATCCCGGGACGGACG 

CCGGAGGAGATAGAGAGATATTGGCTTATGAAACACGGCGTCGTTTTTGCCAACAGACGA 

AGAGACTTTTTTAGGAAATGATTTTTTTTGTTTGGATTAAAAGAAAATTTTCCTCTCCTT 

AATTCACAAGACAAGAAAAAAAGGAAATGTACCTGTCCTTGAATTACTATTTTGGAATGT 

ATAATTATCTATATATATAAGAAGAAAAAATTGCTTAGGAATTT 

>G225 Amino Acid Sequence (domain in AA coordinates: 39-76)

MFRSDKAEKMDKRRRRQSKAKASCSEEVSSIEWEAVKMSEEEEDLISRMYKLVGDRWELI

AGRIPGRTPEEIERYWLMKHGVVFANRRRDFFRK*

>G226 (10..348)

CCAGTAGTTATGGATAATACCAACCGTCTTCGTCTTCGTCGCGGTCCCAGTCTTAGGCAA 

ACTAAGTTCACTCGATCCCGATATGACTCTGAAGAAGTGAGTAGCATCGAATGGGAGTTT 

ATCAGTATGACCGAACAAGAAGAAGATCTCATCTCTCGAATGTACAGACTTGTCGGTAAT

AGGTGGGATTTAATAGCAGGAAGAGTCGTAGGAAGAAAGGCAAATGAGATTGAGAGATAC 

TGGATTATGAGAAACTCTGACTATTTTTCTCACAAACGACGACGTCTTAATAATTCTCC^ 

TTTTTTTCTACTTCTCCTCTTAATCTCCAAGAAAATCTAAAATTGTAAAGAAATCAAAA^ 

AAAAGCTTTCAATCMAAAAGTAGAACAAATCTTGAATGTCTTCTCA 

>G226 Amino Acid Sequence (domain in AA coordinates: 28-78) 

MDNTNRLRLRRGPSLRQTKFTRSRYDSEEVSSIEWEFISMTEQEEDLISRMYRLVGNRWD 

LIAGRVVGRKANEIERYWIMRNSDYFSHKRRRLNNSPFFSTSPLNLQENLKL*
>G9 (81..1139)

GTGTTTCTTCTTTCTGCTAAAAGGTOATAA 

AAGAAACTGAAACAAAGAAAATGGATTCTAGTTGCATAGACGAGATAAGTTCCTCCACTT 
CAGAATCTTTCTCCGCCACCACCGCCAAGAAGCTCTCTCCTCCTCCCGCGGCGGCGTTAC 
GCCTCTACCGGATGGGAAGCGGCGGGAGCAGCGTCGTGTTGGATCCCC^GAACGGC^^ 
AGACGGAGTCACGAAAGCTACCATCTTCAAAATACAAAGGTGTTGTTCCTCAGCCT 
GAAGATGGGGAGCTCAGATCTACGAGAAGCACCAACGAGTATGGCTCGGGACTTTCAACG

AGCAAGAAGAAGCTGCTCGTTCCTACGACATCGCAGCTTGTAGATTCCGTGGCCGCGACG 

CCGTCGTCAACTTCAAGAACGTTCTGGAAGACGGCGATTTAGCTTTTCTTGAAGCTCACT 

CAAAGGCCGAGATCGTCGACATGTTGAGAAAACACACTTACGCCGACGAGCTTGAACAGA 

ACAATAAACGGCAGTTGTTTCTCTCCGTCGACGCTAACGGAAAACGTAACGGATCGAGTA 

CTACTCAAAACGACAAAGTTTTAAAGACGTGTGAAGTTCTTTTCGAGAAGGCTGTTACAC 

CTAGCGACGTTGGGAAGCTAAACCGTCTCGTGATACCTAAACAACACGCCGAGAAACACT 

TTCCGTTACCGTCACCGTCACCGGCAGTGACTAAAGGAGTTTTGATCAACTTCGAAGACG 

TTAACGGTAAAGTGTGGAGGTTCCGTTACTCATACTGGAACAGTAGTCAAAGTTACGTGT 

TGACCAAGGGATGGAGTCGATTCGTCAAGGAGAAGAATCTTCGAGCCGGTGATGTTGTTA 

CTTTCGAGAGATCGACCGGACTAGAGCGGCAGTTATATATTGATTGGAAAGTTCGGTCTG 

GTCCGAGAGAAAACCCGGTTCAGGTGGTGGTTCGGCTTTTCGGAGTTGATATCTTTAATG 

TGACCACCGTGAAGCCAAACGACGTCGTGGCCGTTTGCGGTGGAAAGAGATCTCGAGATG 

TTGATGATATGTTTGCGTTACGGTGTTCCAAGAAGCAGGCGATAATCAATGCTTTGTGAC 

ATATTTCCTTTTCCGATTTTATGCTTTCGTTTTTTAATTTTTTTTTTTGTCAAGTTGTGT 

AGGTTGTGATTCATGCTAGGTTGTATTTAGGAAAAGAGATAAGACC 

>G9 Amino Acid Sequence (domain in AA coordinates: 62-127) 

MDSSCIDEISSSTSESFSATTAKKLSPPPAAALRLYRMGSGGSSVVLDPENGLETESRKL

PSSKYKGVVPQPNGRWGAQIYEKHQRVWLGTFNEQEEAARSYDIAACRFRGRDAVVNFKN

VLEDGDLAFLEAHSKAEIVDMLRKHTYADELEQNNKRQLFLSVDANGKRNGSSTTQNDKV

LKTCEVLFEKAVTPSDVGKLNRLVIPKQHAEKHFPLPSPSPAVTKGVLINFEDVNGKVWR
FRYSYWNSSQSYVLTKGWSRFVKEKNLRAGDVVTFERSTGLERQLYIDWKVRSGPRENPV
QVVVRLFGVDIFNVTTVKPNDVVAVCGGKRSRDVDDMFALRCSKKQAIINAL*
>G1040 (51..863)

CTTTGATCTCCACTATTTAAGTAGACAAGAATCATAAAGAAAATAGTGAGATGATGATGT 

TAGAGTCAAGAAACAGTATGAGAGCTTCAAACTCAGTCCCAGATCTGTCTCTTCAGATCA 

GTCTTCCTAACTATCACGCCGGAAAACCTCTTCACGGCGGTGACCGGAGCTCCACAAGCA 

GTGATTCTGGAAGCAGCCTCAGTGACCTGAGCCATGAGAACAACTTCTTCAACAAACCTC 

TCTTGAGCTTAGGATTTGACCATCATCATCAAAGGCGCTCAAACATGTTCCAACCTCA 

TCTACGGTCGAGATTTCAAGAGAAGCTCATCATCAATGGTTGGTCTTAAACGAAGCATTC 

GTGCTCCAAGAATGAGATGGACTTCTACTCTTCATGCTCACTTCGTCCATGCTGTTCAAC

TTCTTGGCGGCCATGAAAGAGCAACGCCTAAATCAGTGTTGGAGCTCATGAATGTGAAGG 

ATCTAACCCTAGCTCATGTCAAGAGTCACTTGCAGATGTATAGAACAGTGAAATGCACTG 

ATAAAGGATCACCAGGAGAAGGAAAGGTAGAGAAAGAGGCAGAGCAGAGGATAGAGGACA 

ATAATAATAATGAAGAAGCTGATGAAGGAACTGACACAAATTCGCCAAACTCATCATCTG 

TGCAAAAGACCCAAAGAGCTTCATGGTCATCGACAAAGGAAGTATCTAGGAGCATATCTA 

CACAAGCATATTCTCACTTGGGAACAACTCATCACACTAAGGCCAATGAAGAGAAAGAGG 

ATACCAACATTCATCTCAATTTGGATTTCACATTGGGCGGCCTAGTTGGGGGATGGAATA

TGCGGAACCCTCCAGTGATTTAACCCTTCTCAAGTGCTAATTGCCTTAAGCTACAACAAA 

TAAGTC^GCTTAGGTTACCAGTTITAACATAATTTTAACTTGTTTTGATC^TATGAGCTT 

CGGAAGAATCATATTATCATCATATATGAACTTCTTTCCAAGAATGTTCTATGAGTTTTT 

TGATATGTATAATCAAGAGAATCGTTTGAAGTAAAAA 

>G1040 Amino Acid Sequence (domain in AA coordinates: 109-158) 

MMMLESRNSMRASNSVPDLSLQISLPNYHAGKPLHGGDRSSTSSDSGSSLSDLSHENNFF

NKPLLSLGFDHHHQRRSNMFQPQIYGRDFKRSSSSMVGLKRSIRAPRMRWTSTLHAHFVH

AVQLLGGHERATPKSVLELMNVKDLTLAHVKSHLQMYRTVKCTDKGSPGEGKVEKEAEQR

IEDNNNNEEADEGTDTNSPNSSSVQKTQRASWSSTKEVSRSISTQAYSHLGTTHHTKANE

EKEDTNIHLNLDFTLGGLVGGWNMRNPPVI*

>G2114 (64..1311)

ATAAAACGAAACCCTATACATATAAACTAAGAGCGAGAAAGACAGCTAGAGAGAGAGAGA 
GAGATGAAGAAATGGTTGGGATTTTCATTGACACCTCCTTTGAGAATCTGCAATAGTGAA 
GAAGAAGAACTTAGGCATGACGGTTCCGATGTTTGGAGATATGATATTAACTTTGATCAT 
CATCATCATGATGAAGACGTTCCAAAGGTGGAAGATCTC 

GAGTATCCTATAAACCATAACCAAACCAATGTCAACTGCACCACTGTGGTTAACAGGTTA
AACCCACCCGGTTACCTTCTCCACGACCAAACCGTAGTTACACCACATTACCCGAACC 
GATCCGAACCTTAGCAATGATTATGGAGGTTTTGAGAGGGTCGGTTCGGTCTCGGTTTTC
AAATCTTGGTTAGAGCAAGGCACTCCAGCATTCCCACTCTCGAGTCATTACGTTACTGAA 
GAGGCTGGTACGAGCAATAATATTAGTCATTTTAGTAACGAAGAGACTGGTTATAACACC
AATGGCTCAATGCTATCATTGGCTTTGAGCCATGGGGCTTGTTCTGATTTGATCAACGAA 
TCGAATGTATCCGCACGGGTCGAAGAACCGGTTAAGGTAGATGAGAAGCGGAAGAGATTG 

AGAACTTCTCAGTATCGTGGAGTTACAAGGCATAGATGGACAGGGAGATATGAAGCTCAC 

TTATGGGATAATAGCTGTAAGAAGGAGGGACAGACAAGGAGAGGAAGACAAGTGTATCTT 

GGAGGGTATGATGAGGAGGAGAAAGCAGCGAGGGCATATGATTTAGCGGCTCTGAAGTAT 

TOGGGTCCTACCACTCACTTAAATTTCCCTTTGAGTAATTACGAAAAGGAGAT^ 

^™^ CATG ^ TCGGC ^ G ^ TG ^ GCCATGTTGAGGAGG ^TAGCAGCGGGTCT 

TCGAGGGGAGCTTCCGTGTATAGAGGAGTTACAAGGCATCATCAACATGGAAGGTGGCAA 

GCCAGAATTGGAAGAGTTGCTGGAAACAAGGACTTGTACCTTGGAACATTTAGCACGCAA 

GAAGAAGCAGCGGAGGCGTACGATATCGCGGCAATTAAATTCAGAGGCCTAAACGCTGTA 

ACCAATTTCGATATAAATAGATATGACGTGAAGAGGATATGTTCAAGCTCAACGATTGTT 

GATAGCGACCAGGCCAAACATTCTCCCACCAGCTCTGGCGCCGGCCACTAACCGACACCG 

TAAACTCCTCGCCGGAGAGACTATTCCCACGTACGGTTGGTOTGAGGAAATAAGTTcS 

CAGTCTGTTTAATCATTTATGGTTTAATAAACATATATTCCTAAGTAATTGAGG 

TACATATATACAACTTTTTTAGCAAATTAAGTTATCAGAATCCACTATATATTATTCTCT 

>G2114 Amino Acid Sequence (conserved domain in AA coordinates: 221-297, 323-393)

MKKWLGFSLTPPLRICNSEEEELRHDGSDVWRYDINFDHHHHDEDVPKVEDLLSM

YPINHNQTNVNCTTVVNRLNPPGYLLHDQTVVTPHYPNLDPNLSNDYGGFERVGSVSVFK

SWLEQGTPAFPLSSHYVTEEAGTSNNISHFSNEETGYNTNGSMLSLALSHGACSDLINES

NVSARVEEPVKVDEKRKRLVVKPQVKESVPRKSVDSYGQRTSQYRGVTRHRWTGRYEAHL 

WDNSCKKEGQTRRGRQVYLGGYDEEEKAARAYDLAALKYWGPTTHLNFPLSNYEKE

NNMNRQEFVAMLRRNSSGFSRGASVYRGVTRHHQHGRWQARIGRVAGNKDLYLGTFSTQE 

EAAEAYDIAAIKFRGLNAVTNFDINRYDVKRICSSSTIVDSDQAKHSPTSSGAGH*
>G450 (65..751)

GAGTTATCGAGAGAGAGAGAAAACATATTCTGATTTAAGACATATATAGACAGCAAGAAG 
AGATATGAACCTTAAGGAGACGGAGC^ 

TGAAAGTCCGGCCAAGTCGGGTGTTGGGAACAAGAGAGGCTTCTCCGAGACCGTTGATCT 
CAAACTTAATCTTCAATCTAACAAACAAGGACATGTGGATCTCAACACTAATGGAGCTCC 

caaggagaagaccttccttaaagacccttctaagcctcctgctaaagcacaagtcgS 
AGAGGAGGCAATGAGTAGTGGTGGAGGAACCGTCGCCTTTGTGAAGGTTTCCATGGATGG

AGCTCCTTATCn^CGGAAGGTTGACCTCAAGATGTACACCAGCTACAAGGATCTCTCTGA 

agatttc^tgaacgagagtaaagtgatggatctgttgaacagttctgagtatgScc^ 

cgagtcatgcaaacgtttgcgcataatgaaaggatccgaagcaattggacttgctccaIg 

agcaatggagaagttcaagaacagatcatgaacaaaaaaaaaS 

atttttttttttttggtattgttatoatc^tg^ 

taggaaaatataattgtttacaaaaaaataac^^ 

aa^qtctgtgtttttgttttcatctcttaattagtagaaatc^ 

TTGTGATAGTAAATCTATAGAGTTCGTA MIA1 ° TAA 

>G450 Amino Acid Sequence (domain in AA coordinates: TBD)
^^^? GLPGGTO ^ SPA ^ GVG ^ GPSE ^ L ^QSNKQGHVDLNTNGAPK 

^S^l^ SYroLSDALA ™ FSS ™ SyGA ^ MID ^ESKVMDLLNSSEYVPSY 

EDKDGDWMLVGDVPWPMFVESCKRLRIMKGSEAIGLAPRAMEKFKNRS* 
>G584 (40..1809)

AAAAAGTCTTCTCTTTTATAACTACGTCAGAGAACTGTTATGTCTCCGACGAATGTTCAA 
GTAACCGATTACCATCTCAACCAATCAAAAACGGATACAACAAATCTCTGGTCAACCGAC 
GACGATGCATCGGTAATGGAAGCTTTCATCGGCGGCGGCTCCGATCAWCTOCTCTTTTT 

^ C 3Z^ TCGAAGGAG ™ CGAGAACTGGACT TACGCCGTGrrCTGGCAATCATCT 

S£™™ CCGGAGA ^^ 

G?™n AC ^ GGAGAAG ^ 

GCTGAACAAGAGCATCGTAAGAGAGTGATTAGAGAGCTCAACTCTTTAATCTCCGGTGGT 
GTAGGAGGAGGAGATGAAGCTGGAGATGAAGAAGTTACAGATACTGAATGGTTCTTCTTA

GTTTCAATGACACAGAGCTTTGTCAAGGGTACTGGTTTACCTGGTCAAGCTTTCTCAAAT 

TCAGACACGATTTGGTTATCTGGTTCTAATGCTTTAGCTGGATCAAGTTGTGAGAGAGCT 

CGTCAAGGTCAGATTTATGGGTTACAAACAATGGTGTGTGTAGCGACAGAGAATGGTGTC 

GTTGAGCTTGGTTCGTCGGAGATTATTCATCAAAGTTCAGATCTTGTTGATAAAGTTGAC 

ACCTTTTTCAATTTTAACAATGGTGGTGGTGAATTTGGTTCTTGGGCGTTTAATTTGAAT 

CCAGATCAAGGAGAGAATGATCCAGGTTTGTGGATTAGTGAACCTAATGGTGTTGACTCT 

GGTCTTGTAGCTGCTCCGGTGATGAATAATGGTGGAAATGACTCAACTTCTAATTCTGAT 

TCTCAACCAATTTCTAAGCTTTGTAATGGAAGCTCTGTTGAAAACCCTAACCCTAAAGTT 

CTGAAATCTTGTGAAATGGTGAATTTCAAGAATGGGATTGAGAATGGTCAAGAAGAAGAT 

AGTAGTAATAAGAAGAGATCACCGGTTTCGAATAATGAAGAAGGGATGCTTTCTTTTACC 

TCTGTTCTTCCATGTGACTCGAATCACTCTGATCTTGAAGCTTCAGTGGCTAAAGAAGCT 

GAGAGTAACAGAGTTGTGGTTGAACCGGAGAAGAAACCGAGGAAACGAGGGAGAAAACCG 

GCGAATGGAAGAGAAGAGCCTTTGAATCATGTAGAGGCAGAGAGACAGAGAAGAGAGAAG 

TTGAATCAGAGATTCTATTCTTTAAGAGCTGTGGTTCCTAATGTGTCTAAGATGGATAAA 

GCTTCTCTATTAGGAGATGCTATTTCGTATATCAGTGAGCTTAAGTCTAAGTTGCAAAAG 

GCTGAATCTGATAAAGAAGAGTTGCAGAAGCAGATTGATGTGATGAATAAAGAAGCGGGA 

AATGCGAAAAGTTCGGTAAAAGATCGAAAATGTTTGAATCAAGAATCGAGTGTGTTGATA 

GAGATGGAGGTTGATGTGAAGATTATTGGTTGGGATGCAATGATAAGGATTCAATGTAGT 

AAGAGGAATCATCCTGGTGCTAAGTTCATGGAAGCACTTAAGGAGTTGGATTTGGAAGTG 

AATCATGCGAGTTTATCGGTAGTGAATGATCTTATGATCCAACAAGCGACTGTGAAAATG 

GGGAATCAGTTTTTCACGCAAGATCAACTCAAGGTTGCTCTAACGGAGAAAGTTGGAGAA 

TGTCCATGAATTGAAGTCAGCATCTTTAGGGCTAATACACCGGAGAATACTGCGAT^AAGT 

CGAAAACAACGATCATAGTATAAGCCGCGGTAAAAAGTGTTA7VACCTTTCACACAAGTTT 

CTCTAGTGAATGTAGTTGTAAACTCTATTGTGTAAGGGTAATTTTGTAGTACCCACTTGT 

TGCTATTGAATGCTTGTTAGAGAGGATTCTTAGTGTAGTATATGATTAGGTTGGGGTTTG 

TTGTTTCATGAGATAAATAAATGTGTTTGATCAATGGTTAAGTCTTTGGTTTGTTGGTGT 

ATGTATGTAAATAAGGCTTTTGTTAGAAATAAGACAAATGGGACTGAAGTTGGAGTTTAA 

AA 

>G584 Amino Acid Sequence (domain in AA coordinates: 401-494) 

MSPTNVQVTDYHLNQSKTDTTNLWSTDDDASVMEAFIGGGSDHSSLFPPLPPPPLPQVWE
di^qqrlqaliegane™tyavfwqsshgfagednni^tvllgwgdgyykgeeeksrkk 
KSNPASAAEQEHRKRVIRELNSLISGGVGGGDEAGDEEVTDTEWFFLVSMTQSFVKGTGL
PGQAFSNSDTIWLSGSNALAGSSCERARQGQIYGLQTMVCVATENGVVELGSSEIIHQSS
DLVDKVDTFFNFNNGGGEFGSWAFNLNPDQGENDPGLWISEPNGVDSGLVAAPVMNNGGN
DSTSNSDSQPISKLCNGSSVENPNPKVLKSCEMVNFKNGIENGQEEDSSNKKRSPVSNNE
EGMLSFTSVLPCDSNHSDLEASVAKEAESNRVVVEPEKKPRKRGRKPANGREEPLNHVEA
ERQRREKLNQRFYSLRAVVPNVSKMDKASLLGDAISYISELKSKLQKAESDKEELQKQID
VMNKEAGNAKSSVKDRKCLNQESSVLIEMEVDVKIIGWDAMIRIQCSKRNHPGAKFMEAL
KELDLEVNHASLSVVNDLMIQQATVKMGNQFFTQDQLKVALTEKVGECP*

>G668 (1..1056) 

ATGGGAAGACCACCTTGCTGTGAAAAGATTGGAGTGAAGAAAGGGCCATGGACACCAGAG 
GAAGACATCATCTTGGTTTCTTACATCCAAGAACATGGTCCTGGAAACTGGAGATCTGTC 
CC^CAC^CAmGGTTTAAGATGTAGCAAGAGCTGCAGATTGAGATGGACTAATTATCTT 
CGACCCGGTATTAAGCGTGGAAATTTTACTGAGCATGAAGAGAAGACAATTGTTCATCTT 
CAAGCCCTTTTAGGCAACAGATGGGCAGCCATAGCATCATACCTTCCAGAAAGGACAGAC 
AATGATATAAAGAACTATTGGAACACTCACTTGAAGAAGAAGCTCAAAAAGATTAATGAA 
TCTGGTGAAGAAGATAATGATGGTGTCTCTTCATCAAACACTAGTTCACAAAAGAACCAT 
CAAAGCACTAACAAAGGTCAATGGGAAAGAAGACTTCAGACAGACATTAACATGGCAAAA
CAAGCTCTTTGTGAGGCCTTGTCTTTAGACAAACCATCATCCACTCTTTCATCATCTTCA 
TCATTACCGACACCAGTAATCACACAACAAAACATCCGTAACTTCTCATCAGCTTTGCTT 
GACCGTTGTTATGATCCATCCTCTTCTTCT^ 

ACTACTAATCCATACCCATCAGGGGTATATGCGTCAAGTGCTGAGAACATCGCCCGGTTG 
CTTCAAGATTTCATGAAAGACACACCCAAGGCTTTAACTTTATCATCTTCATCTCCGGTT 
TCAGAGACTGGACCACTCACTGCTGCAGTCTCGGAAGAAGGTGGAGAAGGGTTTGAACAA 
TCTTTCTTCAGCTTCAATTCAATGGACGAAACTC^UVAACTTGACTCAGGAGACAAGCTTC 
TTCCATGATCAAGTGATCAAACCGGAAATAACTATGGACCAAGATCATGGTCTAATATCA
CAAGGGTCTCTGTCTTTGTTTGAGAAATGGTTATTTGATGAGCAAAGCCACGAGATGGTT
GGTATGGCACTAGCAGGACAAGAAGGGATGTTCTAG 

>G668 Amino Acid Sequence (domain in AA coordinates: 13-113)
MGRPPCCEKIGVKKGPWTPEEDIILVSYIQEHGPGNWRSVPTHTGLRCSKSCRLRWTNYL 
RPGIKRGNFTEHEEKTIVHLQALLGNRWAAIASYLPERTDNDIKNYWNTHLKKKLKKINE 
SGEEDNDGVSSSNTSSQKNHQSTNKGQWERRLQTDINMAKQALCEALSLDKPSSTLSSSS 
SLPTPVITQQNIRNFSSALLDRCYDPSSSSSSTTTTTTSNTTNPYPSGVYASSAENIARL 
LQDFMKDTPKALTLSSSSPVSETGPLTAAVSEEGGEGFEQSFFSFNSMDETQNLTQETSF

FHDQVIKPEITMDQDHGLISQGSLSLFEKWLFDEQSHEMVGMALAGQEGMF*
>G1050 (23..1582)

TTCCCCATTTCAGAAAATCAAAATGGGTGGTGGTGGTGATACAACAGATACCAATATGAT 

GCAGAGAGTTAATTCTTCTTCTGGTACATCGTCTTCTTCGATCCCTAAACACAATCTTCA 

CTTGAATCCTGCTCTTATCCGCTCTCACCATCACTTCCGTCACCCTTTCACCGGAGCTCC 

TCCACCGCCGATTCCACCCATTTCTCCTTACTCTCAGATCCCGGCGACTTTACAACCTAG 

ACATTCTCGCTCTATGTCGCAACCGTCTTCTTTCTTCTCCTTTGATTCATTGCCGCCGTT 

AAATCCTTCTGCTCCGTCGGTTTCGGTGTCGGTGGAGGAGAAAACCGGTGCCGGATTTAG 

TCCTTCGTTGCCTCCGTCACCGTTTACGATGTGTCATTCTTCTAGCTCTAGGAACGCCGG 

AGATGGAGAGAATCTACCTCCGAGAAAGTCGCATAGGCGTTCGAATAGTGATGTTACTTT 

TGGGTTTAGTTCAATGATGTCTCAGAATCAAAAGTCTCCTCCTTTGAGTTCTTTGGAGAG 

ATCGATCTCTGGTGAAGATACATCAGATTGGTCTAATTTGGTGAAGAAAGAACCGAGAGA 

AGGCTTCTACAAGGGAAGAAAACCAGAGGTTGAAGCAGCTATGGACGATGTTTTCACGGC 

TTATATGAATCTTGATAACATTGATGTCTTGAATTCTTTTGGAGGTGAAGATGGCAAGAA 

TGGGAATGAGAATGTGGAGGAGATGGAGAGTAGTAGAGGTAGTGGTACAAAGAAGACGAA 

TGGTGGAAGTAGTAGTGATTCTGAAGGAGATAGCAGTGCGAGTGGGAATGTGAAGGTTGC 

GTTGAGTTCTTCTTCTTCAGGCGTGAAGAGAAGAGCAGGTGGAGATATTGCTCCTACTGG 

TAGACATTACAGGAGTGTTTCTATGGACAGTTGTTTCATGGGGAAGTTGAATTTCGGCGA 

CGAATCATCGCTAAAGCTTCCGCCTTCTTCATCAGCTAAAGTTTCCCCAACCAATTCAGG 

TGAAGGGAATTCAAGTGCTTATAGTGTTGAATTTGGAAACAGTGAGTTTACTGCAGCTGA 

AATGAAGAAGATTGCAGCTGATGAGAAACTCGCTGAGATTGTAATGGCTGACCCTAAGCG 

TGTTAAAAGAATCTTGGCGAACCGCGTATCTCCTGCACGTTCAAAGGAGCGGAAGACGCG 

ATACATGGCAGAGTTGGAACACAAGGTGCAGACACTTCAGACTGAAGCTACTACATTATC 

GGCTCAGCTCACACATTTGCAGAGAGATTCTATGGGGTTGACAAACCAGAACAGTGAGCT 

GAAGTTTCGTCTTCAAGCTATGGAGCAGCAAGCACAACTCCGCGATGCTCTGTCAGAGAA 

ACTGAATGAAGAAGTCCAGCGGTTGAAACTGGTGATAGGGGAGCCGAACCGCAGGCAAAG 

TGGGAGCAGCAGCAGCGAATCAAAGATGTCACTAAACCCGGAGATGTTTCAGCAGCTTAG 

CATAAGTCAGTTACAACACCAACAGATGCAGCATTCCAATCAGTGTAGCACAATGAAAGC 

AAAGCACACTTCAAACGACTAGGGTAAGTAAAACTGCGATCCGCAGTTGTCTAGTTACAT 

ATATGATAAGAATCTTTTGTGCAGAGTTCTGTTTTTGGAAGTTTTAAAGAAACATATATA 

AAGATTATGTCCGGGAAATTTGATCATATTTCCTGAAACATACACACATATATATAGTGG 
TAATGGAGGACTTTCTTTCTGGACCA

>G1050 Amino Acid Sequence (domain in AA coordinates: 372-425) 

MGGGGDTTDTNMMQRVNSSSGTSSSSIPKHNLHLNPALIRSHHHFRHPFTGAPPPPIPPI

SPYSQIPATLQPRHSRSMSQPSSFFSFDSLPPLNPSAPSVSVSVEEKTGAGFSPSLPPSP 

FTMCHSSSSRNAGDGENLPPRKSHRRSNSDVTFGFSSMMSQNQKSPPLSSLERSISGEDT 

SDWSNLVKKEPREGFYKGRKPEVEAAMDDVFTAYMNLDNIDVLNSFGGEDGKNGNENVEE

MESSRGSGTKKTNGGSSSDSEGDSSASGNVKVALSSSSSGVKRRAGGDIAPTGRHYRSVS

MDSCFMGKLNFGDESSLKLPPSSSAKVSPTNSGEGNSSAYSVEFGNSEFTAAEMKKIAAD 

EKLAEIVMADPKRVKRILANRVSAARSKERKTRYMAELEHKVQTLQTEATTLSAQLTHLQ

RDSMGLTNQNSELKFRLQAMEQQAQLRDALSEKLNEEVQRLKLVIGEPNRRQSGSSSSES

KMSLNPEMFQQLSISQLQHQQMQHSNQCSTMKAKHTSND*

>G1463 (199..1209)

TATCCTTCGCAAGACCCTTCCTCTATATAAGGAAGTTCATTTCATTTGGAGAGGACACGC 

TGACIAAGCTGACTCTAGCffiGATCTGGTACCGTCGACAGTTTGAGATrTGCTTCATCCGGT 
TramATTTTCTGOVAAATATGTCACTCTCT 

AAGTTTGATCAACTTAGTATGCGTTTCTTTTTCTCTCTAGTTCCTCTGTTTCTTGGTCGA
TTTAGTTTCGTTATGGCGGACACACTGCTCAACGCAGAAGACGAAGTAATAATCTCACGT 
TATCTGAAGCCTATGATCGTTAACAGAGTATCATGGCCTGATCTCTTCATCGAAGACGCA
GACGTGTTCAACAAGGATCCATATGTGAAGTTCCATGCTGAGATCCCTAGCTTCGTGATC

GTTAAACCACGAACAAAGGCTTGTGGTAAAACCGATGGATGTGATTCGGGTTGCTGGAGG 

ATCATTGGTCGTGATAAGCTGATAAAGTCGGAGGAGACTGGTAAGATTCTAGGGTTCAAG 

AAGATACTCAAGTTCTGCCTAAAGTGGAAACCTAGAGAATACAAGAGAAGTTTGGTAATG 

GAAGAGTATAGGCTTACCAATAACTTCAACTGGAAGCAAGATCATGTGATTTGCAAGATT

CGGCTTTTGTTTGAAGCAGAAATTAGTTTCTTGCTAGCCAAGCATTTCTACACTACATCA 

GACTCACTTCCTCGAAATGTGCTGTTGCCAGCTTATGGATTCTGTTCACCAGATAAACAA 

GAGGAGGACGAATTTTATCCGGTGACGATAATGATTTCAGAAGGAAAAGATTGGCCTAGC 

TACGTTACCAACAACGTGTATTGTCTGCATCCATCGGAGCTTGTGAATGTTCACGATGGG 

AAGTTTCATGATAACGGAATCTGCATCTTCGCTAACAGGACTTGTGGTGTAACCGATAAA 

TGCAATGAAGGTTACTGGAAGATTAAGCACCGTGAGAAGCTGATCATGTCACGGTACGGG 

CAGACCATTGGTTGGAAGAAAGTTTTTCAGTTTTATGAAACGGAGAAAGAAAGACATTTT 

GGTAATGGAGAAGAAGTGAAGGTAACTTGGACTCTAAAAGAGTATAGGCTTACCAGAAAA 

ATGAACAAGAATAAAGTGGTGTGCGTTATCAAGTATAAGGTAAAGTGTTTACCGAGGATA 

ACTAGCTAGGGACTTCTACTCTTGGTTTCATGATCGATGCGACCGCTCTAGACAGGCCTC 

GTACCGGATCCTCTAGCTAGAGCTTTCGTTCGTATCATCGGTTTCGACAACGTTCGTCA 

>G1463 Amino Acid Sequence (conserved domain in AA coordinates: 9-156)

MRFFFSLVPLFLGRFSFVMADTLLNAEDEVIISRYLKPMIVNRVSWPDLFIEDADVFNKD 

PYVKFHAEIPSFVIVKPRTKACGKTDGCDSGCWRIIGRDKLIKSEETGKILGFKKILKFC 

LKWKPREYKRSLVMEEYRLTNNFNWKQDHVICKIRLLFEAEISFLLAKHFYTTSDSLPRN

VLLPAYGFCSPDKQEEDEFYPVTIMISEGKDWPSYVTNNVYCLHPSELVNVHDGKFHDNG

ICIFANRTCGVTDKCNEGYWKIKHREKLIMSRYGQTIGWKKVFQFYETEKERHFGNGEEV 

KVTWTLKEYRLTRKMNKNKVVCVIKYKVKCLPRITS*

>G1944 (236..1306)

TCGACCTTCCTAATTTCCAACCTCTGTTCTTAGCAATATATTTTTTCTCCAAAAATAATT 
CTCAGTTTGATTTTCTTCTTCTAGCTCTTAAGTATATTTCTTTGTTGTTATTTATCTTTT 
AATCCTTTAATCTCATCTTTGTTTATCTTTAATCAAAACCCAAAATTTACATGGGTTCTT 
GAAAATCTAGAAGAAATAAAGGAAACATAACAAAAATAGAAAGAAAAAGAAGCTAATGGT 
CTTAAATATGGAGTCTACCGGAGAAGCTGTTAGATCAACCACCGGTAACGACGGTGGTAT 
TACGGTGGTTAGATCCGACGCGCCGTCAGATTTCCACGTAGCTCAAAGATCAGAAAGCTC 
AAACCAATCTCCCACCTCTGTCACTCCTCCTCCACCACAGCCATCGTCTCATCACACAGC 
TCCTCCGCCGCTGCAAATTTCGACGGTGACGACTACGACTACGACGGCCGCGATGGAAGG 
TATCTCCGGTGGACTGATGAAGAAGAAGCGTGGACGGCCAAGGAAGTATGGACCGGACGG 
GACTGTTGTAGCGTTATCTCCTAAACCGATTTCATCAGCGCCGGCGCCGTCGCATCTTCC 
GCCGCCGAGTTCACACGTCATCGATTTCTCCGCTTCTGAGAAACGTAGCAAAGTGAAACC 
AACGAACTCGTTTAACAGAACAAAGTATCATCACCAAGTTGAGAATTTGGGTGAATGGGC 
TCCTTGCTCCGTCGGTGGTAATTTCACACCTCATATAATCACAGTCAACACCGGCGAGGA 
TGTAACAATGAAGATAATCTCGTTTTCGCAACAAGGACCTCGCTCTATTTGTGTTCTGTC 
AGCAAACGGTGTTATTT<^GCGTTACACTTCGTCAGCCAGATTCCTCTGGCGGC^C^TT 
GA(^TACGAAGGTCGGTTTGAGATATTATC^TTATCCGGGTCATTCATGCCTAATGATTC 
AGGCGGAACACGAAGTAGAACGGGAGGAATGAGTGTATCGTTAGCAAGTCCCGATGGACG 
TGTAGTAGGCGGTGGCCTCGCCGGTTTACTAGTAGCCGCGAGTCCGGTTCAGGTGGTTGT 
AGGAAGTTTTTTAGCGGGCACTGACCATCAAGATCAGAAACCGAAAAAGAACAAACATGA
TTTCATGTTGTCGAGTCCTACCGCTGCAATTCCTATCTCTAGTGCAGCTGATCACCGGAC 
AATCCATTCGGTCTCGTCTCTTCCGGTCAATAATAATACATGGCAGACTTCTTTAGCTTC 
CGATCCAAGAAACTU^GCATACCGATATTAATGTCAATGTAACTTGAAATCCAATCTTTCT 
CTGTATTTTCTGTTAACAAGTTTGATTTGGTTGTTTATCTACATTAGGATTTTACTAAAA 
TGGTAGTATTATTTATAGGGTTTTAGGGTCTTTATTTTGGTTCC^CTGTTGTCACTTGTA 
GGATA 

>G1944 Amino Acid Sequence (domain in AA coordinates: 87-100)

MVLNMESTGEAVRSTTGNDGGITVVRSDAPSDFHVAQRSESSNQSPTSVTPPPPQPSSHH

TAPPPLQISTVTTTTTTAAMEGISGGLMKKKRGRPRKYGPDGTVVALSPKPISSAPAPSH

LPPPSSHVIDFSASEKRSKVKPTNSFNRTKYHHQVENLGEWAPCSVGGNFTPHIITVNTG

EDVTMKIISFSQQGPRSICVLSANGVISSVTLRQPDSSGGTLTYEGRFEILSLSGSFMPN 

DSGGTRSRTGGMSVSLASPDGRVVGGGLAGLLVAASPVQVVVGSFLAGTDHQDQKPKKNK

HDFMLSSPTAAIPISSAADHRTIHSVSSLPVTTNNTWQTSL^ 

>G2383 (37..990)
GACCTCTTTGATCCCTTCATTCCCCATCAAACAACCATGTTTCCTTCTTTCATTACTCAC

ATTCAAAGCCCTAATTCTCACCATCACTACTCTTCGCCTTCTTTTCCTTTCTCTTCCGAT 

TTTCTTGAGAGTTTTGATGAATCCTTCTTGATAAACCAATTCTTGTTACAGCAGCAAGAT 

GTAGCAGCAAATGTTGTTGAATCTCCTTGGAAATTTTGCAAGAAGCTTGAGCTTAAGAAG 

AAGAATGAGAAGTGTGTTGATGGAAGCACCTCACAAGAGGTTCAATGGAGAAGGACGGTC 

AAAAAAAGGGACAGGCATAGTAAGATCTGCACGGCTCAAGGTCCTAGAGACCGGAGGATG 

AGGCTGTCTCTTCAGATTGCTCGCAAGTTTTTCGATCTTCAAGACATGTTGGGTTTCGAC 

AAGGCGAGCAAGACGATTGAATGGCTTTTCTCCAAATCAAAGACTTCCATCAAACAACTT 

AAAGAAAGAGTGGCTGCATCGGAAGGAGGAGGAAAGGATGAACATCTCCAGGTTGATGAA 

AAGGAAAAGGATGAGACACTGAAGTTGAGAGTCTCAAAGAGAAGAACAAAGACTATGGAG 

AGCTCTTTTAAGACTAAAGAGTCGAGAGAGAGAGCTAGAAAGCGAGCAAGAGAGAGAACA 

ATGGCAAAGATGAAGATGAGATTATTTGAGACCTCGGAAACAATTTCAGATCCTCATCAA 

GAAACTAGAGAGATCAAGATAACCAATGGTGTACAATTACTAGAAAAGGAAAATAAAGAA 

CAAGAATGGAGTAATACTAATGATGTTCACATGGTAGAGTATCAAATGGATTCTGTGAGC 

ATCATAGAGAAGTTTCTTGGACTAACCAGTGACTCTAGCTCCTCTTCCATTTTTGGTGAC 

TCCGAGGAATGTTACACAAGTCTTAGTTCAGTAAGAGGTACAATTTCAGCAGCAGGTAAC 

AGCAATGTGTTAACTAAAAACCCTAATTGAGTAATGCAGTTTTGATTAATATTAGCTTTT 
TGGTAATTCCAGGAATGTCGACACCAAGGG 

>G2383 Amino Acid Sequence (conserved domain in AA coordinates: 89-149)

MFPSFITHIQSPNSHHHYSSPSFPFSSDFLESFDESFLINQFLLQQQDVAANVVESPWKF

CKKLELKKKNEKCVDGSTSQEVQWRRTVKKRDRHSKICTAQGPRDRRMRLSLQIARKFFD

LQDMLGFDKASKTIEWLFSKSKTSIKQLKERVAASEGGGKDEHLQVDEKEKDETLKLRVS 

KRRTKTMESSFKTKESRERARKRARERTMAKMKMRLFETSETISDPHQETREIKITNGVQ 

LLEKENKEQEWSNTNDVHMVEYQMDSVSIIEKFLGLTSDSSSSSIFGDSEECYTSLSSVR 
GTISAAGNSNVLTKNPN* 

>G571 (326..1708)

TAGCCGACCTCTCTTCTCTCTTCTGAAAAAAACACCAAAGGAGCTTTAAATGCTCCGTTA 

CATAATCTCTATCTCTTTCCAAGAATATAGAGAAAGGAAAATAATATACAAGAATTAAAA 

GAAGGTATATCATCATCTCTCTAGCTAGTGATCAAAGCACCGTCATCATCATCATATATC 

ATCAGCTTGCCTCAGAGGAGAAGACCAACATAAGAGAGATCGAAGATCAAAATCTATCTC 

TCTTCATCATCTTCTGCTGTTACTATCATATCACACGCTCTCTCAAACATCATCCTATAT 

ATAGACTTCTCTTCATCATCATCAAATGCAAGGTCATCACCAGAATCATCATCAACACTT 

ATCATCATCCTCCGCCACGTCTTCCCATGGAAACTTCATGAACAAAGATGGGTATGATAT 

TGGAGAGATAGACCCATCACTCTTCCTCTATCTTGATGGACAAGGACATCATGATCCTCC 

ATCAACTGCTCCTTCTCCTTTACATCATCATCACACAACTCAGAATTTGGCGATGAGACC 

TCCAACATCGACGCTCAACATCTTTCCATCTCAGCCTATGCACATAGAGCCACCTCCTTC 

TTCTACACACAATACCGATAATACAAGATTAGTTCCGGCTGCTCAACCTAGTGGTTCCAC 

TCGACCAGCTTCTGACCCGTCCATGGACTTGACCAATCATTCTCAGTTTCATCAACCTCC 

TCAAGGTTCTAAATCCATCAAGAAGGAAGGGAACCGCAAGGGTCTTGCCTCATCGGACCA 

TGACATACCTAAATCGTCAGACCCTAAAACATTCAGAAGACTAGCACAAAACAGAGAAGC 

AGCAAGAAAAAGCAGATTACGTAAAAAGGCTTATGTTCAGCAACTCGAGTCATGTAGGAT 

CAAACTGACCCAACTAGAACAAGAGATTCAACGGGCCAGATCCCAAGGCGTATTCTTTGG 

AGGGTCTCTTATAGGAGGAGATCAACAGCAAGGTGGACTACCCATTGGCCCTGGCAACAT

^^^ MGmG ^ GTGTOCGATATGG AATATGCGAGGTGGCTCGAGGAGCAGCAGAG 

GCTATTAAACGAACTAAGGGTGGCAACACAAGAACACTTCTCCGAGAACGAGCTTAGGAT 

GTTTGTGGACACATGTTTAGCTCATTATGACCATTTGATTAACCTCAAGGCTATGGTCGC 

TAAGACCGATGTCTTCCACCTCATTTCTGGAGCATGGAAAACTCCAGCTGAACGTTGCTT 

CTTGTGGATGGGTGGTTTCCGTCCATCGGAGATCATTAAGGTGATTGTGAACCAGATAGA 

ACCATTGACGGAGCAACAGATAGTTGGGATATGTGGGCTGCAACAGTCCACACAAGAGGC 

CGAGGAGGCTCTCTCGCAAGGCCTCGAGGCGTTGAATCAATCACTTTCCGATAGCATTGT 

^™ OTCCCTCCCGCCTCCCTCCGCACCAOTCCTC CTCATCTATCCAATTT^ 

ACACATGTCCTTAGCTCTCAACAAGCTCTCTGCTCTCGAGGGCTTCGTTCTCCAGGCGGA 

A ^™ TAGCCG ^ CGGACTACTOCCACCCTOT 
^ A ^^^ GTCGGCAAGATGGATAATACTAAAAC ^ 

A^CAAGAGAATAGGTTGATTAGTTAGCCGC^GOTGACCTCTTTATCATATATATC 
GTCTCTCTACTCAAATACAGTGCAATTAGGGAAAATTGTTTGGCT?TCTT1TTGGTATATG 
ATTCTTACTATTATGTTTTTAATCAAGA

>G571 Amino Acid Sequence (domain in AA coordinates: 160-220)

MQGHHQNHHQHLSSSSATSSHGNFMNKDGYDIGEIDPSLFLYLDGQGHHDPPSTAPSPLH 

HHHTTQNLAMRPPTSTLNIFPSQPMHIEPPPSSTHNTDNTRLVPAAQPSGSTRPASDPSM

DLTNHSQFHQPPQGSKSIKKEGNRKGLASSDHDIPKSSDPKTLRRLAQNREAARKSRLRK 

KAYVQQLESCRIKLTQLEQEIQRARSQGVFFGGSLIGGDQQQGGLPIGPGNISSEAAVFD 

MEYARWLEEQQRLLNELRVATQEHLSENELRMFVDTCLAHYDHLINLKAMVAKTDVFHLI 

SGAWKTPAERCFLWMGGFRPSEIIKVIVNQIEPLTEQQIVGICGLQQSTQEAEEALSQGL

EALNQSLSDSIVSDSLPPASAPLPPHLSNFMSHMSLALNKLSALEGFVLQADNLRHQTIH 

RLNQLLTTRQEARCLLAVAEYFHRLQALSSLWLARPRQDG* 

>G636 (6..1814)

CGATGATGCAACTGGGTGGTGGTACTCCGACCACTACAGCGGCGGCTACAACCGTCACAA 

CTGCTACAGCACCACCGCCACAATCAAACAACAACGATTCAGCGGCAACAGAAGCAGCGG 

CAGCAGCGGTTGGGGCGTTTGAGGTGTCGGAAGAGATGCACGACCGTGGGTTTGGAGGAA 

ATCGTTGGCCGCGGCAGGAAACGCTAGCGTTGTTGAAAATACGATCTGACATGGGAATAG 

CGTTTCGAGACGCTAGCGTTAAAGGTCCCTTATGGGAAGAGGTTTCTAGGAAAATGGCGG 

AGCATGGTTACATAAGAAACGCAAAGAAATGCAAAGAGAAATTCGAGAACGTTTACAAAT 

ACCACAAACGAACCAAAGAAGGTCGTACCGGAAAATCCGAAGGCAAAACTTATCGCTTCT 

TTGATCAATTAGAAGCTCTCGAGTCTCAATCTACAACCTCACTCCACCATCATCAACAAC 

AAACGCCTCTTCGACCACAGCAAAACAACAACAACAACAACAACAACAACAACAACAGCT 

CCATATTTTCAACTCCTCCTCCGGTAACGACAGTTATGCCGACGCTTCCTTCTTCATCAA 

TTCCTCCGTATACTCAGCAGATTAATGTACCTTCGTTTCCAAACATCTCCGGTGATTTTC 

TATCGGATAATTCTACATCGTCTTCGTCTTCTTATTCGACTTCTTCTGACATGGAGATGG 

GTGGTGGAACTGCGACTACAAGGAAGAAAAGGAAGAGGAAATGGAAGGTGTTTTTCGAGC 

GGTTGATGAAACAAGTAGTTGATAAACAGGAAGAGCTTCAACGCACATTCTTGGAAGCTG 

TTGAAAAGCGAGAACACAAGAGATTGGTTAGAGAAGAGTCTTGGAGAGTTCAAGAGATTG 

CCAGAATCAACCGCGAGCACGAGATCTTAGCTCAAGAACGCTCTATGTCCGCTGCAAAAG 

ACGCTGCTGTTATGGCCTTTCTTCAAAAACTGTCAGAGAAACAACCGAATCAGCCACAAC

CGCAGCCTCAGCCGC^CAAGTTCGACCATCMTGCAGCTTAATAACAACAATCAGCAGC 

AACCGCCTCAACGGTCTCCTCCACCGCAACCTCCTGCTCCGCTTCCGCAGCCAATTCAAG 

CGGTTGTGTCGACGTTAGACACAACGAAAACGCACAATCGTGGTGATCAGAATATGACTC 

CTGCAGCTTCAGCGAGCTCGTCGCGGTGGCCGAAAGTGGAGATAGAAGCATTGATAAAGC 

TGAGGACGAATCTTGATTCGAAATATCAAGAAAACGGACCAAAAGGACCATTGTGGGAAG 

AGATATCAGCGGGAATGAGAAGGTTAGGATTCAACAGGAACTCAAAGAGATGCAAAGAGA

AATGGGAAAACATAAACAAATACTTCAAGAAAGTCAAAGAGAGCAACAAGAAACGTCCCG 

AAGATTCCAAGACTTGCCCTTACTTTCACCAGCTTGATGCTTTATATAGAGAGAGGAACA 

AATTCCACAGCAACAACAACATTGCAGCTTCTTCTTCATCTTCCGGTCTTGTTAAACCGG 

ATAATTCTGTTCCCXTGATGGTCCAACCAGAGCAGCAATGGCCTCCGGCTGTAACGACTG 

CGACAACTACTCCCGCAGCGGCTCAGCCTGATCAGCAATCTCAGCCGTCGGAGCAGAACT 

TTGATGATGAAGAAGGTACAGATGAAGAGTACGACGATGAAGATGAGGAAGAGGAGAATG 

AAGAAGAGGAAGGAGGTGAGTTCGAGCTTGTGCCTAGCAATAACAACAACAACAAGACGA 

CGAATAATCTGTAATGATGATGATTCGAGTTCGAACCGGTTTGGTGGTGAAAGATTAGTA 

ATCTXTTTTTA AGTTTTGATACAGAACATGAGAATTTAAATATTGGAGGGTTT 

>G636 Amino Acid Sequence (domain in AA coordinates: 55-145, 405-498) 

MQLGGGTPTTTAAATTVTTATAPPPQSNNNDSAA 

WPRQETLALLKIRSDMGIAFRDASVKGPLWEEVSRKMAEH^^^ 

KRTKEGRTGKSEGKTYRFFDQLEALESQSTTSLHHHQQ 

FSTPPPVTTVMPTLPSSSIPPYTQQINVPSFPNISGDFLSDNSTSSSSSYSTSSDMEMGG

GTATTRKKRKRKWK^FFERLMKQVVDKQEELQRTFLEAVEKIiEHKRLVRE 

INREHEILAQERSMSAAKDAAVMAFLQKLSEKQPNQPQPQ 

PQRSPPPQPPAPLPQPIQAVVSTLDTTKTHNRGDQNMTPAASASSSRWPKVEIEALIKLR
TNLDSKYQENGPKGPLWEEISAGMRRLGFNRNSKRCKEKWENINKYFKKVKESNKKRPED

SKTCPYFHQLDALYRERNKFHSNNNIAASSSSSGLVKPDNSVPLMVQPEQQWPPAVTTAT
TTPAAAQPDQQSQPSEQNFDDEEGTDEEYDDEDEEEENEEEEGGEFELVPSNNNNNKTTN

NL* 

>G878 (197..1738)

CAAAAAAAATCTCTCCCATTAAAAGACTGCCCAAAGAAATATTTTATACAAAATGAAAGA 
GAGAAACACGACACGAATTTTGTATAATTAAGATTACACAAAAAAAAGTGTTAGAAAGAG
AAATATCTTCTTCTTTTTTCTGTGTGAGTTGGGTTTGTTAAAGTTTTATCCTTTTTGTTC 
TCAAAATCAAGAATCGATGGCGGAGAAGGAAGAAAAAGAACCATCGAAGTTAAAATCATC 
CACCGGAGTTTCACGGCCAACGATTTCACTACCTCCTCGACCGTTTGGTGAAATGTTTTT 
TAGCGGTGGCGTTGGATTTAGTCCTGGACCAATGACTCTCGTCTCAAATTTATTCTCTGA 
TCCTGATGAGTTCAAGTCTTTCTCTCAGCTTTTAGCTGGAGCTATGGCTTCTCCGGCGGC 
AGCTGCTGTTGCCGCCGCTGCTGTGGTTGCTACTGCTCATCATCAGACACCTGTGAGCTC 
TGTCGGTGATGGCGGTGGAAGCGGTGGTGATGTTGACCCGAGGTTTAAGCAGAGTAGACC 
AACGGGATTGATGATAACTCAACCACCGGGGATGTTTACTGTACCGCCGGGGTTAAGTCC 
GQCTACTCTTTTGGATTCTCCGAGCTTCTTTGGTCTT-rrTTCACCTCTTCAGGGAACAS 
TGGTATGACACATCAACAAGCTTTAGCACAAGTCACTGCACAAGCAGTTCAAGGCAATAA 
TGTTCATATGCAGCAATCACAACAATCTGAATATCCTTCTTCTACACAACAACAACAACA 

acaacaacaacaagcttcaitgactgagattccatcattttcttctg^cctag^tctS 

GATTCGAGCCTCGGTTCAAGAAACATCGCAGGGTCAGAGAGAGACTTCGGAAATATCTGT 
CTTOGAGCATCGGTCACAGCCTCAAAATGCTGACAAACCAGCTOATGATGGATACAACTG 
GCGGAAATATGGGCAGAAGCAAGTGAAGGGGAGCGATTTTCCTCGGAGTTATTACAAATG 
TACGCATCCAGCTTGTCCTGTCAAGAAGAAAGTGGAGAGGTCACTCGATGGACAAGTAAC 
GGAAATCATCTACAAGGGTCAACACAATCATGAGCTTCCTCAAAAGCG^S 

CAAGAGTAAGAGGGACCAGGAAACAAGCCAAGTTACAACAACAGAGCAGATGTCTGAAGC 
AAGTGATAGCGAGGAGGTTGGGAATGCAGAGACTAGTGTGGGAGAAAGACATGAGGATGA 
GCCTGATCCCAAGCGAAGAAATACAGAAGTTCGGGITTCAGAACCAGTTGCTTCATCGC^ 
TAGAACTGTGACAGAGCOTAGGATTATTGTCCAAACGACGAGTGAAGTTGACCTCTTAGA 
TGATGGATATAGGTGGCGCAAGTATGGTCAGAAAGTAGTCAAAGGAAATCCTTATCCGAG 
GAGCTACTATAAGTGTACAACACCAGATTGCGGAGTAAGGAAACATGTAGAGAGAGCAGC 

™ AGMCCAGCAGCCATMGT ^^ 

GAGAAGAAGAATACGACGGCGCTTGAGCTITTGTGAGTTTAATGAATCTTCTTTTTGGTT 

^tgaacctgtttttgttgcc^^^ 

TTACAGTTTCAAAAGGTATGITCTTTTAITXCATGTTGGAATCTTCTGTGT^ 

AAGCTTTAGGAGGTAATGTAAAAAACCAGATTCAAAGTTATCCCCTTATGTGAAirCTOT 

TGTACATGGGATAAACAAAATTTACAGGTATCCTTTTTGTTCTTGTTGTAAAAAAAAA^ 

>G878 Amino Acid Sequence (domain in AA coordinates • 250-30S 4 m a-?k\ 

maekeekepsklksstgvsrptislpprpfgemffsggvgfspgpmtlSlpIdpdefk 
sfsqliagamaspaaaavaaaawatahhq^^^ 

tqppgmftvppglspatlldspsffglfsplqgtfgmthqqaiaqvtaqavqgSS 
sqqseypsstqqqqqqqc^aslteipsfssaprsqirasvqetsLretseis^^ 

GQHNHELPQKRGNNNGSCKSSDIANQFQTSNSSLNKSKRDQETSQVTTTEQMSEASDSEE 
VGNAETSVGERHEDEPDPKRRNTEWVSEPVASSHRlW 

^gqkvvkgnpyprsyykcttpdcgvrkhveraatopkavvtSg^ 

HQLRPNNQHNTSTVNFNHQQPVARLRLKEEQIT* ^^muvfaakTSS 
>G1134 (61..849)

caacatccgtcggtagtagcgg cggtggtoacgacggaggaggcagaggagga 

^™ GC ^ CTAGA ^ 

^gottaotgaggaagatgaagaagagtctttgaaacctaatcttggtctcaccgat 
togcttaccgggaactcgaacgatttaccgacaagtcgcggctcgttcgagttcccga^ 

^^ CTOAGTGGTOOTGATGGATTTATCCAAAG 

s^^ CCTCAGmACTTCT ^ 
SS™. A f ATCAAGCATGTCGGATATGAACATGG ^^ 

SS A nn G ^ GGGCTAAACGTGGTTCCGCAACT ^ 

gtacgaaggacgcggattagtgatcggataaggaagctacaagagcttgtacctaacatg 
GACAAGCAAACCAACACTGCAGACATGTTAGAAGAAGCAGTAGMTACGTGAAAGTTCTT
CAAAGGCAGATCCAGGAGTTAACAGAAGAACAGAAGAGGTGCACATGCATACCTAAGGAA 
GAACAATAAGGTTTGCTCCTGATTTGTTTTATATTTGCTTAACGGCAATGATCTGATCGA 
AAAATTCGAAAGATGATCTTAGCTTGAATTTAGATGGATGTCATGTTGAAAAGTATATTA 
TTTGATAAATGGATGTAGGTGTAATATAAAATTTTTGTACAATAATGAAGAAAGTTAAAA 
AGAATTAATGAAAACATATATTCTTTATGATATAAAAAAAAAAA 

>G1134 Amino Acid Sequence (domain in AA coordinates: 198-247) 

MQPTSVGSSGGGDDGGGRGGGGGLSRSGLSRIRSAPATWLEALLEEDEEESLKPNLGLTD 

LLTGNSNDLPTSRGSFEFPIPVEQGLYQQGGFHRQNSTPADFLSGSDGFIQSFGIQANYD 

YLSGNIDVSPGSKRSREMEALFSSPEFTSQMKGEQSSGQVPTGVSSMSDMNMENLMEDSV 

AFRVRAKRGCATHPRSIAERVRRTRISDRIRKLQELVPNMDKQTNTADMLEEAVEYVKVL

QRQIQELTEEQKRCTCIPKEEQ* 

>G1008 (89..973)

GCCTTTTTGACTCTTCTTTCTCTCTTCTACTTTTTTTCAGGCTCTCTCTCTATATCTCTA 
TCTTCTTCTCCGGTTAACTAAAAGAGAAATGAAAAGCCGAGTGAGAAAATCCAAGTACAC 
GGTTCACCGGAAAATCACATCCACACCGTTCGACGGTTTCCCGAAGATTGTCAAAATCAT 
AGTCACTGACCCATGCGCTACTGATTCTTCCAGCGATGAGGAAAACGACAACAAATCTGT 
TGCTCCGAGGGTGAAACGTTATGTGGATGAGATCAGGTTCTGTGACGAAGATGACGAACC 
TAAACCGGCGAGGAAAGCGAAGAAAAAGTCCCCGGCGGCTGCGGCGGAGAACGGTGGAGA
TTTGGTAAAGTCTGTGGTGAAGTATAGAGGAGTGAGACAACGACCTTGGGGAAAATTTGC 
GGCGGAGATTCGTGATCCTTCGAGTCGTACTAGACTCTGGCTTGGGACTTTTGCGACGGC 
GGAGGAAGCTGCTATAGGTTACGATAGAGCCGCGATTCGAATCAAAGGTCATAACGCTCA 
GACGAATTTTCTCACTCCTCCTCCTAGTCCGACGACTGAGGTGTTACCGGAAACTCCGGT 
GATTGACCTTGAAACTGTCTCTGGTTGTGATTCGGCGAGGGAATCGCAT^ATCAGTCTGTG 
TTCTCCGACTTCTGTTCTCCGGTTTAGTCACAACGACGAAACAGAGTACAGAACAGAGCC 
AACGGAAGAACAAAATCCGTTTTTCTTGCCTGATTTGTTTCGCTCCGGAGATTATTTTTG 
GGATTCCGAAATTACCCCTGACCCTTTGTTTCTCGACGAATTCCACCAGTCCTTGTTACC 
AAACATCAACAACAACAACACAGTGTGTGATAAGGATACGAATCTGTCTGATAGTTTTCC 
GTTGGGAGTGATCGGAGATTTCAGCTCATGGGATGTTGATGAGTTTTTCCAAGATCATTT 
GTTGGATAAGTAATTTGATGAGTTCTTCCCCAGAATTTTTCTGGGTTTCTCTTTTTGGTT 
GTGTGAGTGAGATGAGTGGTTTGATGACAACGACGGGGATGAATCTTAGCCGTCCGTTTT 
CCATTTCGTGGACGGCTCCGATCAGCGGAAGAAGCGCAACGGAGTTTTTATTTATCTGTT 
TGAGAATTTTATAATTTAATTTGCGAGTAAATATAGTAATTAGTGTTAAGATTGTGAGAG 
TTTAAGTTAATTAGGGAGGGGTTTTGAATATTGGGGATTTTGGGAGGTTTTTGTTTGGTT 
TCTCTCCAAGTCTGTCACTATGCAAGGAAGCAGTATAAAGACCGTATATATATTTTATTA 
TTAATATTGATAAAAGTAAAAAAAAAAAAAAAAA 

>G1008 Amino Acid Sequence (domain in AA coordinates: 96-163) 

MKSRVRKSKYTVHRKITSTPFDGFPKIVKIIVTDPCATDSSSDEENDNKSVAPRVKRYVD

EIRFCDEDDEPKPARKAKKKSPAAAAENGGDLVKSVVKYRGVRQRPWGKFAAEIRDPSSR 

TRLWLGTFATAEEAAIGYDRAAIRIKGHNAQTNFLTPPPSPTTEVLPETPVIDLETVSGC 

DSARESQISLCSPTSVLRFSHNDETEYRTEPTEEQNPFFLPDLFRSGDYFWDSEITPDPL 

FLDEFHQSLLPNINNNNTVCTKDTNLSDSFPLGVIGDFSSWDVDEFFQDHLLDK*

>G1020 (132..689)

CTGTTCACAAGATU^GCTCCCCAAAAGGAGCGTTGCTTTACTCTCCTATAAAAAGAAGCTC 
TTCTACTTCTTCTCGTTACCACAAAACTCTTTCACCGATCTTCTCGTTCCATTCTTCTTC 
CTAATTACACCATGCCCAACATCACCATGGGTTTGAAACCCGACCCGGTTGCTCCAACGA 
ACCCGACTCATCATGAGAGTAATGCTGCCAAAGAGATTCGTTACAGAGGCGTTAGGAAAC
GTCCATGGGGAAGATACGCCGCTGAGATCCGAGATCCGGTTAAGAAAACTCGAGTCTGGC
TCGGTACGTTCGACACCGCTCAGCAGGCGGCGCGTGCTTACGACGCAGCCGCGCGTGACT
TTCGTGGTGTTAAGGCTAAGACCAATTTCGGTGTTATCGTTGGTAGTAGTCCTACTCAGA
GTAGCACCGTCGTCGACTCTCCCACGGCGGCACGGTTTATAACACCTCCGCACCTCGAGC 
TCAGCTTAGGCGGCGGCGGCGCGTGTCGTCGTAAGATCCCGCTTGTGCATCCGGTTTACT 
ACTATT^ACATGGCGACGTATCCAAAGATGACGACGTGTGGTGTCCAGAGCGAGTCTGAAA 
CGTCGTCGGTCGTTGATTTCGAAGGTGGAGCTGGGAAGATATCTCCGCCGTTAGATCTGG 
ATCTTAACTTAGCTCCTCCGGCGGAATAGGCCGTGAGTTTTTTTTTTCTTATGTCGTTTC 
TTTAGACAAAAAAAAATAACGTTTCCTTTTTTTTTCTGCCTAAGAAAAAAATATTATCCG 
TTTTTTAGAAGAAAAAAAAAAAAAAAAAAAAAA 
>G1020 Amino Acid Sequence (domain in AA coordinates: 28-95)

MPNITMGLKPDPVAPTNPTHHESNAAKEIRYRGVRKRPWGRYAAEIRDPVKKTRVWLGTF 
DTAQQAARAYDAAAJUSFRGVKAKTNFGVIVGSSPTQSSTVVDSPTAARFITPPH^LSLG 

>G1023 (252..1250)

TCGTCTTCTTAATCGCTTTCTGCTCTGTTTTTCTCGTTCATCAAGCTACATCTACTAGCT 

ctctcagtgattgatttctcacagtttcatcgatttccatgcgtttaagacctaSgS 

CTTGTTCTGGGGTAAAGGACTTTTCTTGTTCTTGAGAGAGTTCATTTTGAGGCTTTTCTG 
GGAATTTTGAGAGGTTTTTTAGGGTTTAAGGGGGTTTGGTITTGAM 

TGTTCGATAAAATGGCTGAACGAAAGAAACGCTCTTCTATTCAAACCAATA^CCCA^A 

aaaaacccatgaagaagaaaccttttcagctaaatcacctcccaggttta^gSS^ 

* GAAGACTATGAGAA ^ 

caagcgaagaagaagaaaggagtcagagaaggaaacgttatgtctgtgagatcgatcSc 

^^^^ 

catctccggtcgitggacgttcttctactactgtctcgaagcctgttggtgttaggcag^ 

ggaaatggggtaaatgggctgctgagattagacatccaatcaccaaagtaagaact 
tgggtacttacgagacgcttgaacaag^^ 

TTGATGCTCTGGCTGCAHrranTTr-rr!OTr.^a-T.r,^^, «"w«^iibAbl 



CTATGATCTCAGCCTCAGGGTCAAGCATTGATCTTGACAAGAAGCTAGTTGATTCGACTC 
TTGATCAACAAGCTGGTGAATCGAAGAAAGCGAGTTTTGATTTCGACTTTGCAGATCTAC 
AG ™ CTGAAftTGGGTTGCCTCATTGATGACTCA ^^ 

ATTTTCTCTTAACAGAAGAGAACAACAACCAAATGTTGGATGATTACTGTGGCATAGATG 
ATCTGGACATCATTGGTCTTR a jnvym a prjOTvon » •» „ _ __^ CA1AtATG 



atcatatcgcaacaactactcccactcctcttaatatcgcgtgcccataagtSgcagc 

TAGGTGTTATTATTAGCTATAGGAGCAACGTA^ 
natr^ A ™r ATAGCAGAGGCAGTOAATCTCAAGG 

g^gcagat^agttttgtgtgttggtgttactaaagaaagttttgttgacataatggS 
ttgatgttgtggac^gatagagaggtgtgatcgaaattgtaaatctcagStgct?™ 

TTGAAGGCAATTGTTTOT(^TTTAGGGTTTTTTTCTATATGAGGATTGT 

c ™ G f GCTTTCTAA ^ 

>G1023 Amino Acid Sequence (conserved domain in AA coorriinah^ i,» noc . 
MAERKKRSSIQTNKPNKXPMKKKPFQLHHLPGLSEDLKTMRKLRFVW 

vgrssttvskpvgvrqrkwgkwaaeirhpitkvrtwlgtyetleqaadayatkklSdaI 
aaatsaassvlsnesgsmisasgssidldkklvdstldqqagesSmfdfdf^dlqipe 

IDLGLIGTTIDKYAFvDHIATTTPTPLNIACP* 
>G1053 (38.. 538) 

GAAACTC ™ CATACTCA ™^ 

ATATTTCCAATATCTATCACCGGAATACAACGTAATAAACATGCCTTCATCTCCAA^S? 
^P^ AAACTACCTAAACGA ^ ATCATCAACAACAACAA ™ 

CATGGTACTCGACC^GAGGAAACAGAGAAGGATGCmCGAACAGAGAATCT^GGAG 
ra^ G ^ GAGGAAACAGAGACATCTTGATGAACT ^^^ 

A ^ GGAGAACTC ^^ • 

™ AAA IT 

>G1053 Amino Acid Sequence (domain in AA coordinates: 74-120)
vsetqncvlkensklkeeasdlrqlvcelksnknnnnsfprefedn* 
>G1137 (202..1248)

TACTTCAGACTTCTACTCAAACCAGTCACGTAGTTGGTTGGTGACATTTCGCTGCATTTT 

TCAATCTGTGATTGTTTTTCGTTCGTCTTTCTTTTACTATTTTCTCGAAAAGGACACAAG 

AAGTATTGCATTCACTCAGTTGAGCAACTTAACAATCGTGTTGTACTTTTTGAAGTTCCC 

TTGAGCTAAACTGCTAAGAGCATGCCTCTGGATAAGAGGCAACGGGATTTGCCTCTGGGC 

TTAAGTCCTCAAGCTTGCTTCAAGGATATAGTAGGTCGGTCTGTCCTTCCTAGAATTCCT 

CTCCCTGAGCTTGGGAAACTATATGCAGCTAAGCTTCAGGCTCGCTGTTTGCAGCCACCA 

CCATTCCAGTCTTTGCTGTGCAGTCATGATAAGGAGTCTTATGGAAAAAGATTCTCACGG 

TCTGACATGCGGTCTTGGTGCGCTGCTGCTACTACTACTACTACTCCACTTGGAGCATTA 

GAGTCTTCTCAGAAAAGACTTTTGATATTCGATCAGTCAGGAGACCAGACTCGTCTATTA 

CAATGTCCATTTCCTCTACGGTTTCCATCTCATGCGGCTGCAGAACCAGTGAAACTCTCT 

GAGTTACAAGGTATAGAGAAAGCTTTCAAAGAAGATGGTGAAGAGTTTCACAAGAGTGAT 

GGAACAGAGTCAGAAATGCATGAAGACACTGAGGAGATCAATGCATTGCTATATTCAGAT 

GATGATTATGATGATGATTGCGAGAGTGATGATGAAGTAATGAGCACTGGTCACTCTCCT 

TATCCAAATGAAGGAGTTTGCAACAAAAGGGAATTAGAAGAAATCGATGGTCCTTGTAAA 

AGGCAGAAACTACTGGATAAGGTCAACAACATCAGCGACTTATCATCACTTGTGGGCACT 

GAGAGCTCCACACAACTCAATGGATCTTCCTTTCTTAAGGACAAAAAGCTCCCTGAATCA 

AAAACCATATCGACCAAAGAGGACACTGGTTCTGGTCTGAGCAACGAGCAGTCGAAGAAA 

GACAAGATCCGCACAGCTCTGAAAATACTCGAGAGCGTAGTCCCTGGTGCAAAAGGAAAC 

GAAGCGCTCTTACTTCTGGACGAAGCAATTGATTACCTAAAGTTGCTGAAACGAGACTTA

ATCTCCACAGAGGTTAAGAACCAAAGCTCCACCACTCACAAGTCACCAATCTTGTTGCTT 

AAAGAGACAACATGGGG7^CAAG7y^ATCTGCAGACAGATAAGGCGTGAAAGATTCTGACG 

AGTTAAAACGTGTGAAGTGGGTTTTTGGGTACGTATCCTTGCACCAGCTTT 

>G1137 Amino Acid Sequence (domain in AA coordinates: 264-314)

MPLDKRQRDLPLGLSPQACFKDIVGRSVLPRIPLPELGKLYAAKLQARCLQPPPFQSLLC 

SHDKESYGKRFSRSDMRSWCAAATTTTTPLGALESSQKRLLIFDQSGDQTRLLQCPFPLR 

FPSHAAAEPVKLSELQGIEKAFKEDGEEFHKSDGTESEMHEDTEEINALLYSDDDYDDDC 

ESDDEVMSTGHSPYPNEGVCNKRELEEIDGPCKRQKLLDKVNNISDLSSLVGTESSTQLN 

GSSFLKDKKLPESKTISTKEDTGSGLSNEQSKKDKIRTALKILESVVPGAKGNEALLLLD

EAIDYLKLLKRDLISTEVKNQSSTTHKSPILLLKETTWGTRNLQTDKA* 

>G1181 (113..1012)

CTCGATCTTTTAACCCCCATTATTACATATTACTCCTTCCTACATTATTCTTCTTCTGCT 

TTCGTGACTTTCAGGGGACACTTTTGTTTTTATAACTTACGCTTAAAATCCTATGAATTC 

GCCGCCGGTTGACGCAATGATTACCGGAGAATCATCGTCACAAAGATCTATCCCAACGCC 

GTTTCTCACAAAAACGTTTAACCTCGTTGAAGATAGTTCCATCGACGATGTTATCTCATG 

GAACGAAGATGGTTCCTCTTTCATCGTATGGAATCCGACAGATTTCGCTAAAGATTTGCT 

TCCTAAACACTTCAAACACAACAATTTCTCTAGTTTCGTTCGTCAGCTCAACACTTACGG 

ATTCAAAAAAGTTGTACCGGATCGATGGGAGTTTTCAAACGATTTCTTTAAGAGAGGAGA 

AAAACGTCTTCTCCGTGAGATCCAACGTCGGAAAATAACAACGACGCATCAAACAGTTGT 

TGCTCCTTCGTCGGAACAACGAAACCAGACGATGGTTGTATCACCGTCAAATTCCGGGGA 

AGATAATAATAATAATCAGGTGATGTCTTCGTCTCCGTCGTCGTGGTATTGTCATCAAAC 

GAAGACGACTGGGAATGGTGGTTTATCAGTGGAGTTATTGGAAGAGAACGAGAAGCTTCG 

GAGTCAAAACATTCAGCTAAACCGTGAGCTTACTCAGATGAAATCTATCTGCGATAATAT 

CTATAGTCTCATGTCGAATTACGTCGGATCTCAGCCCACTGATCGGAGTTATTCTCCCGG 

AGGTAGTAGTAGTCAACCGATGGAGTTTTTACCGGCGAAGCGGTTTTCGGAGATGGAGAT 

TGAAGAAGAAGAAGAAGCGAGTCCGAGGTTGTTTGGTGTTCCGATTGGGTTAAAACGGAC 

GAGAAGTGAAGGTGTTCAGGTGAAGACGACGGCGGTGGTTGGGGAAAATTCCGATGAGGA

GACGCCGTGGTTGAGACATTATAATCGAACCAATCAGAGAGTTTGTAATTAAAAACGAAC

GGTTTAGATTTGTGGTGTAGATATGTGCGCGAAGTAGACGATTACAGCTTTTTAAGACAA 

GCAGAGCACGTGTCCeATCTGTTTCAAGAAGTTTCTGCAATCTTGACTTCTTCTTTTAAC 

ACTTTGTGTTTTTTATTATTTAATTAATAACAATAAATGTTCTTTTTCA 

TTCAAAAATAGTTCGGCTGTTTCTAGACTTTCCTTTTTT 

>G1181 Amino Acid Sequence (domain in AA coordinates: 24-114) 
MNSPPVDAMITGESSSQRSIPTPFLTKTFNLVEDSSIDDVISWNEDGSSFIVWNPTDFAK
DLLPKHFKHNNFSSFVRQLNTYGFKKVVPDRWEFSNDFFKRGEKRLLREIQRRKITTTHQ
TVVAPSSEQRNQTMVVSPSNSGEDNNNNQVMSSSPSSWYCHQTKTTGNGGLSVELLEENE
KLRSQNIQLNRELTQMKSICDNIYSLMSNYVGSQPTDRSYSPGGSSSQPMEFLPAKRFSE
MEIEEEEEASPRLFGVPIGLKRTRSEGVQVKTTAVVGENSDKETPWLRHYNRTNQRVCN*
>G1228 (63..1139)

GCATTTATAATTACTCACTCATCTTCTTTTCATTACATTACATACCAAACAAGAGCTCTC 

AAATGGAAAGGTTTCAAGGACACATCAACCCCTGTTTCTTCGATCGAAAACCGGATGTGA 
'CAAGGATTTRPfinannrTpn a Ar'Pmrrmrir.mmm^* ^ ^ — 




>G1277 (51..512)



ATTCTAAAGTCCTCCTCTCGGAAAGTAAGAGACTCAACTTCCGAGCCGCCATGGACGCCG
GAGTAGCAGTAAAAGCTGACGTGGCAGTCAAAATGAAGAGAGAAAGACCATTCAAAGGGA 
TCAGAATGAGAAAATGGGGGAAATGGGTTGCGGAGATTCGAGAACCCAACAAGCGTTCAA 
GACTTTGGCTCGGCTCTTACTCTACTCCCGAAGCGGCGGCGCGTGCATACGACACGGCTG 
TCTTTTACCTCAGAGGACCAACTGCTACGCTCAACTTCCCGGAGCTTCTGCCGTGTACCT 
CCGCCGAGGATATGTCAGCGGCAACGATCAGGAAAAAGGCGACGGAGGTGGGAGCTCAAG 
TAGATGCGATAGGGGCGACGGTGGTGCAGAACAACAAACGCCGCCGCGTTTTTAGTCAAA 
AGCGTGACTTTGGCGGCGGGTTATTAGAGCTTGTTGACTTGAACAAGTTACCTGACCCGG 
AAAATCTCGATGATGATTTGGTGGGAAAATAGACTGAAAAATAATAATAAAATATCTTAC 

TACGGAAAGATTCCTCTGTTTCGTCATTGTATTAAAATTTAATCCCACAAGTCAAACATA
CTGTACATTATTCTTAATTTAGTATTTTCTTATTAATATCTATCATTTGTTTGGTGAACA

CCAGAATATTAGACTATTAATGTAACGAGTTTTTAATATTTCGATCATAATAACACCAAG 

CTAGTTAAAGGTTAATATCTTGTTACGAAGTCTTGAGTAAGTTCAATTGTCATATATATG 

TAACGGAAGAGGTTCGTTCGGGTCCCAAGTGAAGTGGATCAAAGGTGACTTCACATAAAA 
AATAAAAAAAA 

>G1277 Amino Acid Sequence (domain in AA coordinates: 18-85)
mAGVAVKADVAVKMKRERPFKGIRMRKWGKWAEIREPNKRSRLWLGSYSTPEAAARAY 

^™ YLRG ™ MFPELLPCTSMDMS ^ TIR ^ TOVGA Q VD AIGATVVQ N NKRRRV 

FSQKRDFGGGLLELVDLNKLPDPENLDDDLVGK* 

>G1309 (53..859)

^^ CCTCCTMTOMGACGAC " GAGAGAG ^GAAAGATACGTGGAAGATGACCAA 
ATOTGGAGAGAGACCAAAACAGAGACAGAGGAAAGGGTTATGGTCACCTGAAGAAGACCA 
GAAGCTCAAGAGTTTCATCCTCTCTCGTGGCCATGCTTGCTGGACCACTGTTCCCATCCT 
AGCTGGATTGCAAAGGAATGGGAAAAGCTGCAGArrAAGGTGGATTAATTACCTAAGACC 
^^^ AAAGAGGGGGTCGmAGTGAAGAAGAAGAAGAGAC ^TCTTGACm^ 

™CCTTGGGTAACAAGTGGTCTCGGATTGCAAAATATTTACCGGGAAGAACAGACAACGA 
GATTAAGAACTATTGGCATTCCTATCTGAAGAAGAGATGGCTCAAATCTCAACCACAACT
CAAAAGCCAAATATCAGACCTCACAGAATCTCCTTCTTCACTACTTTCTTGCGGGAAAAG 
AAATCTGGAAACCGAAACCCTAGATCACGTGATCTCCTTCCAGAAATTTTCAGAGAATCC 
AACTTCATCACCATCCAAAGAAAGCAACAACAACATGATCATGAACAACAGTAATAACTT 
GCCTAAACTGTTCTTCTCTGAGTGGATCAGTTCTTCAAATCCACACATCGATTACTCCTC 
TGCTTTTACAGATTCCAAGCACATTAATGAAACTCAAGATCAAATCAATGAAGAGGAAGT 
GATGATGATCAATAACAACAACTACTCTTCACTTGAGGATGTCATGCTCCGTACAGATTT 
TTTGCAGCCTGATCATGAATATGCAAATTATTATTCTTCTGGAGATTTCTTCATCAACAG 
TGACCAAAATTATGTCTAAGAAGAGTGAATATGATCGTAAGAGGAACATAAGCTAGTTAC 
TTGTGTTACAGC 

>G1309 Amino Acid Sequence (domain in AA coordinates: 9-114)
MTKSGERPKQRQRKGLWSPEEDQKLKSFILSRGHACWTTVPILAGLQRNGKSCRLRWINY 

lrpglkrgsfseeeeetiltlhsslgnrasriakylpgrtdneiknywhsylkkrwlksq 
pqlksqisdltespssllscgkrnletetldhvisfqkfsenptsspskesnnnmimnns 
nnlpklffsewissswphidyssaftdskhinetqdqineeevmminn™yssledvmlr 
tdflqpdheyanyyssgdffinsdqnyv* 

>G1314 (1..990) 

ATGGGAAGAGCTCCGTGTTGCGACAAGACAAAAGTGAAGCGAGGGCCTTGGTCGCCTGAA 

GAAGACTCTAAACTTAGAGATTACATTGAAAAGTATGGTAATGGTGGAAATTGGATCTCT 

TTCCCCCTCAAAGCCGGTTTGAGGAGATGTGGGAAGAGTTGTAGACTGAGGTGGCTAAAC

TATTTGAGACCAAACATAAAGCATGGTGACTTCTCTGAGGAAGAAGACAGGATCATTTTT 

AGTCTCTTCGCTGCCATAGGAAGCAGGTGGTCAATAATAGCAGCTCATCTACCGGGACGA 

ACAGACAACGACATAAAAAACTATTGGAACACAAAGCTAAGGAAGAAACTCTTGTCTTCT 

TCCTCTGATTCATCATCATCAGCCATGGCTTCTCCTTATCTAAACCCTATTTCTCAGGAT 

GTGAAAAGACCAACCTCACCAACAACAATCCCATCTTCTTCTTACAATCCGTATGCTGAA 

AACCCTAATCAATACCCAACAAAATCCCTCATCTCCAGCATCAATGGCTTCGAAGCTGGT 

GACAAACAGATAATTTCCTATATTAACCCTAATTATCCTCAAGATCTCTATCTCTCGGAC 

AGCAACAACAACACCTCGAACGCAAATGGTTTCTTGCTCAACCACAATATGTGTGATCAG 

TACAAGAACCACACCAGTTTTTCTTCAGACGTCAATGGGATAAGATCAGAGATTATGATG

AAGCAAGAAGAGATAATGATGATGATGATGATAGACCACCACATTGACCAGAGGACAAAA 

GGGTACAATGGGGAATTCACACAAGGGTATTATAATTACTACAATGGGCATGGGGATTTG 

AAGCAAATGATTAGTGGAACAGGCACTAATTCTAACATAAACATGGGTGGTTCAGGTTCA

TCTTCTAGTTCGATAAGCAACCTAGCTGAGAACAAAAGCAGTGGTAGCCTCCTACTAGAA 

TACAAATGCTTGCCCTATTTCTACTCCTAG 

>G1314 Amino Acid Sequence (domain in AA coordinates: 14-116) 

MGRAPCCDKTKVKRGPWSPEEDSKLRIDYIEKYGNGGNWISFPLKAGLRRCGKSCRLRWLN 

YLRPNIKHGDFSEEEDRIIFSLFAAIGSRWSIIAAHLPGRTDNDIKNYWNTKLRKIOiLSS 

SSDSSSSAMASPYLNPISQDVKRPTSPTTIPSSSYNPYAENPNQYPTKSLISSINGFEAG 

DKQIISYINPNYPQDLYLSDSNIWTSNAN^ 

KQEE I^MMMIDHHIDQRTKGYNGEFTQGYYNYYNGHGDLKQMI SGTGTNSNINMGGSGS 

SSSSISNLAENKSSGSLLLEYKCLPYFYS* 

>G1317 (1..849) 

ATGGGAAGATCACCTTGTTGTGATAAAAATGGAGTGAAGAAGGGACCATGGACTGCTGAG
GAGGATCAGAAACTCATCGATTATATTCGATTTCATGGTCCTGGCAATTGGCGTACGCTC 
CCCAAAAATGCTGGACTCCATAGATGTGGAAAAAGCTGCCGTC^ 

CTAAGACCGGACATCAAGAGAGGAAGATTCTCGTTCGAGGAAGAAGAAACTATCATTCAG 

CTACACAGTGTTATGGGAAACAAGTGGTCAGCAATAGCCGCTCGTCTACCAGGGAGGACC 

GATAACGAAATAAAAAACCATTGGAACACTCACATCCGCAAGAGACTTGTAAGGAGTGGT 

ATCGACCCTGTTACTCATTCTCCACGCCTTGATCTTCTTGATTTGTCCTCACTTTTGAGT 

GCACTTTTCAACCAGCCAAACTTTTCAGCAGTTGCAACACATGCGT^ 

CCTGATGTATTGAGGTTGGCCTCTCTACTACTGCCACTTCAAAACCCTAATCCAGTTTAC

CCATCGAACCTCGACCAAAATCTTCAAACTCCAAATACATCATCAGAATCGTCTCAACCA 

CAAGCTGAGACTAGTACAGTCCCAACAAACTATGAAACTTCATCATTGGAGCCTATGAAC

GCAAGACTCGACGACGTTGGTCTTGCAGATGTATTACCACCTTTGTCAGAGAGTTTTGAC 

TTAGACTCGCTCATGTCAACGCCAATGTCTTCTCCACGACAAAATAGCATTGAAGCAGAA 

ACCAACTCCAGCACTTTCTTCGACTTTGGAATTCCGGAAGATTTCATCTTAGATGACTTT 

ATGTTTTAA 
^ssssssssssssss sssssss

PSNLDQNLQTPNTSSESSQPQAETSTVPTNYETSSLEPMNARLDDVGLADVLPPLSESFD
LDSLMSTPMSSPRQNSIEAETNSSTPFDFGIPEDFILDDFMF*
>G1323 (49..870)

MAGCACCATGTTGTGACAAAACCAAAGTGAAGAGAGGACCATGGAGCCATGA^GAC 

wgaaactcatctctttcattcacaagaatggtcatgagaa^ggaS 

CAAGCTGGATTGTTGAGGTGTGGCAAGAGTTGTCGTCTGCGATGGATTAATTACCTCAGA 
CCTGATGTGAAACGTGGCAATTTCAGTGCAGAGGAAGAAGaJS 

cagagctttggtaacaagtggtcgaagattgcttctaagctgcctggaagaSg^St 

gaga ™ gaatg ^^^ 

aatgccgatgaagcgggttcaaaaggttctttgaatoaagaagagaaotctcS 

CAGATAAGTCAAATGTWGAGCACATTCTAAC^^ 

gaggtagacaaaccagagctgctggagatgccttttgatttagatcSga^^^ 
^catagatggttcagactcattccaacaaccagagaacagagctottSSag^S 

GAA ^ GAAGCTGATA ^^ 

GATAACCAACAACAACAACAGCATAAACAGGGAACAGAAGATGAACATTCATCA^ 

GAAAATGGAATTATTAGCTAACTTATTGGCATTATTAGTATATAAGCAAGATaGATA^ 
CGGA r TAGTAGCAACAACGAAGAAACGT ^AATTGTAGAC^ 

GTTGAA^GATTGTATTTTGCAAATGATTGCTTTGTAGTGAAATCAAGTOATC^ 
>G1323 Amino Acid Sequence (domain in AA coordinates- is nti 

^LRPDVKRGNFSAEEEDTIIKLHQSFGNKWSKIASKLPGRTDNEI^SSSs 
>G1332 (1..606) 

gaagatatgatattaaaaagctatgttgagactcatggtgaaggaaactgggSga^ 

CAAATATCAAAAGAGGAAGCAT6TCACC ACAAGAAC 

gS^" CTTGGAARCAGATGGTCGTOGATC 

cagaatgcacctgaatcaatcgtcggcgccactcctttcacggataagccSSS 



c 



3TTCCCAC 



acagaactgagaagaagccatggagaaggaggagaagaggagagcaat^ atggag 

GAGACCAACCACTTTGGCTATGACGTCCACGTAGGATCTCCCTTGCGACTl^ GATC =GAG 

oSS GSMSPQEQDL "^ 
>G1334 (76..885)

atagctcccaactaataggaatctcaagcttctcactctctcttgtttttccattggact 
^^cataagctatgc^ctgaggagctto^^ 

atgta cacaaagcatcctcatgttgaacaat^^^^ 
CAGAGGTCTTCGGGCCGAGTAATGATTCCACTGAAGATGGAGACAGAAGAAGATGGTACC
ATCTATGTGAACTCAAAGCAGTACCATGGAATTATCAGGCGACGCCAGTCCCGAGCAAAG 
GCTGAAAAACTGAGTAGATGCCGTAAGCCATATATGCATCACTCACGCCATCTCCATGCT 
ATGCGCCGTCCTAGAGGATCTGGCGGGCGTTTCTTGAACACCAAGACAGCTGATGCGGCT 
AAGCAGTCTAAGCCGAGTAATTCTCAGAGTTCTGAAGTCTTTCATCCGGAAAATGAGACC 
ATAAACTCATCGAGGGAAGCAAATGAGTCAAATCTCTCGGATTCTGCAGTTACAAGTATG 
GATTACTTTCTAAGTTCGTCGGCTTATTCTCCTGGTGGCATGGTCATGCCTATCAAGTGG 
AATGCAGCAGCAATGGATATTGGCTGCTGCAAACTTAATATATGATCAGCAGATAGGGGA 
CAAGACATGATTGGTCACCAGTCCTTTTGTCTTGTCCCTTATCTTTCAGCCAAACGGAAA 
GAGAACTTGTGTCTTGGAAAAAAGACATTGAGTTTCCTTGGTTTATAAGATTGGTCCTTT 
TACCATCCGTTTGGCTGTAAACAGGCAAATCATCTTTGGCTCATGCTTCATCAAGTTCTT 
ATCTTCGTCTGTTTTCTTCTACGCATCTTCATAAGATCTCTGAACTAGTGAATAACATTT 
CCTAGCATCATGTTTCAACTAGTGTGTGTTGTT^AGAAACTCTGCCTTATTTCCAGATGAT 
GTATTGTGTGTAACGTGTTTATGAAACAAACGTAAGACTTTCAAGTTAAAAAAAAAAAAA
AAAAAAAAAAAAAA 

>G1334 Amino Acid Sequence (domain in AA coordinates: 18-190) 

MQTEELLSPPQTPWWNAFGSQPLTTESLSGEASDSFTGVKAVTTEAEQGWDKQTSTTLF 

TFSPGGEKSSRDVPKPHVAFAMQSACFEFGFAQPMMYTKHPHVEQYYGWSAYGSQRSSG 

RVMIPLKMETEEDGTIYVNSKQYHGIIRRRQSRAKAEKLSRCRKPYMHHSRHLHAMRRPR

GSGGRFLNTKTADAAKQSKPSNSQSSEVFHPENETINSSREANESNLSDSAVTSMDYFLS 

SSAYSPGGMVMPIKWNAAAMDIGCCKLNI*

>G1381 (32..802)

CAGCTTTAACACTACTCTCTCTCTCTCTCAAATGGGAAAACAAATCAACATAGAGAGTAG

TGCTACTCATCATCAAGACAATATTGTTTCCGTTATAACAGCCACGATATCCTCCTCCTC 

CGTCGTAACGTCTTCGTCAGACTCTTGGTCTACCTCCAAAAGATCGTTAGTGCAAGACAA 

TGACTCCGGAGGGAAACGGCGGAAGAGCAACGTTAGTGATGATAACAAGAATCCGACGTC 

GTATAGAGGAGTGAGGATGAGGAGTTGGGGAAAATGGGTGTCGGAGATTAGAGAGCCGAG 

GAAGAAATCAAGAATATGGCTTGGCACTTATCCAACGGCAGAGATGGCAGCTCGTGCTCA 

TGATGTGGCGGCTTTAGCTATTAAAGGCAACTCCGGTTTTCTTAATTTCCCTGAATTATC 

CGGTTTGCTTCCTCGTCCGGTTAGCTGCTCTCCTAAGGATATACAAGCTGCAGCTACCAA 

AGCCGCCGAAGCAACCACGTGGCACAAACCGGTTATCGATAAGAAATTAGCTGATGAGCT 

AAGCCACTCTGAGTTGTTGTCTACCGCTCAGTCTTCGACTTCTAGTAGTTTCGTGTTTTC

TTCGGACACGTCGGAGACTTCTAGTACGGACAAGGAAAGCAACGAAGAGACGGTGTTTGA 

TTTGCCGGACCTTTTCACGGACGGGCTTATGAACCCAAACGATGCGTTTTGTTTATGCAA 

CGGCACCTTTACGTGGCAGCTTTACGGAGAGGAGGATGTAGGGTTCAGGTTTGAAGAGCC 

GTTTAATTGGCAAAATGACTAAACCGCCCTCCACTTGCTTACTGTAATTACTAACATATA 

ATTTTCTTGATAAAGAACATATATTTCCATTACGGTATTAACTAATCTTTTCTATCCTTT 

TCTCTTTTCTTGTTTCTACATCTGAGTATATTGTCACTATGTGAAAAAATTGATCTCGTT 

ITGAATATTTACTTTTCAAAATTGAAGTAACGCAAGTGATTGATAAT^AAAAAAAAAA 

>G1381 Amino Acid Sequence (domain in AA coordinates: TBD) 

MGKQINIESSATHHQDNIVSVITATISSSSVVTSSSDSWSTSKRSLVQDNDSGGKRRKSN

VSDDNKNPTSYRGVRMRSWGKWVSEIREPRKKSRIWLGTYPTAEMAARAHDVAALAIKGN

SGFLNFPELSGLLPRPVSCSPKDIQAAATKAAEATTWHKPVIDKKLADELSHSELLSTAQ 
SSTSSSFVFSSDTSETSSTDKESNEETVFDLPDLFTDGLMNPNDAFCLCNGTFTWQLYGE 
EDVGFRFEEPFNWQND* 
>G1382 (90..1763)

CTCTCATTTCGCCATAGCTGAGAGCTTCTTCTACTTTCCCTTAGCTTCTTTTTTCCTTCA 
TTTTTGTTCTACCCTTGCGAATCTCTGAAATGAACCCTCAAGCTAATGACCGGAAGGAGT 
TTCAGGGAGATTGTTCGGCGACGGGAGATCTCACGGCAAAGCACGATTCAGCTGGAGGAA 
ACGGAGGTGGAGGTGCTAGGTATAAGCTGATGTCACCGGCCAAGCTTCCGATCTCGAGGT 
CGACTGATATCACGATTCCTCCTGGGTTGAGTCCGACTTCGTTTTTGGAATCTCCTGTTT 
TCATCTCCAACATCAAGCCAGAACCTTCCCCTACTACTGGTTCTTTGTTCAAGCCTCGAC 
CAGTGCACATTTCTGCTAGCTCAAGT^ 

TTACTGAGCAGAAGTCCAGTGAATTTGAGTTCAGACCTCCTGCATCAAATATGGTATATG 
CAGAGCTTGGCAAGATTAGAAGTGAGCCACCAGTACATTTTCAAGGCCAGGGCCATGGAT 
CCTCACACTCACCTTCTTCGATCAGTGATGCTGCAGGTTCCTCAAGTGAGCTAAGCCGGC 
CAACTCCTCCTTGTCAGATGACACCAACGAGCTCAGATATTCCGGCTGGATCTGATCAAG 
MTCCAGACOTCCCAAAATGACTC ^GAGGAAGCACTCCATCCATCTTGGCTG
ATGATGGTTATAACTGGAGAAAATATGGTCAA^^ 

GGAGCTATTATAAATGTACACATCCTAATTGTGAAGTGAAAAAGTTATTTGAAAGATCTC 
ATGATGGGCAGATCACCGATATTATATACAAGGGTACACATGACCATCCTAA^CCTCAAC 
CTGGTCGCCGAAACTGTGGTGGTATGGCTGCACAAGAAGAAAGGCTAGACAAGTATCCT? 
CTTCAACTGGCCGAGATGAGAAGGGATCTGGCGTCTACAACTTGTCTAACrcCAATGAAC 

aaactggtaaccctgaagtacctcctatctcagcatctgacgatggtggaS^ 

tggagggtgcgatggaaataactccactagtgaaaccc^tccgggagcgtcgggSSg 
ttcaaactctgagtgaggitgagattctggatcatggttatagatggcg^?^^ 

gcccagtgagaaaacacgtggagagagcatcacatgatccaaaaggtgtaataacaacat 
acgaaggcaaagacgatcatgatgttcccacttcaaagtctagcagcaatcacgaaatcc 

ca^ctgatggacctaaccacgcttccaacgaacatcagcacc^gaatcaacaacttctS 
accaaactcacccaaatggagtcaatttcaggtttgttcatgctagtcccatctcatcct 
actatgctagcttaaatagcggtatgaatcagtacggccagagagaaaSS 
ctcaaaatggtgacatctcgtccttgaacaattcatcttacccatatccgtccaacatcg 

GGAGA f ACAATCGGGTCCGT ^CAAAAAGTAAGCAACATTA TC ^^^ 

AGGTTAGGAATGGGACGAGGCCTTGTTCTATATAATTGCTATTTCTTCACAGAGAGCT^ 

TCTTGATTCAAACTATCTCCACCATATATATTTGTTTGTGTCACCTCTATTG^GTTC^A^ 

AAATGTTATGTAAAAATACACAACAAGATGTTAATGCTITTATTTAAACAA^S^^ 
ATATTACTACAAAAAAAAAAAAAAAAAA i i ARAUAAGAAACAGCA 

>G1382 Amino Acid Sequence (domain in AA coordinates- 2l0-?fi* ™* 

WPQAJTORKEPQGDCSATGDLTAJCHDS AGGNGGGG ARYKLMS PAKLPI SRSTD I TI PPGL ' ' 

sptsflespvpisnikpepspttgslfkprpvhisassssytgrgfhqntfSqSseS 

SSDIPAGSDQEESIQTSQ^SRGSTPSILADDGYNWRKYGQKHVKGSEFPRSYYKCTOPN 

gvynlsnpneqtgnpbvppisasddggeaaasnr^kdepddddpfskSrmegStpl 
^ireprvwqtlsevdilddgyrwrkygqkvvrgnpnprsyJSg^p^SSS 

SHDPKAVITTYEGKHDHJ5VPTSKSSSNHEIQPRFRPDETDTISLNLGWISSTOiro^^ 
>G1435 (8..904)

^^ CATGGGGAAGGAAG " atggtgagcga ™cggtgacgaggacggagaagacgc 

GG f GGGCGGCGATGAATATAGGA ^CCGGAATGGGAAATTGGTT^ 

TTOACTCCGTTATOTMTATCTAGTCOTTCGATTCTCGCOTAGCOT 

CCCAGAACGAAGCCGTACAATTCACGACGTCAATCGCGCGTCGCAAATCAcSS^ 

gagtgttccaggatcagatccgaagaaacagaagaaatcggatggtggS 

GGTGGAGGATTCCACGGCGGAGGAAGGAGACTCCGGGCCTGAAGACGCGTCTGGGAArar 
™ C r ACTCGTOAGAACGOTGCGTCTC ™ GCAGA ^TATA TO 

gattcaaggattgacgacggaagaagatccttattcgtcotcggmcagSc^^c 

AACGCCGGTTCCTCCACAGAGCTTTC^VAGACGGCGGAGGAAGTAACGGAAAOTTGGGGGT 
TCOSGTTCCXKmCH!^ 

agg atattatcaacagtatagtaagcat^^ 

rr™™ Cra ^ GGTCTATAAT ™ CTCTACAGAGA ^ 

ggtttattttcccacttc^tgaggttgttgtgacttttaattctccatgtt^ 
^cm^cctttgtatagaaaa^^ 
>G1435 Amino Acid Sequence (domain in AA coordinates: 146-194)

MGKEVMVSDYGDDDGEDAGGGDEYRIPEWEIGLPNGDDLTPLSQYLVPSILALAFSMIPE

RSRTIHDVWRASQITLSSLRSSTNASSVMEEWDRVESSVPGSDPKKQKKSDGGEAAAVE 

DSTAEEGDSGPEDASGKTSKRPRLVWTPQLHKRFVDWAHLGIKNAVPKTIMQLMNVEGL 

TRENVASHLQKYRLYLKRIQGLTTEEDPYSSSDQLFSSTPVPPQSFQDGGGSNGKLGVPV

PVPSMVPIPGYGNQMGMQGYYQQYSNHGNESNQYMMQQNKFGTMVTYPSVGGGDVNDK* 

>G1537 (1..783) 

ATGGAAAACGAAGTAAACGCAGGAACAGCAAGCAGTTCAAGATGGAACCCAACGAAAGAT 
CAGATCACGCTACTGGAAAATCTTTACAAGGAAGGAATACGAACTCCGAGCGCCGATCAG
ATTCAGCAGATCACCGGTAGGCTTCGTGCGTACGGCCATATCGAAGGTAAAAACGTCTTT 
TACTGGTTCCAGAACCATAAGGCTAGGCAACGCCAAAAGCAGAAACAGGAGCGCATGGCT 
TACTTCAATCGCCTCCTCCACAAAACCTCCCGTTTCTTCTACCCCCCTCCTTGCTCAAAC 
GTGGGTTGTGTCAGTCCGTACTATTTACAGCAAGCAAGTGATCATCATATGAATCAACAT 
GGAAGTGTATACACAAACGATCTTCTTCACAGAAACAATGTGATGATTCCAAGTGGTGGC 
TACGAGAAACGGACAGTCACACAACATCAGAAACAACTTTCAGACATAAGAACAACAGCA 
GCCACAAGAATGCCAATTTCTCCGAGTTCACTCAGATTTGACAGATTTGCCCTCCGTGAT 
AACTGTTATGCCGGTGAGGACATTAACGTCAATTCCAGTGGACGGAAAACACTCCCTCTT 
TTTCCTCTTCAGCCTTTGAATGCAAGTAATGCTGATGGTATGGGAAGTTCCAGTTTTGCC 
CTTGGTAGTGATTCTCCGGTGGATTGTTCTAGCGATGGAGCCGGCCGAGAGCAGCCGTTT 
ATTGATTTCTTTTCTGGTGGTTCTACTTCTACTCGTTTCGATAGTAATGGTAATGGGTTG 

TAA 

>G1537 Amino Acid Sequence (domain in AA coordinates: 14-74) 

MENEVNAGTASSSRWNPTKDQITLLENLYKEGIRTPSADQIQQITGRLRAYGHIEGKNVF

YWFQNHKARQRQKQKQERMAYFNRLLHKTSRFFYPPPCSNVGCVSPYYLQQASDHHMNQH

GSVYTNDLLHRNNVMIPSGGYEKRTVTQHQKQLSDIRTTAATRMPISPSSLRFDRFALRD

NCYAGEDINVNSSGRKTLPLFPLQPLNASNADGMGSSSFALGSDSPVDCSSDGAGREQPF

IDFFSGGSTSTRFDSNGNGL* 

>G1545 (67..729)

CATCACCAATCTTTTGAATCTAAGAGAGAGAAGAAGAAGAAGGTCTAGAGAACGAAAAGA 

AGAAACATGAATAACCAGAATGTAGATGATCATAATCTTCTACTCATTTCTCAATTGTAC 

CCTAATGTCTATACTCCATTAGTACCACAACAAGGAGGAGAAGCAAAACCAACACGGCGG 

AGGAAAAGGAAGAGCAAGAGTGTTGTGGTGGCAGAGGAGGGTGAAAACGAAGGCAATGGG 

TGGTTTAGAAAGAGAAAATTGAGTGATGAGCAAGTAAGAATGTTGGAGATTAGCTTTGAA 

GACGATCATAAGCTTGAATCCGAGAGGAAAGATCGGCTTGCTTCTGAGTTAGGGCTTGAT 

CCTCGTCAAGTCGCCGTCTGGTTCCAAAACCGCCGTGCACGGTGGAAGAACAAACGAGTC 

GAGGATGAATACACTAAACTCAAGAATGCATACGAAACCACCGTCGTTGAGAAATGTCGT 

CTTGATTCTGAGGTTATTCACCTAAAGGAACAACTTTACGAGGCTGAAAGAGAGATCCAA 

CGGCTTGCAAAAAGAGTTGAAGGAACTTTAAGTAACAGTCCTATCTCATCCTCTGTGACC 

ATTGAAGCCAATCATACGACACCGTTTTTTGGAGATTACGACATCGGATTTGACGGTGAG 

GCTGACGAGAACTTGCTCTACTCGCCAGATTACATTGATGGATTAGACTGGATGAGCCAA 

TTTATGTAAAAAACTATAAGCTAATCTATTTTCAGTCGTAGTATAG 

>G1545 Amino Acid Sequence (domain in AA coordinates: 54-117) 

MNNQNVDDHNLLLISQLYPNVYTPLVPQQGGEAKPTRRRKRKSKSVVVAEEGENEGNGWF

RKRKLSDEQVRMLEISFEDDHKLESERKDRLASELGLDPRQVAVWFQNRRARWKNKRVED
EYTKLKNAYETTVVEKCRLDSEVIHLKEQLYEAEREIQRLAKRVEGTLSNSPISSSVTIE
ANHTTPFFGDYDIGFDGEADENLLYSPDYIDGLDWMSQFM*
>G1641 (1..867) 

ATGGAGGTTATGAGACCGTCGACGTCACACGTGTCAGGTGGGAACTGGCTCATGGAGGAA 
ACTAAGAGCGGCGTCGCAGCTTCTGGTGAAGGTGCCACGTGGACGGCGGCAGAGAACAAG 
GCATTCGAGAATGCTTTGGCGGTTTACGACGACAACACTCCTGATCGGTGGCAGAAGGTG 
GCTGCGGTGATTCCGGGGAAGACAGTGAGTGACGTAATTAGACAGTATAACGATTTGGAA 
GCTGATGTCAGCAGCATCGAGGCCGGTTTAATCCCGGTCCCCGGTTACATCACCTCGCCG 
CCTTTCACTCTAGATTGGGCCGGCGGCGGTGGCGGATGTAACGGGTTTAAACCGGGTCAT 
CAGGTTTGTAATAAACGGTCGCAGGCCGGTAGATCGCCGGAGCTGGAGCGGAAGAAAGGC 
GTTCCTTGGACGGAGGAAGAACACAAGCTATTTCTAATGGGTTTGAAGAAATATGGGAAA 
GGAGATTGGAGAAACATATCTCGGAACTTTGTGATAACGCGAACGCCAACACAAGTAGCT 
AGCCACGCCCAAAAGTACTTCATCCGGCAACTTTCCGGCGGCAAGGACAAGAGACGAGCA 
AGCATTCACGACATAACCACCGTAAATCTCGAAGAGGAGGCTTCTTTGGAGACCAATAAG

AGCTCCATTGTTGTTGGAGATCAGCGTTCAAGGCTAACCGCGTTTCCTTGGAACCAAACG 

GACAACAATGGAACACAGGCAGACGCTTTCAATATAACGATTGGAAACGCTATTAGTGGC 

GTTCATTCATACGGCCAGGTTATGATTGGAGGGTATAACAATGCAGATTCTTGCTATGAC 
GCCCAAAACACAATGTTTCAACTATAG 

>G1641 Amino Acid Sequence (domain in AA coordinates: 139-200) 
MEVMRPSTSHVSGGNWLMEETKSGVAASGEGATWTAAENKAFENALAVYDDNTPDRWQKV
AAVIPGKTVSDVIRQYNDLEADVSSIEAGLIPVPGYITSPPFTLDWAGGGGGCNGFKPGH
QVCNKRSQAGRSPELERKKGVPWTEEEHKLFLMGLKKYGKGDWRNISRNFVITRTPTQVA
SHAQKYFIRQLSGGKDKRRASIHDITTVNLEEEASLETNKSSIVVGDQRSRLTAFPWNQT

DNNGTQADAFNITIGNAISGVHSYGQVMIGGYNNADSCYDAQNTMFQL*
>G165 (19..699)

CTTCAAAACATCTAAAAAATGGTGAAAAAAACTCTTGGTCGTAGAAAGGTAGAGATAGTG 

AAAATGACTAAGGAATCAAACCTTCAAGTCACATTTTCCAAGAGAAAAGCTGGTCTTTTT 

AAGAAGGCTAGTGAATTTTGCACATTATGTGATGCAAAAATTGCGATGATCGTGTTTTCA 

CCAGCTGGAAAAGTATTTTCTTTTGGTCATCCAAATGTTGATGTTCTGCTTGACCACTTT 

CGAGGGTGTGTTGTAGGACACAACAACACAAACCTTGATGAAAGCTACACAAAGCTTCAT 

GTTCAAATGCTCAACAAATCCTACACTGAGGTGAAGGCGGAAGTAGAAAAAGAACAAAAG 

AATAAGCAGTCGCGGGCTCAAAATGAAAGAGAAAACGAAAACGCTGAGGAGTGGTGGAGT 

AAGTCTCCATTAGAACTCAACTTAAGTCAATCAACCTGTATGATACGTGTTCTTAAAGAT 

TTGAAGAAGATAGTTGATGAAAAAGCAATTCAATTAATCCATCAAACAAACCCAAACTTC

TATGTTGGAAGTTCTAGCAATGCTGCTGCTCCAGCAACTGTTAGTGGTGGTAATATCTCC 

ACAAACCAGGGGTTCTTTGATCAAAACGGAATGACGACTAATCCTACTCAAACACTTCTG 

TTTGGATTTGATATTATGAATCGCACACCAGGAGTTTAAATAAGTCTATCCTCATTATGG 

GTCTTGGTACTATAAGTTCATCTCTCTCGTTGTTGACTTTTTAAGTCTCCAATAGTTTGT 
TGTG 

>G165 Amino Acid Sequence (conserved domain in AA coordinates: 7-62)

MVKKTLGRRKVEIVKMTKESNLQVTFSKRKAGLFKKASEFCTLCDAKIAMIVFSPAGKVF

SFGHPNVDVLLDHFRGCVVGHNNTNLDESYTKLHVQMLNKSYTEVKAEVEKEQKNKQSRA

QNERENENAEEWWSKSPLELNLSQSTCMIRVLKDLKKIVDEKAIQLIHQTNPNFYVGSSS

NAAAPATVSGGNISTNQGFFDQNGMTTNPTQTLLFGFDIMNRTPGV*

>G1652 (77..1078)

AGCAAGTCCAAATCTCCCTCTCTCTCTCTCTATCTATCTCTCTATAGAAGATTTTTTAAC 
TAAGAAGCTAGCGATCATGGCCACAGCGATGAACGTTTTCTCTACCAAATGGTCCTCCGA 
ATTGGATATAGAAGAATATAGTATCATCCACCAATTCCACATGAACTCACTCGTCGGAGA 
TGTTCCACAGTCTCTCTCATCTCTTGATGATACCACCACTTGTTATAACCTTGATGCTTC 
TTGTAATAAAAGTTTGGTAGAAGAAAGACCTTCAAAGATCCTCAAGACCACTCACATATC 
ACCAAACTTACATCCTTTTTCTTCTTCTAATCCTCCTCCTCCAAAGCACCAGCCCTCTTC 
TAGGATTCmCTTTTGAAAAGACAGGTTTACATGTTATGAATC^ 

AATATTTAGCCCCAAGGACGAAGAAATTGGATTACCAGAGCATAAGAAAGCCGAGCTGAT 
AATAAGAGGGACAAAGAGAGCTCAATCCTTGACTCGAAGCCAATCAAATGCTCAAGATCA 
C71TACTGGCAGAGAGAAAACGGAGAGAGAAGCTTACTCAAAGATTTGTAGCTCTTTCCGC 
GCTAATTCCTGGCCTAAAGAAGATGGACAAGGCTTCTGTGTTGGGAGATGCAATAAAGCA 
TATAAAGTACCTCCAAGAGAGTGTGAAAGAGTATGAGGAACAAAAGAAGGAAAAGACAAT 
GGAATCAGTGGTTCTTGTAAAGAAGTCTAGTCTC 

ATCATCATCTTCCTCAGATGGAAATCGCAATAGCTCGAGCTCAAATCTTCCAGAAATAGA 
AGTTAGGGTTTCAGGAAAAGATGTTCTTATTAAGATCCTATGCGAGAAGCAAAAGGGTAA 
TGTGATCAAGATTA^GGGGGAGATTGAAAAGCTTGGTTTGTCTATCACCAACAGCAATGT 
CTTGCCOTTGGACCCACTTTTGACATCTCTATTATCGCTCAGAAGAATA^ 
TATGAAAATCGAGGATGTTGTGAAGAACTTGAGTTTTGGCTTATCAAAGCTCACTTAATT

GGTTTCACGTTACATACATATACACATTCATCATCGATTTCTCCGATCGAAGAATCCAAA 

ATCAGTTTTTCCATGAAAGTGGTTTTTTAGTTGTTAAGTOT^ 

GTCATTTAAAGATCCTTGTTCTTGTC 

TGTTTAGTAATTATTTCTCTCCAGTTTCATTTGGGACGGAATTTTTO 

ATATATATTTCCTGC^ATGTAAAGCATTTCGTTAGTTTAATAAACGTCCGATATGTTTCT 
TTGAAAA 

>G1652 Amino Acid Sequence (domain in AA coordinates: 143-215)
MATAMNVFSTKWSSELDIEEYSIIHQFHMNSLVGDVPQSLSSLDDTTTCYNLDASCNKSL

VEERPSKILKTTHISPNLHPFSSSNPPPPKHQPSSRILSFEKTGLHVMNHNSPNLIFSPK

DEEIGLPEHKKAELIIRGTKRAQSLTRSQSNAQDHILAERKRREKLTQRFVALSALIPGL 

KKMDKASVLGDAIKHIKYLQESVKEYEEQKKEKTMESVVLVKKSSLVLDENHQPSSSSSS

DGNRNSSSSNLPEIEVRVSGKDVLIKILCEKQKGNVIKIMGEIEKLGLSITNSNVLPFGP 

TFDISIIAQKNNNFDMKIEDVVKNLSFGLSKLT*

>G1655 (132..755)

TTTCTAACTAGTCACATTGAGAGAGAGAGAGAGAGAAAGAGAGACTCTCAGAATCTGAAG 
AAGAAGAAGAGATTGTTGTTTTTGCCTTTTATCATCGGTTTCTTTGAATCTCTGGTTTTA 
AATCGGATTTAATGGTGGAGTCTCTGTTCCCGAGCATCGAAAACACAGGTGAATCGTCTC 
GAAGAAAGAAGCCGAGGATATCAGAGACGGCGGAGGCGGAGATAGAGGCACGACGTGTCA 
ACGAAGAAAGCTTGAAGAGATGGAAAACGAATCGTGTGCAACAGATCTACGCTTGTAAGC 
TCGTCGAAGCTTTACGCCGAGTTCGTCAGAGATCTTCCACCACCAGCAACAACGAGACCG 
ATAAACTCGTCTCCGGCGCGGCGAGGGAGATACGTGATACGGCGGATCGAGTTCTAGCTG 
CGTCCGCTCGTGGTACGACTCGGTGGAGCAGAGCGATTTTAGCGAGTCGCGTCCGAGCGA 
AGCTGAAGAAACATAGAAAGGCGAAAAAGTCAACGGGAAATTGTAAATCGAGAAAAGGTC
TCACGGAGACGAATCGGATTAAGTTACCGGCGGTTGAGAGAAAACTGAAGATTCTTGGCC 
GTTTGGTTCCTGGTTGCCGGAAAGTCTCTGTACCGAATCTTTTAGATGAAGCGACCGATT 
ACATCGCAGCGTTAGAGATGCAGGTTCGAGCCATGGAGGCTCTCGCCGAACTTTTAACCG 
CAGCCGCACCACGGACGACGTTGACCGGAACTTAACGGCGGCAGTTAGTTTGTCAGTTGT 
TAATTAGCTTTTCTTTTACCTTTTTACCCCTTTATTTTGGCTTCAAGTGTTTTTTTTTTC 
TCGTCGACGCGATTTTAATTTATTAAATTCA 

>G1655 Amino Acid Sequence (domain in AA coordinates: 134-192) 

MVESLFPSIENTGESSRRKKPRISETAEAEIEARRVNEESLKRWKTNRVQQIYACKLVEA 

LRRVRQRSSTTSNNETDKLVSGAAREIRDTADRVLAASARGTTRWSRAILASRVRAKLKK

HRKAKKSTGNCKSRKGLTETNRIKLPAVERKLKILGRLVPGCRKVSVPNLLDEATDYIAA

LEMQVRAMEALAELLTAAAPRTTLTGT*

>G1671 (188..751)

TCCCACTATCCTTCGCAAGACCCTTCCTCTATATAAGGAAGTTGATTTCATTTGGAGAGG 

ACACGCTGACAAGCTGACTCTAGCAGATCTGGTACCGTCGACCCTCTCTATATAATCTTC 

TTCTACACACACACACACACGCAACCATATACGTACATGTGAAGTAGTGAGATCAATATC 

GTTAGCAATGAATCTACCACCGGGATTTAGGTTTTTTCCGACCGATGAAGAGCTCGTCGT 

TCACTTCCTCCACCGGAAAGCTTCCCTCTTGCCTTGTCACCCTGATGTCATCCCCGACCT 

TGATCTTTACCATTACGATCCTTGGGACCTTCCCGGGAAAGCTTTGGGAGAAGGGAGGCA 

ATGGTACTTCTATAGTAGAAAGACACAAGAGAGAGTGACAAGCAATGGGTATTGGGGATC 

AATGGGAATGGACGAGCCAATCTACACAAGCTCCACACACAAGAAAGTGGGAATCAAAAA 

GTATCTAACTTTCTATCTCGGAGATTCTCAGACTAATTGGATCATGCAAGAATATTCCCT 

CCCGGATTCCTCTTCTTCATCTAGTCGATCTTCTAAGAGATCAAGCCGTGCTTCTAGTTC 

TAGTCACAAACCCGATTATAGCAAGTGGGTGATATGCAGAGTGTATGAGCAAAATTGCAG 

TGAGGAGGAAGACGATGATGGGACAGAACTCTCATGTTTGGATGAAGTGTTTTTGTCTTT 

AGATGATCTTGACGAAGTAAGCTTACCGTAATAAAGACAGAAGCACCCAAGAAGAGAAAA 

AAAAAAAAAGGGTTTAGTGGGCAATTATTTCTAAGCGACCGCTCTAGACAGGCCTAGTAC 

CGGATCCTCTAGCTAGAGCTTTCGTTCGTATCATCGGTTTCGACAACGTTCGTCAAGT 

>G1671 Amino Acid Sequence (domain in AA coordinates: TBD) 

MNLPPGFRFFPTDEELVVHFLHRKASLLPCHPDVIPDLDLYHYDPWDLPGKALGEGRQWY

FYSRKTQERVTSNGYWGSMGMDEPIYTSSTHKKVGIKKYLTFYLGDSQTNWIMQEYSLPD

SSSSSSRSSKRSSRASSSSHKPDYSKWVICRVYEQNCSEEEDDDGTELSCLDEVFLSLDD

LDEVSLP*
>G1756 (71..1003)

ATATGTACTTGTACAeCAACCCACCAAAAGAGATAAAAGAGGAAACAAAAACTCGAAAAG 

AGAGAGATATATGGGTGAGGTGGCTTATATGGACGAAGGAGACCTAGAAGCAATAGTCAG 

AGGCTACTCCGGCTCCGGAGACGCGTTTTCCGGCGAAAGTTCCGGTACGTTTTCACCTTC 

GTTTTGCCTACCGATGGAGACGTCTAGTTTCTACGAACCGGAGATGGAGACAAGTGGCTT 

AGATGAGCTCGGTGAACTTTACAAACCCTTTTACCCTTTCTCCACACAAACGATCCTCAC 

AAGCTCGGTCTCTCTCCCTGAAGATTCAAAACCTTTCCGAGATGACAAGAAACAACGATC 

A(^TGGTIX3TCTTTTATCCAACGGATCAAGAGCTGATCATATCCGAATTTCA 

ATCAAAGAAAAGCAAGAAGAATCAACAGAAGAGAGTTGTTGAGCAAGTGAAAGAAGAGAA 
TCTGTTGTCGGACGCATGGGCGTGGCGTAAATACGGGCAGAAACCCATCAAAGGATCTCC

ATACCCAAGGAGTTATTACAGATGCAGTAGCTCAAAAGGGTGTTTGGCAAGAAAACAAGT 

CGAAAGAAATCCTCAAAACCCGGAGAAATTCACCATAACATACACTAATGAGCACAATCA 

TGAACTACCAACCCGGAGAAACTCATTAGCCGGTTCGACTCGAGCAAAAACTTCCCAACC 

CAAACCAACCTTAACCAAAAAATCCGAAAAAGAAGTTGTTTCTTCCCCTACAAGTAATCC 

TATGATCCCATCCGCTGATGAATCTTCTGTTGCGGTTCAAGAAATGAGCGTTGCGGAAAC 

GAGTACGCACCAAGCGGCTGGAGCAATCGAGGGCCGCCGCTTGAGTAACGGTTTACCATC 

GGATTTGATGTCCGGGAGCGGAACTTTTCCAAGTTTTACCGGTGACTTCGATGAACTATT 

GAATAGCCAAGAGTTCTTCAGTGGGTATTTATGGAATTACTAGAGAGCATTAGGTGTATG 
TATATATATAT 

>G1756 Amino Acid Sequence (domain in AA coordinates: TBD)
MGEVAYMDEGDLEAIVRGYSGSGDAFSGESSGTFSPSFCLPMETSSFYEPEMETSGLDEL 
GELYKPFYPFSTQTILTSSVSLPEDSKPFRDDKKQRSHGCLLSNGSRADHIRISESKSKK
SKKNQQKRVVEQVKEENLLSDAWAWRKYGQKPIKGSPYPRSYYRCSSSKGCLARKQVERN 
PQNPEKFTITYTNEHNHELPTRRNSLAGSTRAKTSQPKPTLTKKSEKEVVSSPTSNPMIP

SADESSVAVQEMSVAETSTHQAAGAIEGRRLSNGLPSDLMSGSGTFPSFTGDFDELLNSQ
EFFSGYLWNY*

>G1757 (250..1224)

ATCACCAATCCTATAACACTCTCATTCTCATCATATCATTCTTCAATCTATATAACCCAT 

TCTTAATTATACTCAACACACATTATATTTTTCTGATCATATCATTCTTTCAGTCCATCT 

ATATAACCAATTCTTGATTTATACTTAAAACACACATTATACATCTTTCTCATCATAGTT 

TGTATCAATTTCCTAGAGTAAACTACCTAAAGGAAAAAAAAAATCTATTTTGGGAATCAT 

ATACTAAAAATGGAAGGAAGAGATATGTTAAGTTGGGAGCAAAAGACATTGCTAAGCGAG 

CTTATCAATGGATTTGATGCGGCCAAAAAGCTTCAGGCACGACTTAGAGAAGCTCCGTCG 

CCGTCGTCATCATTTTCATCACCGGCGACGGCTGTTGCTGAGACTAACGAGATTCTGGTG 

AAGCAGATAGTTTCTTCCTACGAGAGATCTCTTCTTCTGCTAAACTGGTCATCCTCACCG 

AGCGTACAACTTATTCCGACGCCGGTTACTGTAGTCCCGGTGGCAAATCCCGGCAGTGTT 

CCAGAATCTCCGGCATCGATAAACGGAAGTCCGAGAAGTGAAGAGTTTGCCGATGGAGGA 

GGTTCTAGCGAGAGTCATCATCGCCAAGATTACATTTTCAATTCAAAGAAAAGAAAGATG 

TTACCAAAGTGGTCAGAAAAAGTGAGAATAAGCCCAGAGAGAGGCTTAGAAGGACCTCAA 

GATGATGTCTTTAGCTGGAGAAAATATGGTCAAAAAGACATTTTAGGCGCCAAATTCCCA 

AGGAGTTATTACAGATGCACACATCGTAGCACACAAAACTGTTGGGCAACGAAACAAGTC 

CAGAGATCAGACGGGGATGCTACGGTTTTCGAAGTGACGTACAGAGGAACACACACTTGT 

TCGCAGGCGATCACAAGAACACCACCATTAGCCTCGCCGGAGAAGCGACAAGACACCAGA 

GTCAAACCAGCCATTACCCAAAAGCCAAAGGATATTCTCGAGAGTCTTAAATCCAACTTA 

ACCGTTCGAACCGATGGGCTTGATGATGGTAAAGACGTTTTCTCGTTCCCTGATACGCCG 

CCGTTTTACAATTACGGAACTATCAACGGCGAGTTCGGCCACGTGGAGAGTTCTCCGATC 

TTCGACGTTGTTGACTGGTTCAATCCAACGGTCGAGATTGACACAACTTTCCCCGCGTTT 

TTACACGAGTCGATTTATTATTAATTAAAATTTGTAACAGAGAAATAGATAGTAACTAGT 

AAGTAATGATCAGCGAGAGTTAAAACATAAAAGTACTTAGAGTAATCTAACGATGCATAA 

TAAGGAATGTTCAACAGGACTTGAACATGATTTCAATACTAAGAGAGATTTATCTAGCTA 

CTGGTAGTAGCCGCAGACTTCTTGTTGTAGCTTCACTTNCTTTTTGTTGCTT 

>G1757 Amino Acid Sequence (domain in AA coordinates: 158-218)

MEGRDMLSWEQKTLLSELINGFDAAKKLQARLREAPSPSSSFSSPATAVAETNEILVKQI

VSSYERSLLLLNWSSSPSVQLIPTPVTVVPVANPGSVPESPASINGSPRSEEFADGGGSS

ESHHRQDYIFNSKKRKMLPKWSEKVRISPERGLEGPQDDVFSWRKYGQKDILGAKFPRSY 

YRCTHRSTQNCWATKQVQRSDGDATVFEVTYRGTHTCSQAITRTPPLASPEKRQDTRVKP 

AITQKPKDILESLKSNLTVRTDGLDDGKDVFSFPDTPPFYNYGTINGEFGHVESSPIFDV 

VDWFNPTVEIDTTFPAFLHESIYY* 

>G1782 (1..927)

ATGCAAGTGTTTCAAAGGAAAGAAGATTCATCTTGGGGAAACTCAATGCCTACAACAAAT 
^^^ CMGGATCTCMTCmCAGCT TGACTAAGGATATGATAATGTCTACAACA 
CAATTACCCGCGATGAAACATTCGGGTTTGCAGCTGCAAAATCAAGATTCAACCTCATCA 
C^CTACTGAAGAAGAATCAGGCGGCGGTCAAGTTGCAAGCTTTGGAGAATATAAGCGT 
TATGGATGCAGCATTGTTAATAACAATCTCTCAGGTTACATCGAAAACTTGGGAAAGCCT 
ATTGAAAATTATACTAAGTCAATTACTACCTCGTCGATGGTGTCTCT^GACTCTGTGTTT 
CCTGCTCCTACTTCTGGTCAAATATCTTGGTCTCTTCAATGTGCTGAAACGTCACATTTC 
AATGGTTTCTTGGCTCCTGAATATGCATCMCACCAACGGCGCTGCCACATTTAGAGATG
ATGGGTTTGGTTTCTTCAAGAGTGCCATTGCCTCATCACATTCAAGAGAATGAACCAATA 
TTTGTCAATGCGAAACAGTATCATGCGATTCTCCGTCGCAGGAAGCACCGTGCTAAACTC 
GAAGCTCAGAACAAACTCATCAAATGCCGTAAACCGTACCTTCATGAGTCTCGCCATCTT 
CATGCTTTAAAGAGAGCTAGAGGCTCCGGTGGACGTTTCCTCAATACAAAGAAGCTTCAA 
GAATCATCAAACTCACTGTGTTCTTCTCT^AATGGCAAATGGACAAAATTTCTCTATGAGC 
CCTCACGGTGGTGGAAGCGGAATCGGGTCTAGTTCGATCTCACCGAGCTCCAATTCAAAC 
TGTATCAACATGTTCCAAAACCCGCAGTTCAGATTCTCAGGTTATCCGTCAACACACCAT 
GCCTCAGCTCTCATGTCAGGGACTTGA 

>G1782 Amino Acid Sequence (domain in AA coordinates: 166-238) 

MQVFQRKEDSSWGNSMPTTNSNIQGSESFSLTKDMIMSTTQLPAMKHSGLQLQNQDSTSS 

QSTEEESGGGEVASFGEYKRYGCSIVNNNLSGYIENLGKPIENYTKSITTSSMVSQDSVF

PAPTSGQISWSLQCAETSHFNGFLAPEYASTPTALPHLEMMGLVSSRVPLPHHIQENEPI 

FVNAKQYHAILRRRKHRAKLEAQNKLIKCRKPYLHESRHLHALKRARGSGGRFLNTKKLQ

ESSNSLCSSQMANGQNFSMSPHGGGSGIGSSSISPSSNSNCISMFQNPQFRFSGYPSTHH 

ASALMSGT* 

>G184 (327..1937)

TGAATTCTAGCCTTTTTGTAGGCGAATCATCTGGACCGGTAAGAGACTCTCTCATCGATA 

ATAACCACATAATTTAATCAAACTCTTTCTCTCTCTTTCTAAGATCTTTTGCTTTGCTCT 

TTTCCTTTTTGATCTTCCTATATATGGAGAAGCACCAAAACGGTACTTACTATACGATAC 

TGTACGGATCCATCAAACTGGATTAATTATCAAAACGTACATTTTTATCTTACCTGGCAA 

GTTACATTCCTAGGGTTTTGGAGAATCCAATCAACAACAAAGAAAATAATCATCGTTACA 

ATAATCAGTATCACGCACAGACTTAGATGTTCCGGTTTCCAGTGAGTCTAGGCGGTTCAC 

GTGACGAAGACCGTCACGATCAGATCACACCGTTGGATGACCATCGTGTGGTGGTTGATG 

AGGTTGACTTCTTCTCAGAGAAGAGAGATAGGGTTTCACGTGAGAACATCAACGACGACG 

ACGACGAAGGCAATAAGGTTCTCATCAAAATGGAGGGTTCACGAGTTGAAGAAAACGATC 

GTTCCAGAGATGTCAATATCGGTCTGAATCTTCTGACCGCGAATACGGGAAGCGATGAGT 

CAACGGTGGATGATGGACTATCAATGGATATGGAAGATAAACGTGCAAAGATTGAGAACG 

CACAACTACAAGAAGAGCTCAAGAAGATGAAAATAGAGAATCAAAGGCTAAGAGATATGT 

TGAGCCAAGCGACGACCAACTTCAATGCCTTACAAATGCAACTTGTTGCCGTCATGAGGC 

AACAAGAACAACGTAACTCTTCACAAGATCATCTCCTGGAGAGCAAAGCAGAAGGAAGGA 

AACGGCAGGAACTGCAAATCATGGTGCCAAGGCAGTTCATGGACCTTGGGCCGTCGTCTG 

GAGCAGCAGAGCATGGAGCCGAAGTGTCATCTGAAGAGAGGACAACGGTTCGTTCAGGTT 

CTCCTCCTTCGCTTCTAGAAAGTTCCAATCCCCGAGAGAACGGAAAGAGGTTGCTTGGAA 

GAGAAGAAAGCTCAGAGGAATCAGAGTCTAACGCCTGGGGAAACCCTAACAAAGTCCCCA 

AACATAATCCATCCTCTAGCAATAGCAATGGAAACAGAAACGGAAATGTTATTGATCAGT 

CGGCCGCAGAAGCCACCATGCGGAAAGCCCGTGTCTCAGTTCGTGCCCGATCTGAAGCTG 

CCATGATAAGCGATGGATGTCAATGGAGAAAGTACGGACAAAAAATGGCTAAAGGAAACC 

CGTGTCCGCGGGCTTATTATCGTTGCACAATGGCCGGTGGATGTCCAGTTCGCAAGCAAG 

TGCAGCGTTGCGCAGAAGACAGATCTATTCTCATAACCACCTACGAAGGAAACCACAACC

ATCCACTCCCACCAGCCGCTACGGCCATGGCCTCAACAACCACCGCAGCTGCAAGCATGC 

TCCTCTCGGGCTCAATGTCGAGTCAAGACGGTTTAATGAACCCAACAAACCTCCTAGCTC 

GAGCTATCTTGCCTTGCTCCTCAAGC^TGGCTACAATCTCAGCCTCCGCACCATTCCCAA 

CCATCACATTGGACCTCACCAATTCACCCAACGGTAACAACCCTAATATGACCACTAATA 

ACCCGTTGATGCAGTTCGCTCAACGGCCCGGTTTCAACCCGGCAGTTTTGCCTCAAGTGG 

TTGGTCAAGCTATGTACAATAACCAAGAACAGTCCAAGT^ 

CTCAGCCACTGCAGATCGCGGCCACTTCCTCGGTGGCCGAGAGCGTTAGTGCTGCCAGTG 

CAGCAATTGCGTCCGATCCAAACTTTGCGGCGGCTCTAGCGGCAGCGATCACGTCCATTA 

TGAACGGTTCCAGTCATCAAAATAATAACACCAATAATAATAATGTCGCTACGAGCAACA 

ATGACAGTAGGCAATAAGAGTTTTCATTTTGATGGTCGATTTTTTl"lU"rTGGOT 

>G184 Amino Acid Sequence (domain in AA coordinates: 295-352) 

MFRFPVSLGGSRDEDRHDQITPLDDHRVVVDEVDFFSEKRDRVSRENINDDDDEGNKVLI

KMEGSRVEENDRSRDVNIGLNLLTANTGSDESTVDDGLSMDMEDKRAKIENAQLQEELKK

MKIENQRLRDMLSQATTNFNALQMQLVAVMRQQEQRNSSQDHLLESKAEGRKRQELQIMV

PRQFMDLGPSSGAAEHGAEVSSEERTTVRSGSPPSLLESSNPRENGKRLLGREESSEESE

SNAWGNPNKVPKHNPSSSNSNGNRNGNVIDQSAAEATMRKARVSVRARSEAAMISDGCQW

RKYGQKMAKGNPCPRAYYRCTMAGGCPVRKQVQRCAEDRSILITTYEGNHNHPLPPAATA
MASTTTAAASMLLSGSMSSQDGLMNPTNLLARAILPCSSSMATISASAPFPTITLDLTNS

PNGNNPNMTTNNPLMQFAQRPGFNPAVLPQVVGQAMYNNQQQSKFSGLQLPAQPLQIAAT

SSVAESVSAASAAIASDPNFAAALAAAITSIMNGSSHQNNNTNNNNVATSNNDSRQ*
>G1845 (111..989)

AAGACATAATTTTCTCTGTTTTCCTAGCTCTCTCCTCTCAAATTCTTCCATTGCTCTCTG 

TTTTGGCAAATCGTGAACTGCCACGTCTTTAAGGCATCAGTGAAGCAAAGATGGACTTTG 

ACGAGGAGCTAAATCTTTGTATTACGAAAGGTAAAAATGTTGATCATTCTTTTGGAGGAG 

AAGCTTCTTCCACGTCCCCAAGATCTATGAAGAAAATGAAGAGTCCTAGTCGTCCTAAAC 

CCTATTTCCAATCCTCTTCTTCTCCTTATTCGTTAGAGGCTTTCCCTTTTTCTCTCGATC 

CAACACTTCAGAATCAGCAACAACAACTCGGATCATACGTTCCGGTACTTGAGCAACGAC 

AAGACCCGACAATGCAAGGCCAGAAGCAAATGATCTCCTTTAGTCCTCAACAACAACAAC 

AGCAGCAGCAGTATATGGCCCAGTACTGGAGTGACACATTGAATCTGAGTCCAAGAGGAA 

GAATGATGATGATGATGAGCCAAGAAGCTGTTCAACCTTACATCGCAACGAAGCTGTACA 

GAGGAGTGAGACAACGTCAATGGGGAAAATGGGTCGCAGAGATCCGTAAGCCACGAAGCA 

GGGCACGTCTTTGGCTTGGTACCTTTGATACAGCTGAAGAAGCTGCCATGGCCTACGACC 

GCCAAGCCTTCAAATTACGAGGCCACAGCGCAACACTGAATTTCCCGGAGCATTTTGTGA 

ATAAGGAAAGCGAGCTGCATGATTCAAACTCGTCGGATCAGAAAGAACCTGAAACGCCAC 

AGCCAAGCGAGGTTAACTTGGAGAGCAAGGAACTACCGGTGATTGATGTTGGGAGAGAGG 

AAGGTATGGCTGAGGCATGGTACAATGCCATTACATCGGGATGGGGTCCTGAAAGTCCTC 

TTTGGGATGATTTGGATAGTTCTCATCAGTTTTCATCAGAAAGCTCATCTTCTTCTCCTC 

TCTCTTGTCCTATGAGGCCTTTCTTTTGAAAAAGTTTATAAACCCACATTGTGTTGTAGG 

TTATAGTTTAGGGTTATGCTCATTGGCATTTGGATGGAGGCAATTTTTGTGATCTCCCAT 

TCCACCACATATCAGTCATTATATGTGTCTACCTTTTCTCTGTATTTCTATCATTATCAT 

TGTTTTTATTATGTGTCTGTATGTGTTTCCCTATTGCTACATACATAGATGTCCTCTTTG 
TTCAAAAAAAAAAAAAAAAAAAAAAA 

>G1845 Amino Acid Sequence (domain in AA coordinates: 140-207) 

MDFDEELNLCITKGKNVDHSFGGEASSTSPRSMKKMKSPSRPKPYFQSSSSPYSLEAFPF 

SLDPTLQNQQQQLGSYVPVLEQRQDPTMQGQKQMISFSPQQQQQQQQYMAQYWSDTLNLS 

PRGRMMMMMSQEAVQPYIATKLYRGVRQRQWGKWVAEIRKPRSRARLWLGTFDTAEEAAM 

AYDRQAFKLRGHSATLNFPEHFVNKESELHDSNSSDQKEPETPQPSEVNLESKELPVIDV 

GREEGMAEAWYNAITSGWGPESPLWDDLDSSHQFSSESSSSSPLSCPMRPFF* 
>G1879 (3..917)

AAATGCCCTTAGAGGCTGTCGTATACCCGCAAGATCCATTCGGATATCTCTCCAATTGCA 

AAGATTTTATGTTCCACGACTTATACTCTCAAGAAGAGTTCGTAGCTCAAGATACGAAGA 

ACAACATTGATAAGTTAGGGCATGAACAGAGCTTTGTGGAACAAGGTAAGGAGGACGATC 

ATCAATGGCGAGACTATCATCAGTATCCTTTGTTGATCCCTTCGTTGGGAGAAGAGCTTG 

GTCTTACCGCCATTGATGTGGAGAGTCATCCTCCTCCACAGCACCGGAGGAAGAGGAGGA 

GAACGAGAAACTGCAAGAACAAGGAAGAGATCGAGAACCAGAGAATGACTCACATCGCCG 

TCGAGAGAAATCGCCGGAAACAGATGAACGAGTATCTGGCTGTGCTCCGTTCTCTAATGC 

CGTCGTCGTATGCTCAAAGAGGAGATCAAGCGTCGATAGTAGGAGGAGCTATAAACTACG 

TGAAGGAGTTAGAGCATATTTTACAATCTATGGAGCCGAAGAGAACTAGGACTCATGATC 

CCAAAGGAGACAAGACTAGCACTAGCTCGTTAGTGGGTCCATTCAC^GATTTTTTCAGCT 

TCCCACAATATTCTACAAAGTCATC^TCAGATGTACCGGAAAGCTCATCTTCACCGGCGG 

AGATAGAGGTTACGGTGGCAGAAAGCCATGCGAACATCAAGATAATGACGAAGAAGAAAC 

CGAGGCAGCTTCTTAAGCTCATAACTTCTTTACAAAGCCTAAGGCTCACTCTTCTTCATC 

TCAATGTCACCACTCTCCACAACTCCATTCTCTACTCCATCAGCGTCAGGGTTGAAGAAG 

GAAGCCAACTGAATACCGTGGACGACATTGCAACAGCTTTGAATCAAACCATAAGGAGGA 

TTCAAGAAGAGACA'TAATTCAGCAAATAGATTATAATTAACTTGT^ 

TTTTGAAATAACTGAAATCAGTTTTCTAATTTTTTTTTTTTTTCAC 

TCCCTATGTAAGTTGCATTTTTGTCTCTTGTAATGAATCAATGGTCATAAAGATCTGAAC 
AAAAAAATTGAATAAAAGAAAATGGTT 

>G1879 Amino Acid Sequence (domain in AA coordinates: 107-176) 
MPLEAVVYPQDPFGYLSNCKDFMFHDLYSQEEFVAQDTKNNIDKLGHEQSFVEQGKEDDH

QWRDYHQYPLLIPSLGEELGLTAIDVESHPPPQHRRKRRRTRNCKNKEEIENQRMTHIAV
ERNRRKQMNEYLAVLRSLMPSSYAQRGDQASIVGGAINYVKELEHILQSMEPKRTRTHDP
KGDKTSTSSLVGPFTDFFSFPQYSTKSSSDVPESSSSPAEIEVTVAESHANIKIMTKKKP
RQLLKLITSLQSLRLTLLHLNVTTLHNSILYSISVRVEEGSQLNTVDDIATALNQTIRRI
QEET*

>G1888 (1..729)

ATGAAGATTTGGTGTGCTGTTTGTGATAAAGAAGAAGCTTCGGTGTTTTGTTGTGCGGAT 
GAAGCAGCTCTTTGTAATGGTTGCGATCGCCATGTTCATTTCGCCAATAAACTAGCCGGG 
AAACATCTCCGGTTCTCTCTCACTTCTCCTACTTTCAAAGATGCTCCTCTTTGTGATATT 
TGCGGGGAGAGGCGTGCATTATTATTTTGCCAAGAAGACAGAGCAATACTATGCAGAGAA 
TGTGACATTCCAATACATCAAGCTAATGAGCACACTAAGAAACACAATAGATTCCTCCTT 
ACCGGCGTTAAGATCTCTGCCTCCCCGTCAGCCTACCCAAGAGCCTCCAATTCCAACTCT 
GCTGCTGCATTTGGTCGAGCCAAAACCCGACCAAAATCAGTATCGAGCGAGGTCCCGAGC 
TCGGCCTCCAATGAGGTATTTACGAGCTCTTCTTCGACGACCACGAGCAATTGCTATTAT 
GGGATAGAAGAAAACTACCATCACGTGAGCGATTCGGGGTCGGGATCGGGTTGTACAGGT 
AGTATATCCGAGTATTTGATGGAGACATTACCGGGTTGGAGAGTGGAGGATTTGCTTGAA 
CACCCTTCTTGTGTCTCCTATGAGGATAACATTATTACTAATAACAATAACAGTGAGTCT 
TATAGGGTTTATGATGGTTCTTCACAATTCCATCATCAAGGGTTTTGGGATCACAAACCC 
TTCTCTTGA 

>G1888 Amino Acid Sequence (domain in AA coordinates: 5-50)

MKIWCAVCDKEEASVFCCADEAALCNGCDRHVHFANKLAGKHLRFSLTSPTFKDAPLCDI 

CGERRALLFCQEDRAILCRECDIPIHQANEHTKKHNRFLLTGVKISASPSAYPRASNSNS 

AAAFGRAKTRPKSVSSEVPSSASNEVFTSSSSTTTSNCYYGIEENYHHVSDSGSGSGCTG 

SISEYLMETLPGWRVEDLLEHPSCVSYEDNIITNNNNSESYRVYDGSSQFHHQGFWDHKP

FS* 

>G189 (34..987)

CCACAACTCTCTCCTTGTAGAGAGAGAGATTTTATGGCGGTGGAGCTCATGACTCGGAAT 
TACATCTCCGGCGTCGGAGCTGATAGCTTCGCCGTTCAAGAAGCAGCTGCTTCAGGACTC 
AAAAGTATCGAAAATTTCATCGGTTTAATGTCTCGTGATAGCTTTAACTCTGATCAGCCA 
TCTTCTTCTTCCGCCTCCGCCTCCGCCTCCGCCGCCGCAGATCTTGAATCAGCTCGTAAC 
ACAACGGCGGACGCGGCTGTTTCAAAGTTTAAAAGAGTCATATCTCTCTTAGATCGAACT 
CGAACCGGACACGCCCGGTTTAGACGTGCTCCGGTTCATGTTATTTCTCCGGTTCTTTTA 
CAAGAAGAACCAAAAACGACGCCGTTTCAGTCTCCTCTTCCTCCTCCGCCGCAAATGATC 
CGAAAAGGTTCGTTTTCTTCATCGATGAAAACGATTGATTTCTCATCTCTCTCCTCTGTA
ACAACGGAATCAGACAACCAGAAGAAGATTCATCATCATCAACGTCCCTCTGAAACGGCG 
CCGTTTGCGTCTCAAACTCAAAGCCTCTCCACGACGGTCTCGTCTTTCTCAAAATCAACA 
AAGAGAAAATGTAACTCTGAGAATCTTCTCACCGGAAAATGCGCTTCCGCTTCTTCCTCC 
GGTCGTTGTCATTGCTCGAAGAAAAGAAAGATAAAACAGAGGAGAATAATTAGGGTTCCG 
GCGATAAGTGCAAAAATGTCCGATGTACCACCGGACGATTATTCATGGAGGAAATACGGA 
CAAAAACCAATTAAAGGATCTCCACATCCAAGAGGATATTATAAGTGTAGTAGCGTAAGA 
GGTTGTCCAGCACGTAAACATGTTGAGAGAGCAGCTGATGATTCGTCCATGTTGATTGTT
ACTTATGAAGGAGATCATAATCATTCTCTCTCCGCCGCTGATCTCGCCGGAGCCGCCGTT 
GCTGATCTTATTTTGGAATCGTCTTGAAAAGAACAAATCTTTATTTAAGGCTTTTATAAT 
ATAAATTTAGATCCTTACTTAGTGAAGTACTCAAACTATGAATGAAATCAATGTAATCAA 
AATCAAAAAGCTTTTGCTAAAAAAAAAAAAAAAAA 

>G189 Amino Acid Sequence (domain in AA coordinates: 240-297) 
MAVELMTRNYISGVGADSFAVQEAAASGLKSIENFIGLMSRDSFNSDQPSSSSASASASA 
AADLESARNTTADAAVSKFKRVISLLDRTRTGHARFRRAPVHVISPVLLQEEPKTTPFQS
PLPPPPQMIRKGSFSSSMKTIDFSSLSSVTTESDNQKKIHHHQRPSETAPFASQTQSLST
TVSSFSKSTKRKCNSENLLTGKCASASSSGRCHCSKKRKIKQRRIIRVPAISAKMSDVPP
DDYSWRKYGQKPIKGSPHPRGYYKCSSVRGCPARKHVERAADDSSMLIVTYEGDHNHSLS

AADLAGAAVADLILESS*
>G1939 (92..844)

AATCATTAGCTTCTTeTCTTCTCTTCTCTCACAGAGAGAGTAATCACAAGCCAAGTGAGA 
AAAAGAAAACACTAAACCCAGATCGAAAACCATGTCTATTAACAACAACAACAACAACA^ 
CAACAATAACAACGATGGTCTTATGATCTCATCAAACGGAGCTTTAATCGAACAACAACC 
ATCAGTCGTTGTGAAGAAACCACCGGCGAAAGATCGACATAGCAAAGTCGATGGAAGAGG 
GAGAAGAATCCGTATGCCGATTATATGTGCTGCTCGTGTTTTTCAGCTAACGAGAGAGCT 
TGGTCATAAGTC^GATGGCCAAACAATTGAATGGTTACTTCGTCAAGCAGAGCCTTCTAT 
TATAGCTGCAACAGGAACTGGTACAACTCCAGCGAGTTTCTCAACTGCTTCTGTCTCTAT 
CCGTGGAGCCACCAATTCTACTTCTTTAGATCATAAACCC^CTTCTTTACTTGGTGGTAC 
GTCACCGTTTATACTTGGGAAACGTGTTAGAGCTGATGAGGATAGTAATAATAGTCATAA
TCATAGTTCTGTTGGTAAAGATGAGACCTTTACGACAACACCAGCTGGGTTTTGGGCTGT 
TCCGGCGAGGCCGGATTTTGGACAAGTTTGGAGTTTTGCTGGAGCTCCACAAGAGATGTT 
TTTACAACAACAACATCATCATCAGCAACCATTGTTTGTTCATCAGCAACAGCAACAACA 
AGCTGCAATGGGTGAAGCTTCTGCTGCTAGAGTTGGGAATTATCTTCCGGGTCATCTTAA 
TTTGCTTGCTTCTTTATCCGGTGGATCTCCCGGGTCGGATCGAAGAGAGGAAGATCCACG 
TTAATGGTTTAAGCCCTTTTAGGTTTGAGGGCAAAATTTGGTATATATATTTATTATCTT 
CTCTTCTCTATTGTTGTCATTGTTTCTCTATGTGTGTGTTTTAGTGTTGTTAGAGATTGA 
TTTGGTTTCAGAATCTCTGCAAGTGATTTGAGAGTTTTCGTTAGCTTTAAGTAAGTTAAA

GACGGTTGTTTTTGATTAGGGTTAAATTAGGGTTTAAGAATCTGTTGTTTTTTTGGAGGG 
AGATCGATTTCTTATCGGATCCAAGATTACTTTTAGGAAAAAAGGGAAAATTTCAGAAAC 
CACGGTGGTTTCTTTTCCTCTTTTTTTTTTTG 

>G1939 Amino Acid Sequence (domain in AA coordinates: 40-102) 
MSINNWNNWNN^ 

ARVFQLTRELGHKSDGQTIEWLLRQAEPSIIAATGTGTTPASFSTASVSIRGATNSTSLD 

HKPTSLLGGTSPFILGKRVRADEDSNNSHNHSSVGKDETFTTTPAGFWAVPARPDFGQVW 

SFAGAPQEMFLQQQHHHQQPLFVHQQQQQQAAMGEASAARVGNYLPGHLNLLASLSGGSP 
GSDRREEDPR* 

>G194 (192..1205)

TCTTTCTTCTCTCTCTATCTCTCCTCTTTGAACCCTAAAAACTCTTTCTTTACAAGGATT 

GATCTTTTTGTATTTTTGATTTTGACATTTGCTTTGTGTTCGATCTCTGTTTTGATGCGA 

TTTCTCTGTTTTTAAAGCCATTTGATAGATTGTTTCCGGTAAAGCTCAGCGAGAGAAGAA 

GAAGAACAACAATGGAGTTTACAGATTTCTCAAAGACGAGTTTTTACTACCCGTCGTCAC

AAAGCGTTTGGGATTTCGGAGATTTAGCGGCGGCGGAGAGGCATTCTTTAGGGTTCATGG 

AGTTATTAAGTTCTCAGCAGCATCAAGACTTTGCTACTGTTTCTCCTCATTCCTTCCTTC 

TCCAAACGTCTCAACCGCAAACGCAAACGCAACCATCGGCGAAGCTGTCTTCAAGTATCA 

TTCAAGCTCCACCGTCAGAGCAATTAGTGACGTCAAAGGTGGAGTCTTTGTGTTCGGATC 

ATTTGTTGATAAACCCACCGGCGACTCCTAACTCGTCATCGATTTCGTCTGCTTCAAGCG 

AGGCTCTAAATGAAGAGAAACCGAAAACAGAAGACAATGAAGAAGAAGGAGGTGAAGATC 

AACAAGAGAAGAGTCATACTAAGAAACAGTTGAAAGCAAAGAAGAATAATCAGAAGAGAC 

AGAGAGAGGCAAGAGTCGCATTCATGACAAAGAGTGAAGTTGATCATCTCGAAGATGGTT 

ATCGCTGGCGAAAATATGGTCAAAAAGCTGTCAAAAACAGTCCTTTTCCCAGGAGTTACT

ACCGTTGCACAACGGCTTCATGTAACGTGAAGAAGAGAGTGGAGAGATCATTCAGAGATC 

CAAGCACTGTGGTTACAACCTACGAAGGTCAACACACTCACATTAGTCCACTCACGTCTC 

GTCCTATTTCCACTGGAGGTTTCTTCGGATCGTCAGGAGCTGCTTCGAGTCTCGGTAATG 

GTTGCTTTGGGTTTCCTATTGATGGCTCCACGTTAATCTCTCCTCAGTTCCAACAGCTTG 

TCCAATACCATCACCAACAGCAGCAACAAGAACTCATGTCTTGTTTTGGAGGAGTCAACG 

AGTACCTTAATAGCCACGCTAATGAGTATGGTGATGATAATCGTGTGAAGAAGAGTCGAG 

TTTTGGTTAAAGATAATGGACTTCTGCAAGATGTTGTTCCGTCTCATATGTTGAAGGAAG 

AGTAGTAGTATATATATAGTCTTATAGTTTTAATCTAGTTTTTTTTTGTATAATTGTCTA 

AAAGAAACGGATCTTTTGTTCTGATGiu^GAAGATGTTTTCTTATGGTTCTGAAATCGTAA 

GGTAATGATGATTGTACCAAGCCGAGAAAGTACTTGTGATTTTCACCATTGAATCACTAT 
AAATGTAATTTTTATTTACTGTGAAAAAAAAAAAAAAA 

>G194 Amino Acid Sequence (domain in AA coordinates: 174-230)
MEFTDFSKTSFYYPSSQSVWDFGDLAAAERHSLGFMELLSSQQHQDFATVSPHSFLLQTS
QPQTQTQPSAKLSSSIIQAPPSEQLVTSKVESLCSDHLLINPPATPNSSSISSASSEALN 
EEKPKTEDNEEEGGEDQQEKSHTKKQLKAKKNNQKRQREARVAFMTKSEVDHLEDGYRWR 
KYGQKAVKNSPFPRSYYRCTTASCNVKKRVERSFRDPSTVVTTYEGQHTHISPLTSRPIS
TGGFFGSSGAASSLGNGCFGFPIDGSTLISPQFQQLVQYHHQQQQQELMSCFGGVNEYLN
SHANEYGDDNRVKKSRVLVKDNGLLQDVVPSHMLKEE*
>G1943 (137..1858)

ACATTTGTTTCTAATCTCAGACATAAATAATTTTTGTTCCC 
ATTATATCATTCCACATTCATTTTCTT^ 

AGAAAATCCATCTATCATGGGTGAAGATGATATAGTGGAGCTCTTATGGAAGAGTGGCCA
AGTCGTTAGAACCAGTCAAACACAGAGACCCTCCTCCAATACACCACCATCTCTTCCTCC

ACCACCCATTCTTCGTGGTAGCGGAAGCGGCAACGGAGAAGAAAATGCCCCGCTTCCACT 
TCCACAGCCTTCACCTCCCCTCCATCATCAGAATCTTTTCATTCTGGAAGACGAAATGTC 
TTCTTGGCTTCACCATTCTCACCCCGGCGTTACGTCCACCCCGGCTTCTTCTGTCTCCCT

GCCACCACCACCCAATGCTCCGCGTGAAGATGATATAGTGGAGCTTTTATGGCAAAGCGG 

CCAAGTAGTTGGAACCAACCAAACACATAGACAATCCTACGATCCTCCTCCCATTCTCCG 

CGGCAGCGGAAGTGGCAGAGGAGAAGAAAATGCTCCCCTTTCACAACCTCCGCCTCACCT 

GCATCAGCAAAATCTCTTCATTCAAGAAGGCGAAATGTATTCGTGGCTACACCATTCTTA

CCGCCAAAACTATTTCTGCTCAGAACTTCTCAACTCCACTCCGGCTACTCACCCGCAAAG 

TTCCATCTCTCTGGCACCACGTCAGACTATCGCCACGAGAAGGGCGGAAAACTTTATGAA 

CTTCTCGTGGCTAAGAGGGAACATATTTACCGGCGGTAGAGTTGATGAAGCTGGACCGTC 

GTTTTCGGTGGTAAGAGAATCGATGCAGGTAGGCTCGAACACGACCCCCCCTTCTTCTTC 

TGCCACTGAATCATGTGTAATACCAGCTACAGAGGGCACCGCGAGTCGAGTGTCGGGAAC 

TTTGGCAGCTCATGATCTTGGTCGGAAGGGAAAGGCGGTGGCGGTTGAGGCGGCCGGAAC 

ACCATCTTCAGGAGTGTGCAAGGCCGAAACAGAGCCGGTTCAGATACAACCAGCAACGGA 

GTCGAAGCTAAAAGCGAGAGAAGAAACCCATGGAACTGAAGAAGCTCGTGGTTCAACGTC 

TAGAAAGAGATCACGAACTGCAGAAATGCATAACCTCGCCGAAAGGAGAAGGAGAGAAAA 

GATCAACGAGAAGATGAAGACTCTGCAACAACTCATTCCTCGCTGCAACAAGGTTGAATC 

TGATTCTGTTTCTACTCTGATCAGTCTACTAAAGTTTCAACGCTGGATGATGCTATCGAG 

TACGTCAAATCGTTACAGAGCCAAATACAAGTATGCTCTTCAAAACAGAATGTGTTTTAA 

ACCAATGGTTCAACATGGAAAGAGTTCATATGTATCTAGTTTTGTTGAGATGATGTCGAC 

GGGACAGGGTATGATGTCGCCAATGATGAATGCCGGGAATACGCAACAGTTCATGCCCCA 

TATGGCCATGGATATGAACCGACCTCCTCCATTCATACCTTTCCCCGGCACATCTTTTCC 

TATGCCGGCTCAAATGGCAGGTGTAGGTCCATCATATCCAGCACCGCGCTACCCTTTTCC 

CAACATTCAGACCTTTGACCCATCCAGAGTCCGTTTACCAAGCCCGCAGCCTT^ACCCGGT 

GTCGAACCAGCCTCAGTTTCCGGCTTACATGAATCCCTATAGCCAGTTTGCTGGTCCCCA 

CCAGTTGCMCAACCTCCTCCTCCTCCATTTCAGGGTCAAACAACATCACAACTGAGTTC 

CGGGCAGGCAAGTAGTAGCAAGGAACCTGAGGATCAGGAGAACCAACCAACAGCTTAGTT 

AAAGTGTGGAGCTGAAACGGATCAGTTCTTCAAGCAAATTACT^ACTTTGAAGATTVAACCA 

GAGTTGTAACATGTAGATTTTGTCTGTTAAGTTTAATGTAAGTACTTTTTAGTTAATGGG 

AAAGATACTGACAGGTTGCAAGGTGGTCAGTATTTGTGCATCACGCTTAAGATTCCTCGA 

TGTGGCCAGTATCTCCCTTTTCTAGCATGTGAGGTCCCTACTCTCTGGTTCTACGGAGAC 

CAAATGTTCGACTGATTAAACACACAATGACTTACCAAAAGTACACGCGGCCCATCCTCG 

TCTTTATGTTCCAAGTGCGACTGTTTGTTTATTTGTAAGCATTTTTCTTATAATAATAAA 

ACAGCTCTATCTTCGTTAAAAAAAA 

>G1943 Amino Acid Sequence (domain in AA coordinates: 335-406) 

MGEDDIVELLWKSGQVVRTSQTQRPSSNTPPSLPPPPILRGSGSGNGEENAPLPLPQPSP

PLHHQNLFILEDEMSSWLHHSHPGVTSTPASSVSLPPPPNAPREDDIVELLWQSGQVVGT

NQTHRQSYDPPPILRGSGSGRGEENAPLSQPPPHLHQQNLFIQEGEMYSWLHHSYRQNYF 

CSELLNSTPATHPQSSISLAPRQTIATRRAENFMNFSWLRGNIFTGGRVDEAGPSFSVVR

ESMQVGSNTTPPSSSATESCVIPATEGTASRVSGTLAAHDLGRKGKAVAVEAAGTPSSGV

CKAETEPVQIQPATESKLKAREETHGTEEARGSTSRKRSRTAEMHNLAERRRREKINEKM 

KTLQQLIPRCNKVESDSVSTLISLLKFQRWMMLSSTS1TOYRAKYKYALQNRMCFKPMVQH 

GKSSYVSSFVEMMSTGQGMMSPMMNAGNTQQFMPHMAMDMNRPPPFIPFPGTSFPMPAQM

AGVGPSYPAPRYPFPNIQTFDPSRVRLPSPQPNPVSNQPQFPAYMNPYSQFAGPHQLQQP 

PPPPFQGQTTSQLSSGQASSSKEPEDQENQPTA* 

>G21 (79..966)

TGTGGAGGAATATTAATACAGCCCACTTCACATCTATTTTTGTGCAACCATCTCTCTAAA 
GCTTCTTCTCTCATAACAATGGCAAGACAAATCAACATAGAGAGTAGTGTTTCTCAAGTT 
ACCTTTATCTCCTCCGCCATCCCCGCCGTATCTTCCTCCTCCTCCATCACCGCTTCCGCC 
TCATTGTCCTCTTCACCTACTACATCTTCCTCTTCTTCGTCATCAACAAATTCTAACTTC 
ATTGAGGAAGACAACTCTAAAAGAAAAGCATCTCGAAGATCATTGTCATCGTTAGTCTCC
GTTGAAGACGATGATGATCAAAACGGTGGAGGTGGGAAACGGCGAAAGACCAACGGTGGA 
GATAAACATCCGACGTATAGAGGAGTGAGGATGAGGAGTTGGGGAAAATGGGTGTCGGAG 
ATTAGAGAGCCGAGAAAGAAATCAAGAATCTGGCTCGGGACTTATCCAACGGCTGAGATG 
GCAGCTCGAGCTCATGACGTAGCGGCTTTAGCCATTAAAGGTACAACGGCTTACCTCAAT 
TTTCCCAAGTTAGCCGGCGAGCTTCCTCGTCC^GTCACAAATTCrCCTAAAGACATTC^ 
GCCGCCGCCTCTTTAGCGGCCGTTAACTGGCAAGATTCGGTCAACGATGTGAGTAATTCT 
GAAGTGGCTGAAATAGTTGAAGCCGAGCCGAGTCGAGCCGTGGTGGCTCAGTTGTTTTCT 
TCGGACACAAGCACGACGACGACGACTCAGAGTCAAGAGTATTCGGAAGCTTCGTGTGCT 
TCGACTTCGGCGTGTACGGACAAAGACAGTGAGGAAGAGAAGCTGTTTGATTTGCCGGAT

TTGTTTACCGATGAGAATGAGATGATGATACGAAACGATGCGTTTTGCTACTACTCGTCC 

ACGTGGCAGCTTTGTGGAGCCGATGCTGGGTTTCGGCTTGAAGAGCCGTTTTTTCTATCT 

GAATGACTAAAGTACCCCTCTCGAGAGAGCTCTCACTAACACT 

>G21 Amino Acid Sequence (domain in AA coordinates: 97-164) 

MARQINIESSVSQVTFISSAIPAVSSSSSITASASLSSSPTTSSSSSSSTNSNFIEEDNS 

KRKASRRSLSSLVSVEDDDDQNGGGGKRRKTNGGDKHPTYRGVRMRSWGKWVSEIREPRK

KSRIWLGTYPTAEMAARAHDVAALAIKGTTAYLNFPKLAGELPRPVTNSPKDIQAAASLA 

AVNWQDSVNDVSNSEVAEIVEAEPSRAVVAQLFSSDTSTTTTTQSQEYSEASCASTSACT 

DKDSEEEKLFDLPDLFTDENEMMIRNDAFCYYSSTWQLCGADAGFRLEEPFFLSE* 

>G2132 (42..1031)

ATTCTGTTACTTAGTACCGGAGTTTAGTCGGAGAGAGAACAATGATCAGTTTCAGAGAAG 
AGAACATCGATCTCAACTTGATTAAAACAATTAGTGTAATCTGTAATGATCCAGACGCCA 
CCGATTCCTCTAGCGACGATGAATCTATCTCCGGCAATAATCCTCGCCGTCAGATCAAAC 
CAAAACCACCGAAACGTTACGTCTCAAAGATCTGTGTCCCGACGCTGATCAAAAGGTATG 
AGAACGTTTCGAATTCTACAGGGAATAAAGCAGCCGGAAACCGGAAAACGTCGTCGGGTT 
TCAAAGGCGTACGACGGAGGCCGTGGGGGAAATTTGCGGCGGAGATAAGAAATCCGTTTG 
AGAAGAAGAGAAAGTGGCTTGGAACGTTTCCTACTGAAGAAGAAGCAGCAGAAGCTTACC 
AAAAGAGT7UU^GAGAGTTTGATGAACGATTGGGTTTAGTTAAACAGGAAAAAGACCTAG 
TAGATTTGACCAAGCCGTGCGGTGTACGTAAACCAGAAGAGAAGGAAGTTACTGAGAAGT 
CGAATTGCAAAAAGGTAAATAAGAGAATTGTTACTGATCAGAAGCCATTTGGTTGTGGTT 
ATAACGCTGATCATGAAGAAGAGGGAGTGATTAGTAAAATGTTGGAAGATCCGTTGATGA 
CATCGTCAATTGCTGATATTTTTGGTGATTCGGCTGTTGAAGCAAATGATATTTGGGTGG

ATTACAATTCAGTGGAATTTATTTCCATTGTAGATGATTTCAAGTTTGATTTTGTGGAGA 

ATGATAGAGTAGGAAAGGAGAAAACATTTGGATTTAAGATTGGGGATCACACTAAAGTTA 

ATCAACATGCCAAAATCGTATCGACCAATGGGGACTTATTCGTCGATGATTTACTTGATT 

TTGATCCGTTGATAGATGATTTTAAGTTAGAAGATTTTCCTATGGATGATCTTGGATTAT 

TAGGAGATCCAGAGGATGATGATTTTAGTTGGTTTAATGGTACTACTGATTGGATCGATA 

AGTTTTTATGAATACTTTCTTGACACGGCCAACGGTATTAGTAC 

>G2132 Amino Acid Sequence (domain in AA coordinates: TBD) 

MISFREENIDLNLIKTISVICNDPDATDSSSDDESISGNNPRRQIKPKPPKRYVSKICVP 

TLIKRYENVSNSTGNKAAGNRKTSSGFKGVRRRPWGKFAAEIRNPFEKKRKWLGTFPTEE 

EAAEAYQKSKREFDERLGLVKQEKDLVDLTKPCGVR^ 

KPFGCGYNADHEEEGVISKMLEDPLMTSSIADIFGDSAVEANDIWVDYNSVEFISIVDDF 
KFDFVENDRVGKEKTFGFKIGDHTKVNQHAKIVSTNGDLFVDDLLDFDPLIDDFKLEDFP
MDDLGLLGDPEDDDFSWFNGTTDWIDKFL* 
>G2145 (1..777) 

ATGGACGTTTTTGTTGATGGTGAATTGGAGTCTCTCTTGGGGATGTTCAACTTTGATCAA 
TGTTCATCATCTAAAGAGGAGAGACCGCGAGACGAGTTGCTTGGCCTCTCTAGCCTTTAC 
AATGGTCATCTTCATCAACATCAACACCATAACAATGTCTTATCTTCTGATCATCATGCT 
TTCTTGCTCCCTGATATGTTCCCATTTGGTGC^^ 

CTTGATTCTTGGGATCAAAGTGATCACCTCCAAGAAACGTCTTCTCTTAAGAGGAAACTA 

CTTGACGTGGAGAATCTATGCAAAACTAACTCT71ACTGTGACGTCACAAGACAAGAGCTT 

GCGAAATCCAAGAAAAAACAGAGGGTAAGCTCGGAAAGCAATACAGTTGACGAGAGCAAC 

ACTAATTGGGTAGATGGTCAGAGTTTAAGCAACAGTTCAGATGATGAGAAAGCTTCGGTC 

ACAAGTGTTAAAGGCAAAACTAGAGCCACCAAAGGGACAGCCACTGATCCTCAAAGCCTT 

TATGCTCGGAAACGAAGAGAGAAGATTAACGAAAGGCTCAAGACACTACAAAACCTTGTG

CCAAACGGGACAAAAGTCGATATAAGCACGATGCTTGAAGAAGCGGTCCATTACGTGAAG 

TTCTTGCAGCTTCAGATTAAGTTGTTGAGCTCGGATGATCTATGGATGTACGCACCATTG 

GCTTACAACGGCCTGGACATGGGGTTCCAT(^C7^CCTTTTGTCTCGGCTTATGTGA 

>G2145 Amino Acid Sequence (domain in AA coordinates: 166-243)

MDVFVDGELESLLGMFNFDQCSSSKEERPRDELLGLSSLYNGHLHQHQHHNNVLSSDHHA 

FLLPDMFPFGAMPGGNLPAMLDSWDQSHHLQETSSL^ 

AKSKKKQRVSSESNTVDESNTNWVDGQSLSNSSDDEKASVTSVKGKTRATKGTATDPQSL 
YARKRREKINERLKTLQNLVPNGTKVDISTMLEEAVHYVKFLQLQIKLLSSDDLWMYAPL
AYNGLDMGFHHNLLSRLM*
>G23 (22..732)
TATCAAACGAGAGTACAAAAGATGACGTCACTCAACAGCTCTGCATCACCAACATCATCG
TCATCAGACCAATCTGATGCAACTACTACAACAAGCACCCACTTGTCTGAAGAAGAAGCT 
CCACCCAGAAACAACAACACAAGAAAGAGAAGGAGAGATTCTTCTTCTGCTTCTTCATCT 
TCTTCAATGCAACATCCTGTTTACAGAGGTGTGCGGATGAGAAGTTGGGGCAAATGGGTC 
TCCGAGATCCGACAACCTCGTAAGAAAACTCGTATTTGGCTCGGCACTTTTGTCACCGCT 
GATATGGCTGCTCGTGCTCACGACGTCGCTGCTCTCACCATCAAAGGCTCCTCCGCCGTC 
TTAAATTTCCCTGAGCTTGCTTCTCTCTTCCCTCGTCCGGCGTCATCATCGCCGCATGAT 
ATCCAGACAGCCGCCGCAGAAGCCGCCGCCATGGTGGTCGAAGAAAAACTGTTAGAGAAG 
GATGAGGCTCCGGAGGCCCCACCTTCGTCGGAATCTTCTTACGTGGCGGCGGAGTCAGAG 
GATGAGGAGAGGTTGGAGAAAATTGTGGAGCTGCCTAACATTGAAGAAGGAAGTTATGAC 
GAGAGTGTGACATCACGTGCTGATCTGGCTTATTCTGAGCCGTTCGATTGTTGGGTGTAT 
CCTCCGGTTATGGATTTTTATGAAGAAATATCGGAGTTTAATTTCGTGGAATTGTGGAGC 
TTTAATCACTAATTAAGTTAGGA7UVGTGCATTATATTGCAATATTGCATCATAGATAACA 
TTTGTATTTCTTTTCTTTTTGTACGGATACGTAGCATATGCTACTATACTAGGGCTAGTG 
TACCAAATATTGTAAAATATACTTATTAATATTTATGTAAATGTGTAATATATATAACAT 
ACAATTATTGTAAGTTTGGAAATTGGAAACTATCGTTACGCAATGTTCTTGTAAAAAAAA 
AAAAAAAAAA 

>G23 Amino Acid Sequence (domain in AA coordinates: 61-117) 

MTSLNSSASPTSSSSDQSDATTTTSTHLSEEEAPPRNNNTRKRRRDSSSASSSSSMQHPV 

YRGVRMRSWGKWVSEIRQPRKKTRIWLGTFVTADMAARAHDVAALTIKGSSAVLNFPELA

SLFPRPASSSPHDIQTAAAEAAAMVVEEKLLEKDEAPEAPPSSESSYVAAESEDEERLEK

IVELPNIEEGSYDESVTSRADLAYSEPFDCWVYPPVMDFYEEISEFNFVELWSFNH* 

>G2313 (104..724)

CGTCGACACAATCGCTCTTCCGTAACATATTCCACAAAACGATCTTCTTGTTTCTTGAAT 
TTTTAGCCATCTCTTTTTTTTTTTTCTCATTTTCTCGGATACTATGGCTTCGAGTCCACG 
CTGGACGGAGGACGACAACAGGCGTTTTAAGTCAGCTCTGTCGCAATTCCCTCCGGATAA 
CAAGCGTTTGGTGAATGTCGCCCAGCATCTGCCGAAACCTTTGGAGGAGGTGAAGTACTA 
CTACGAAAAGTTGGTCAACGATGTTTATCTGCCGAAACCTTTAGAGAATGTCACCCAGCA 
TCTGCAGAAACCTATGGAAATGGAGGAGATGAAGTACATGTACGAAAAGATGGCCAACGA 
TGTTAATCAGATGCCCGAGTACGTACCACTGGCGGAATCGAGTCAGTCCA7VACGCAGGAA 
GAAGGATACGCCAAATCCTTGGACAGAAGAGGMCACAGATTGTTTCTGCAAGGATTGAA 
AAAGTATGGGGAAGGAGCTTCGACGTTGACATCAACGAATTTTGTGAAGACAAAGACTCC 
ACGGCAAGTGTCAAGCCATGCACAGTATTACAAAAGGCAAAAATCGGACAATAAGAAGGA 
GAAACGCCGGAGTATTTTTGACATAACTTTGGAGTCTACCGAGGGCAATCCAGATTCTGG 
AAATCAGAACCCTCCGGATGATGATGATCCGTCCCAAGGTCAAGGCACTTGTCTTGGAGT 
TTAGATGTTGGAAGATAGAAGAATGGTGTGAAAGC 

>G2313 Amino Acid Sequence (domain in AA coordinates: TBD) 

MASSPROTEDDNRRFKSALSQFPPDNKRLVOTAQHLPKPL 

ENVTQHLQKPMEMEEMKYMYEKMAIJDWQMPEYVPLAESSQSKR 

FLQGLKKYGEGASTLTSTNFVKTKTPRQVSSHAQYYKRQKSDNKKEKRRSIFDITLESTE 

GNPDSGNQNPPDDDDPSQGQGTCLGV* 

>G2344 (1..573) 

ATGACTTCTTCAATCCATGAGCTTTCTGATAACATTGGAAGTCATGAGAAGCAAGAACAG 
AGAGATTCTCATTTCCAACCACCAATCCCTTCTGCAAGAAATTATGAATCAATTGTTA^ 
AGTTTAGTCTACTCAGACCCGGGGACTACAAATTCCATGGCACCTGGACAATATCCATAT 
CCAGATCCTTACTACAGAAGCATATTTGCACCGCCTCCACAACCGTATACCGGGGTACAT 
CTACAGTTGATGGGAGTGCAGCAACAAGGCGTTCCTTTACCATCTGATGCAGTCGAGGAA 
CCTGTTTTTGTTAACGCAAAGCAATACCACGGTATACTAAGGCGCAGACAATCAAGAGCA
AGACTTGAGTCTC^GAATAAAGTCATC^GTCACGTAAGCCGTATTTGCATGAATCTCGG 
CATTTG^TGCGATAAGACGACCAAGAGGATGTGGCGGGCGGTTTCTAAATGCCAAGAAG 
GAGGATGAGCATCACGAAGACAGTAGTCATGAAGAAAAATCCAACCTTAGCGCTGGTAAA 
TCCGCCATGGCTGCTTCTAGTGGTACATCTTGA 

>G2344 Amino Acid Sequence (domain in AA coordinates: TBD) 

MTSSIHELSDNIGSHEKQEQRDSHFQPPIPSARNYESIVTSLVYSDPGTTNSMAPGQYPY 

PDPYYRSIFAPPPQPYTGVHLQLMGVQQQGVPLPSDAVEEPVFVNAKQYHGILRRRQSRA

RLESQNKVIKSRKPYLHESRHLHAIRRPRGCGGRFLNAKKEDEHHEDSSHEEKSNLSAGK 

SAMAASSGTS* 
>G2430 (69..1907)

AACTTCAACATACACATAATCTCTCACTTAAAAATATCTCTCTCTCTCTCTCTACAAAAT 
CAATTCCAATGTTGGTGGGAAAGATAAGTGGATATGAAGATAATACTCGCTCTTTGGAGC 
GAGAAACATCTGAAATCACTTCTCTTCTCAGCCAATTTCCGGGGAATACTAATGTCCTTG 
TTGTTGACACCAATTTCACCACTCTACTCAACATGAAACAAATCATGAAACAATACGCTT 
ATCAAGTGTCTATTGAGACAGATGCAGAAAAAGCTCTTGCGTTTTTGACAAGCTGCAAAC

ATGAAATCAATATTGTGATTTGGGATTTTCATATGCCTGGAATTGATGGACTTCAAGCTC 

TCAAGAGCATTACTTCAAAGTTGGATTTACCTGTAGTGATTATGTCTGATGATAATCAAA 

CGGAATCTGTGATGAAAGCAACATTTTACGGTGCTTGTGACTATGTTGTGAAACCGGTTA

AAGAAGAGGTAATGGCCAATATATGGCAACACATTGTACGGAAGAGGCTGATCTTTAAAC 

CGGATGTTGCTCCACCGGTTCAATCAGATCCGGCTCGCTCTGACCGTTTAGACCAAGTCA 

AAGCTGATTTCAAGATCGTAGAAGATGAACCAATAATCAATGAGACACCGCTGATCACAT 

GGACCGAAGAAATTCAACCGGTTCAGTCAGATCTGGTTCAAGCCAACAAGTTCGACCAAG 

TGAATGGCTATTCCCCAATCATGAACCAAGATAACATGTTCAACAAAGCACCACCTAAAC 

CGCGAATGACGTGGACAGAAGTTATTCAACCGGTTCAATCAAATCTGGTTCAAACAAAAG 

AGTTCGGCCAACTCAATGACTATTCCCAAATCATGAACGAAGATAGCATGTACAACAAAG

CAGCAACCAAACCACAATTGACGTGGACCGAAGAAATTCAACCGGTTCAATCAGGTCTGG 

TTCAAGCCAACGAGTTCAGCAAAGTGAATGGATATTCCCAAAGCATGAACCAAGATAGCA 

TGTTCAACAAATCAGCAACCAACCCGCGATTGACATGGAACGAATTACTTCAACCGGTTC 

AATCAGATCTGGTTCAATCCAATGAGTTTAGCCAATTCAGTGACTATTCTCAAATCATGA 

ACGAAGATAACATGTTCAACAAAGCAGCAAAGAAACCGCGGATGACATGGAGTGAAGTAT 

TTCAACCGGTTCAATCACATCTGGTTCCGACTGACGGTTTAGACCGAGACCACTTTGATT 

CCATAACCATAAACGGAGGTAACGGCATACAAAACATGGAAAAGAAACAAGGAAAAAAAC 

CACGGAAGCCGCGGATGACGTGGACCGAAGAGCTTCACCAAAAATTTCTGGAAGCCATCG 

AAATAATTGGTGGTATCGAAAAAGCTAACCCAAAGGTACTTGTCGAATGCTTGCAAGAAA 

TGAGGATAGAAGGAATTACTAGAAGCAATGTGGCAAGTCATCTTCAGAAACACCGTATCA 

ATCTTGAAGAAAACCAAATTCCTCAACAAACACAAGGGAATGGTTGGGCCACTGCGTATG 

GTACACTAGCTCCCTCTCTCCAAGGTTCAGACAATGTCAACACAACAATACCATCGTACC 

TTATGAATGGTCCAGCCACTTTGAACCAAATCCAGCAGAATCAATATCAAAATGGTTTCT

TGACAATGAACAACAACCAGATGATAACCAATCCTCCGCCTCCTTTGCCCTATTTGGACC 

ATCATCACCAACAGCAACATCAGTCTTCTCCTCAATTTAATTACCTGATGAACAATGAAG

AACTTCTTCAAGCCTCTGGCCTCTCTGCGACAGATCTTGAACTCACTTATCCAAGTTTAC 

CATATGATCCACAAGAGTATCTAATCAATGGCTACAATTATAATTAGTCATATAGCCCTT 

CTCTTTACTTAAGGCAGTCTATGTATGACAAATAATATGCGACTTCCCTTGTGAGTCACA 
ATATTGTTTCATTATTC 

>G2430 Amino Acid Sequence (domain in AA coordinates: 425-478)
MLVGKISGYEDNTRSLERETSEITSLLSQFPGNTNVLVVDTNFTTLLNMKQIMKQYAYQV 
SIETDAEKALAFLTSCKHEINIVIWDFHMPGIDGLQALKSITSKLDLPVVIMSDDNQTES
VMKATFYGACDYVVKPVKEEVMANIWQHIVRKRLIFKPDVAPPVQSDPARSDRLDQVKAD

FKIVEDEPIINETPLITWTEEIQPVQSDLVQANKFDQVNGYSPIMNQDNMFNKAPPKPRM

TWTEVIQPVQSNLVQTKEFGQLNDYSQIMNQDSMYNKAATKPQLTWTEEIQPVQSGLVQA

NEFSKVNGYSQSMNQDSMFNKSATNPRLTWNELLQPVQSDLVQSNEFSQFSDYSQIMNED

NMFNKAAKKPRMTWSEVFQPVQSHLVPTDGLDRDHFDSITINGGNGIQNMEKKQGKKPRK

PRMTWTEELHQKFLEAIEIIGGIEKANPKVLVECLQEMRIEGITRSNVASHLQKHRINLE

ENQIPQQTQGNGWATAYGTLAPSLQGSDNVNTTIPSYLMNGPATLNQIQQNQYQNGFLTM

NNNQMITNPPPPLPYLDHHHQQQHQSSPQFNYLMNNEELLQASGLSATDLELTYPSLPYD
PQEYLINGYNYN*

>G2517 (66..899)

TCCTCACTCTCTCTCTTTTTCTCTAACC^TAAAATCTCTTTGATCTCTTTCTCTGTGTTT 
TGATAATGGAAAATGTTGGTGTTGGGATGCCGTTTT 

ACCCACTCTTGTCTGATTTCCACGATTTATCGGCGGAGAGGTATCCGGTAGGGTTCATGG 
ATTTACTGGGTGTTCATCGTCATAC7^CCCACCCATACGCCGTTGATGCATTTTCCGAC^ 
CACCTAACTCGTCCTCGAGCGAAGCTGTGAATGGAGATGACGAAGAAGAAGAAGATGGAG 
AAGAACAGCAGCATAAGACAAAGAAGCGGTTTAAATTCACTAAAATGAGTAGAAAGCAGA 
CGAAGAAGAAGGTGCCAAAAGTGTCATTCATCACGAGGAGTGAGGTTCTTCATCTAGATG 
ATGGTTATAAGTGGAGAAAATACGGTCAAAAACCTGTCAAAGACAGCCCTTTTCCAAG^ 
ATTATTACCGTTGCACAACAACTTGGTGTGACGTGAAGAAGAGAGTAGAGAGATCATTCA
GTGATCCAAGCAGTGTAATCACCACTTACGAAGGTCAACATACTCATCCTCGTCCACTAC
TCATCATGCCCAAAGAAGGCAGCTCTCCATCCAATGGCTCAGCTTCTAGGGCCCACATTG 
GCCTCCCTACACTCCCTCCTCAGCTTTTAGATTACAACAACCAACAACAACAAGCGCCGT 
CTTCTTTTGGAACCGAGTACATTAACAGGCAAGAAAAAGGAATTAATCATGATGATGATG 
ACGATCATGTTGTGAAGAAGAGTCGAACTCGGGATCTGCTGGATGGAGCTGGTTTAGTCA 
AAGATCATGGCCTTCTTCAGGATGTTGTTCCCTCTCATATCATTAAGGAAGAGTATTAGT 
TAATCGCATAATTATGTAGCTAGCTAGCTAG 

>G2517 Amino Acid Sequence (domain in AA coordinates: TBD) 

MENVGVGMPFYDLGQTRVYPLLSDFHDLSAERYPVGFMDLLGVHRHTPTHTPLMHFPTTP 

NSSSSEAVNGDDEEEEDGEEQQHKTKKRFKFTKMSRKQTKKKVPKVSFITRSEVLHLDDG

YKWRKYGQKPVKDSPFPRNYYRCTTTWCDVKKRVERSFSDPSSVITTYEGQHTHPRPLLI

MPKEGSSPSNGSASRAHIGLPTLPPQLLDYNNQQQQAPSSFGTEYINRQEKGINHDDDDD

HVVKKSRTRDLLDGAGLVKDHGLLQDVVPSHIIKEEY*

>G2521 (103..768)

ATTCTCCACAATTTCATAACTTTCTTCCGCTCAACTTCAGATAAATTCGGATTCTGTAGC 
TCTTTCAATACGACTGCGGAGATCAGAGCCAATTATTTGGTTATGGCGTCTCTGATCTCA 
GATATTGAACCGCCGACGAGTACTACTTCAGATCTCGTTCGGAGAAAGAAGAGATCCTCT
GCTTCATCCGCCGCATCGTCTCGTTCAAGCGCATCTTCCGTCTCCGGTGAGATTCACGCG 
CGATGGCGATCGGAGAAGCAACAACGGATCTACTCAGCCAAACTGTTCCAAGCGCTCCAA 
CAAGTCCGCCTCAACTCTTCCGCCTCAACATCATCATCTCCAACGGCTCAGAAACGAGGA 
AAGGCCGTCCGTGAAGCCGCCGATCGAGCTCTTGCCGTTTCCGCTCGGGGAAGAACACTC 
TGGAGCAGAGCGATCTTAGCTAATCGGATCAAACTGAAATTTCGTAAACAGAGACGTCCT 
CGAGCTACGATGGCGATTCCGGCCATGACTACGGTGGTTAGTAGCAGCAGCAACAGATCG 
AGAAAACGGAGAGTGTCGGTGTTGAGATTGAATAAGAAGAGTATACCGGATGTTAACCGG 
AAAGTACGTGTTCTAGGCCGGTTAGTTCCCGGTTGCGGTAAACAATCCGTACCGGTGATT
CTAGAAGAAGCAACTGATTATATTCAGGCTCTGGAGATGCAAGTGAGAGCCATGAACTCT 
TTAGTTCAGCTTCTCTCCTCCTACGGCTCAGCTCCTCCACCGATTTGATGAGGTTAAAAT 
CGTCTTTTTAATTCTACCATCTCTCGATCTTTCACAGCTTATGTGTATATAGAAGATTCG 
GTTTGATTATAATCTGTAACTACTCTTCCCAACCGCTGATTCTTCTCTGCTACAAGTAAA 
AGTAAATTTTGAACCGAGTCTTCCCATTTTTACGATCCTCAAGTCTAAATTAAGTATATG 
ATTGATTAATAAAGTCTTTACCATTAGGGTTC 

>G2521 Amino Acid Sequence (domain in AA coordinates: 145-213) 
MASLISDIEPPTSTTSDLVRRKKRSSASSAASSRSSASSVSGEIHARWRSEKQQRIYSAK
LFQALQQVRLNSSASTSSSPTAQKRGKAVREAADRALAVSARGRTLWSRAILANRIKLKF
RKQRRPRATMAIPAMTTVVSSSSNRSRKRRVSVLRLNKKSIPDVNRKVRVLGRLVPGCGK
QSVPVILEEATDYIQALEMQVRAMNSLVQLLSSYGSAPPPI* 
>G258 (60..983)

AGTGACCACCCTGCTGGTTAATCAACACCAAGAGACCTTGTAATATATAAGTTAGGAAGA 

TGAGAGAGAAGTGGGAAATGAAAAGAGATGAAATGGGACATCGATGTTGTGGAAAACACA 

AAGTGAAGAGAGGTCTTTGGTCTCCAGAGGAAGACGAGAAGCTTCTTCGTTATATCACCA

CTCATGGTCATCCTAGTTGGAGTTCCGTTCCAAAGCTTGCCGGGTTGCAGAGATGTGGGA

AGAGTTGCAGATTAAGGTGGATAAACTATCTAAGGCCTGATCTGAGGAGAGGTTCGTTTA 

ATGAGGAAGAAGAGCAGATTATCATCGACGTACATCGTATTCTTGGTAACAAATGGGCTC 

AGATTGCTAAGCACTTACCTGGACGCACTGATAATGAAGTCAAGAACTTTTGGAAC 

GCATTAAGAAGAAACTTCTTTCTCAAGGCTTAGATCCTTCTACACATAATCTTATGCCTT

CACACAAAAGATCTTCTTCTTCAAAGAATAATAATATCCCCAAGCCAAACAAAACGACGT

CCATCATGAAGAACCCTACTGATCTTGATCAATCAACCACTGCTTTTTCAATCACAAACA 

TCAATCCACCCACTTCCACTAAACCAAACAAACTTAAATCTCCTAACCAGACTACAATCC 

CATCTCAAACCGTGATCCCTATCAATGATAACATGTCAAGTACTCAAACCATGATCCCTA 

TCAATGATCCCATGTCAAGTCTTTTAGATGATGAGAATATGATTCCTCACTGGTCAGATG

TTGATGGAATGGCGATCCACGAAGCTCCGATGTTGCCTAGTGATAAGGCAGTAGTGGGAG 

TGGATGATGATGATCTCAACTVTGGAC^TTCT^ 

ATCCTGATTTTGCTTCCATTTTCTCCTCTGCAATGTCTATCGATTTCAATCCCATGGATG 
ATCTTGGCAGCTGGACCTTTTAGCTTTTACTCTACAGC 

>G258 Amino Acid Sequence (domain in AA coordinates: 24-124) 
MREKWEMKRDEMGHRCCGKHKVKRGLWSPEEDEKLLRYITTHGHPSWSSVPKLAGLQRCG
KSCRLRWINYLRPDLRRGSFNEEEEQIIIDVHRILGNKWAQIAKHLPGRTDNEVKNFWNS
CIKKKLLSQGLDPSTHNLMPSHKRSSSSNNNNIPKPNKTTSIMKNPTDLDQSTTAFSITN
INPPTSTKPNKLKSPNQTTIPSQTVIPINDNMSSTQTMIPINDPMSSLLDDENMIPHWSD

VDGMAIHEAPMLPSDKAVVGVDDDDLNMDILFNTPSSSAFDPDFASIFSSAMSIDFNPMD

DLGSWTF* 

>G280 (108..722)

AAGTTAATATGAGAATAATGAGAAAACCACTTTCCCAAATTGCTTTTTAAAATCCCTCCT 
CACACAGATTCCTTCCTTCATCACCTCACACACTCTCTACGCTTGACATGGCCTTCGATC 
TCCACCATGGCTCAGCTTCAGATACGCATTCATCAGAACTTCCGTCGTTTTCTCTCCCAC 
CTTATCCTCAGATGATAATGGAAGCGATTGAGTCCTTGAACGATAAGAACGGCTGCAACA 
AAACGACGATTGCTAAGCACATCGAGTCGACTCAACAAACTCTACCGCCGTCACACATGA 
CGCTGCTCAGCTACCATCTCAACCAGATGAAGAAAACCGGTCAGCTAATCATGGTGAAGA 
ACAATTATATGAAACCAGATCCAGATGCTCCTCCTAAGCGTGGTCGTGGCCGTCCTCCGA 
AGCAGAAGACTCAGGCCGAATCTGACGCCGCTGCTGCTGCTGTTGTTGCTGCCACCGTCG 
TCTCTACAGATCCGCCTAGATCTCGTGGCCGTCCACCGAAGCCGAAAGATCCATCGGAGC 
CTCCCCAGGAGAAGGTCATTACCGGATCTGGAAGGCCACGAGGACGACCACCGAAGAGAC 
CGAGAACAGATTCGGAGACGGTTGCTGCGCCGGAACCGGCAGCTCAGGCGACAGGTGAGC 
GTAGGGGACGTGGGAGACCTCCGAAGGTGAAGCCGACGGTGGTTGCTCCGGTTGGGTGCT 
GAATTAATCGGTACTTATGCAATTTCGGAATCTTTAGTTACTGAAAAATGGAATCTCTTA 
GAGAGTAAGAGAGTGCTTTAATTTAGCTTAATTAGATTTATTTGGATTTCTTTCAGTATT 
TGGATTGTAAACTTTAGAATTTGTGTGTGTGTTGTTGCTTAGTCCTGAGATAAGATATAA 
CATTAGCGACTGTGTATTATTATTATTACTGCATTGTGTTATGTGAAACTTTGTTCTCTT 
GTTGAAAAAAAAAAAAAAAAAAA 

>G280 Amino Acid Sequence (domain in AA coordinates: 97-104, 130-137, 155-162, 185-192)

MAFDLHHGSASDTHSSELPSFSLPPYPQMIMEAIESLNDKNGCNKTTIAKHIESTQQTLP 

PSHMTLLSYHLNQMKKTGQLIMVKNNYMKPDPDAPPKRGRGRPPKQKTQAESDAAAAAVV

AATVVSTDPPRSRGRPPKPKDPSEPPQEKVITGSGRPRGRPPKRPRTDSETVAAPEPAAQ

ATGERRGRGRPPKVKPTVVAPVGC*

>G3 (16..477)

GTTTGTCTTTTATCAATGGAAAGAGAACAAGAAGAGTCTACGATGAGAAAGAGAAGGCAG 

CCACCTCAAGAAGAAGTGCCTAACCACGTGGCTACAAGGAAGCCGTACAGAGGGATACGG 

AGGAGGAAGTGGGGCAAGTGGGTGGCTGAGATTCGTGAGCCTAACAAACGCTCACGGCTT 

TGGCTTGGCTCTTACACAACCGATATCGCCGCCGCTAGAGCCTACGACGTGGCCGTCTTC 

TACCTCCGTGGCCCCTCCGCACGTCTCAACTTCCCTGATCTTCTCTTGCAAGAAGAGGAC 

CATCTCTCAGCCGCCACCACCGCTGACATGCCCGCAGCTCTTATAAGGGAAAAAGCGGCG 

GAGGTCGGCGCCAGAGTCGACGCTCTTCTAGCTTCTGCCGCTCCTTCGATGGCTCACTCC 

ACTCCGCCGGTAATAAAACCCGACTTGAATCAAATACCCGAATCCGGAGATATATAGTCA 

ATTTATATACATGTAGTTTGTTTTGTTTGATTAGAAGATTACATTTAC^TACAAGATA^ 

CATAGATACTGGAAAATATAGGTATGTATACATTCATAAATTATCTTATGTATCAAAGAA 

TTTTATAGATTCTGATTAGCTTTTTGTTTTTGTTTTTGATAAGAACT 

CGGAGACAAAACCGGCTAAGAGCAATCCATGAGAAGCTAGCGAGTGTTTTTTAGTTCAAG 

TTGTAATATAAATGCATATTAATTCTTTAGTAATTTTGT 

>G3 Amino Acid Sequence (domain in AA coordinates: 28-95) 
MEREQEESTMRKRRQPPQEEVPNHVATRKPYRGIRRRKWGKWVAEIREPNKRSRLWLGSY 
TTDIAAARAYDVAVFYLRGPSARLNFPDLLLQEEDHLSAATTADMPAALIREKAAEVGAR
VDALLASAAPSMAHSTPPVIKPDLNQIPESGDI*
>G343 (1..795) 

ATGGACGTCTATGGCTTATCTTCACCAGACTTACTTCGAATCGACGACCTTCTTGATTTC
TCCAACGAAGACATCTTCTCCGCTTCTTCTTCCGGTGGTTC 

TCTTCTTTCCCTCCTCCTCAAAACCCTAGTTTCCACCACCACCATCTCCCTTCCTCCGCC 
GATC^TCACTCCITCCTCCACGACATTTGCGTTCCCAGTGATGACGCAGCTC^ 
TCGCTTTCGCAATTCGTGGACGATTCTTTCGCTGATT 
ACTATC&ACTTCTGTCAAAACTGAAACTTCCTTTC 

AGAGCTCCTGCTCCTTTCGCCGGAACATGGTCTCCGATGCCACTGGAATCCGAGCATCAG 
CAGCTTCACTCCGCCGCCAAATTCAAGCCAAAGAAAGAACAATCCGGCGGAGGAGGAGGA 
GGAGGAGGAAGACATCAGTCATCGTCATCGGAGACTACGGAAGGAGGAGG^TGAGGAGA 
TGTACTCACn^TGCATCGGAGAAAACGCCACAGTGGAGGA(^GGACCACTTGGACCTAAA
ACACTATGTAACGCTTGTGGAGTCCGGTTTAAATCCGGTAGACTTGTACCGGAATATAGA
CCGGCTTCGAGTCCTACTTTTGTTTTGACTCAGCATTCAAACTCTCACCGGAAAGTGATG 
GAGCTTCGACGGCAGAAAGAAGTTATGAGACAACCACAACAAGTTCAACTTCATCACCAC 

CACCACCCGTTTTAG 

>G343 Amino Acid Sequence (domain in AA coordinates: 178-214) 

MDVYGLSSPDLLRIDDLLDFSNEDIFSASSSGGSTAATSSSSFPPPQNPSFHHHHLPSSA 

DHHSFLHDICVPSDDAAHLEWLSQFVDDSFADFPANPLGGTMTSVKTETSFPGKPRSKRS 

RAPAPFAGTWSPMPLESEHQQLHSAAKFKPKKEQSGGGGGGGGRHQSSSSETTEGGGMRR 

CTHCASEKTPQWRTGPLGPKTLCNACGVRFKSGRLVPEYRPASSPTFVLTQHSNSHRKVM 

ELRRQKEVMRQPQQVQLHHHHHPF*

>G363 (1..780) 

ATGAGACCAATATTAGACCTCGAAATTGAAGCTTCATCGGGCAGTAGTAGCAGCCAAGTG 

GCCTCAAACTTGTCTCCGGTTGGGGAAGATTACAAACCAATCTCGCTGAATCTTAGCCTC 

AGTTTCAACAACAACAACAACAATAATCTGGATCTTGAATCATCGTCTTTGACGCTGCCA 

CTTTCGAGCACGAGTGAGAGTAGTAACCCGGAGCAGCAGCAGCAACAACAACCATCTGTA

TCAAAGAGAGTCTTCTCTTGTAACTACTGCCAAAGGAAGTTCTATAGCTCTCAAGCGCTA 

GGTGGTCACCAAAACGCTCACAAACGTGAGAGAACACTCGCCAAACGCGCTATGCTATGG 

GTCTTGCTGGGGTCTTCCCCGGTAGAGGATCAAGTAGCAATTATGCGGCTGCTGCCACAG 

CAGCCGCTCTCGTGTTTGCCGCTTCACGGAAGCGGAAACGGGAACATGACATCGTTCAGG 

ACTTTGGGAATCCGGGCACATTCCTCGGCGCACGACGTCAGCATGACAAGGCAGACACCA 

GAAACACTTATTAGAAACATTGCCAGGTTCAACCAGGGGTATTTCGGTAATTGTATACCT 

TTTTACGTGGAGGACGACGAGGCCGAGATGCTCTGGCCGGGGAGTTTCCGGCAAGCTACG 

AATGCGGTTGCGGTTGAAGCGGGTAATGATAATTTAGGTGAAAGAAAAATGGATTTCTTG 

GACGTCAAGCAAGCGATGGATATGGAAAGTTCTCTTCCAGATCTAACCTTGAAGCTTTGA 

>G363 Amino Acid Sequence (domain in AA coordinates: 87-108) 

MRPILDLEIEASSGSSSSQVASNLSPVGEDYKPISLNLSLSFNNNNNNNLDLESSSLTLP 

LSSTSESSNPEQQQQQQPSVSKRVFSCNYCQRKFYSSQALGGHQNAHKRERTLAKRAMLW 

VLLGSSPVEDQVAIMRLLPQQPLSCLPLHGSGNGNMTSFRTLGIRAHSSAHDVSMTRQTP

ETLIRNIARFNQGYFGNCIPFYVEDDEAEMLWPGSFRQATNAVAVEAGNDNLGERKMDFL

DVKQAMDMESSLPDLTLKL* 

>G370 (1..774) 

ATGGACGAAACCAACGGACGAAGAGAAACTCACGATTTCATGAACGTCAACGTTGAATCC 

TTCTCTCAGCTTCCTTTCATCCGCCGTACTCCTCCCAAAGAAAAAGCCGCCATTATTCGT 

CTCTTCGGCCAAGAGCTCGTCGGTGATAACTCCGACAACTTATCCGCAGAACCTTCTGAT

CATCAAACCACTACCAAGAACGATGAGAGCTCTGAGAATATCAAGGACAAAGACAAAGAA 

AAAGATAAGGACAAAGACAAAGATAACAACAACAACAGGAGATTCGAGTGTCACTACTGC 

TTCAGAAACTTCCCAACTTCTCAAGCCCTAGGTGGACATCAAAACGCTCACAAACGTGAA 

CGTCAACACGCCAAACGCGGTTCCATGACATCATACCTTCATCATCATCAGCCTCATGAC 

CCTCACCACATCTACGGCTTCCTCAACAACCACCACCACCGTCACTATCCGTCTTGGACG 

ACGGAAGCTAGATCATACTACGGCGGAGGGGGACATCAAACGCCGTCGTACTACTCAAGG 

AATACTCTTGCTCCTCCTTCTTCTAACCCACCGACAATCAACGGAAGTCCTTTAGGTTTG 

TGGCGTGTACCGCCTTC^CGTCAACAAATACTATTCAAGGCGTTTACTCATCTTCACCA 

GCTTCAGCGTTTAGGTCGCATGAGCAAGAGACTAATAAGGAGCCTAATAACTGGCCGTAC 

AGATTGATGAAACCCAATGTGCAAGATCATGTGAGTCTCGATCTTCATCTCTGA 

>G370 Amino Acid Sequence (domain in AA coordinates: 97-117)

MDETNGRRETHDFMNVNVESFSQLPFIRRTPPKEKAAIIRLFGQELVGDNSDNLSAEPSD

HQTTTKNDESSENIKDKDKEKDKDKDKDNNNNRRFECHYCFRNFPTSQALGGHQNAHKRE

RQHAKRGSMTSYLHHHQPHDPHHIYGFLNNHHHRHYPSWTTEARSYYGGGGHQTPSYYSR

NTLAPPSSNPPTINGSPLGLWRVPPSTSTNTIQGVYSSSPASAFRSHEQETNKEPNNWPY

RLMKPNVQDHVSLDLHL* 

>G385 (37..2202)

TAGGGTTTGCTTTCAGTTTCCGGAGTATAAGAAAAG 

GCGGCTATGAACAACGCAGACAGCAATAACCACAACTACAACCACGAAGACAACAATAAT 
GAAGGATTTCTTCGGGACGATGAATTCGACAGTCCGAATACTAAATCGGGAAGTGAGAAT 
CAAGAAGGAGGATCAGGAAACGACCAAGATCCTCTTCATCCTAACAAGAAGAAACGATAT 
CATCGACACACCCAACTTCAGATCCAGGAGATGGAAGCGTTCTTCAAAGAGTGTCCTCAC 
CCAGATGACAAGCAAAGGAAACAGCTAAGCCGTGAATTGAATTTGGAACCTCTTCAGGTC
gacgaac^ccaactccgtctcgaaaatg^

GCAATCGCAGCTAAATACGTAGGCAAGCCAGTCTCAAACTATCCACTTATGTCTCCTCCT 
CCTCTTCCTCCACGTCCACTAGAACTCGCCATGGGAAATATTGGAGGAGAAGCTTATGGA 

^aatccaaacgatctccttaagtccatcac^ 

ATCATCGACTTATCCGTGGCTGCAATGGAAGAGCTCATGAGGA^ 

CCTCTGTGGAAGAGTTTGGCTTTAGACGAAGAAGAATATGCAAGGACCT^CCTAG^GGG 
ATCGGACCTAGACCGGCTGGATATAGATCAGAAGCTTCGCGAGAAAGCGCGGTTGTGATr 

GCGGGGATGGTTTCTAGAGCAATGACATTAGCGGTTTTATCGACAGGAGTTGCAGGlSr 
TATAATGGAGCTCTTCAAGTGATGAGCG^ 



TCTTTTTGGATCCCTGTTCCTCCAAAGCGAGTCTTTGACTTCCTCAGAGAcS 

aacgggagggataccggaaactgtgtttctcttcttcgggtaaatagtgcaSScS 

CAGAGCAATATGCTGATCCTACAAGAGAGCTGCATTGATCCTACAGCTTCCTTTGTGATr 

g t £:s^ 



GTGGCTCTGCTTCCATCAGGTTTTGCTAOTCTTCCTGATGGTAATGCCAATAGTGGAG^ 

I 



CCTACGGCTAAGCTGTCTCTTGGCTCTGTTGCAACTGTCAATAATCTAATAGCTTGCACT 



rGACTCAGTT 



>G385 Amino Acid Sequence (domain in AA coordinates- «n 



^ERHENSHLRABNEKLRNDNLRVEBALANASCPNCGG PTAIGEMS FDEHQLrSnmJ 
REEIDRISAIAAKYVGKPVSNYPLMSPPPLPPRPLELAMGNIGGEAYGNNPNDLL^MA 



o^ff*^ IIDLSVAAMEELMRMV Q VD EPLWKSLALDEEEYARTFPRGIGPRPAGYKSPA 
SRESAVVIMNHVNIVEILMDVNQWSTIFAGMVSRAMTIAVLSTGVA^ 

Q^splvptretyfaryckqqgixsswavvdislds^^^ 

ol™ SAATS ™ IPVPP ^ VroFL ^SRNEWDI LS NGGWQE^IANGRDTCNOT^^^ 

^sanssqs^ilqescidptasfviyapvdivajwiSg^dySSIgfa^ 

>G439 (128..967)
ACAGATATTGTCTCCGTCTATCAACGCAAAGATCGAATCCATCTGCAATAGTTCTGATCT
TCCACTGCCTCAGATCGAGAAACAGAACAAAACAGAGGAGGTGCTCTCTGGTTTTTCCAA 
ACCGGAGAAAGAACCGGAATTTGGGGAGATATACGGATGCGGATACTCGGGCTCATCTCC 
TGAGTCGGATATAACGTTGTTGGATTTCTCAAGCGACTGTGTGAAAGAAGATGAGAGTTT 
CTTGATGGGTTTGCACAAGTATCCTTCTTTGGAGATTGATTGGGACGCTATAGAGAAACT 
CTTCTGAATCCATTTTATCTTTTTGATTCATTTGTCTCTAAATTGTAGAATTTTATTTTC 
AGAGCTTTGTAAGGGAAGTTCTTGAATGAGAGTTGCAGAGGACTAGTGGAACCTAACTCT 
GTTTTCTTTTGTAAGTATTGTTTATAATGGGCCGTTGAATGGGCCTTATTGATTTAAACA 
GCCCAAGTTTTTAAAAAAAAAAAAAAAAAAAAAAAAA 

>G439 Amino Acid Sequence (domain in AA coordinates: 110-177) 

mamalnmnayvdefmealep™ 

GLNQLTPTQILQIQTELHLRQNQSRRRAGSHLLTAKPTSMKKIDVATKPVKLYRGVRQRQ 

WGKWVAEIRLPKNRTRLWLGTFETAQEAALAYDQAAHKIRGDNARLNFPDIVRQGHYKQI 

LSPSINAKIESICNSSDLPLPQIEKQNKTEEVLSGFSKPEKEPEFGEIYGCGYSGSSPES 

DITLLDFSSDCVKEDESFLMGLHKYPSLEIDWDAIEKLF* 

>G440 (237..1301)

AAAAAATCACTGTTTCATAACACGTTTTTCTCTCTCACCCACCAAAAAAAAATCTTTTGT 

TCTTGTTACCAAAAAATCTCGTGATAAATCTCTTCAAACTTTGTTTTATTTTCTTCTTGA 

TTCTCTCGAAATCTCTCTCAACAAACCCAGAAACTTTCCTTGATTCGCAAGCTTTTCTTC 

CTTTTATATTCTTCATTTTGATGCGAATATAGAGAGAGTCCATAAAAGAAACAGTAATGG 

ACGAATATATTGATTTCCGACCATTGAAGTACACAGAGCACAAGACTTCAATGACTAAAT 

ACACCAAAAAGTCATCGGAAAAACTTTCCGGTGGTAAGTCATTGAAAAAGGTTAGTATTT 

GTTATACTGATCCTGACGCAACAGATTCATCAAGTGACGAAGACGAAGAAGATTTCTTGT 

TTCCTCGCCGGAGAGTCAAAAGATTCGTTAACGAGATCACTGTTGAGCCTAGCTGTAACA 

ACGTCGTCACCGGAGTTTCGATGAAAGATAGAAAGAGACTCTCTTCTTCCTCCGATGAAA 

CTCAATCTCCGGCGTCGAGTCGTCAACGTCCTAATAACAAAGTTTCAGTCTCCGGTCAGA

TAAAGAAGTTCCGTGGTGTTAGACAACGGCCATGGGGGAAATGGGCGGCGGAGATTAGAG

ATCCGGAGCAACGTCGGAGGATTTGGCTCGGGACTTTTGAGACGGCGGAGGAAGCTGCCG 

TGGTTTATGATAACGCCGCTATAAGACTCCGTGGACCGGACGCTTTAACTAATTTCTCCA 

TACCGCCTCAAGAAGAGGAAGAAGAAGAAGAACCGGAACCGGTTATTGAGGAGAAACCGG 

TTATTATGACGACGCCAACACCAACAACATCGAGTTCTGAATCAACTGAAGAAGATTTAC 

AACATCTCTCATCTCCTACTTCGGTTCTCAATCACCGGTCAGAAGAGATTCAACAAGTAC

AACAACCGTTTAAATCAGCTAAACCCGAACCGGGGGTTTCAAATGCACCATGGTGGCATA 

CCGGGTTTAATACCGGTTTAGGTGAATCAGACGATTCATTTCCTTTGGATACTCCGTTTC 

TTGACAACTATTTCAATGAATCACCACCAGAGATGTCAATATTTGACCAACCAATGGATC 

AAATTTTCTGTGAAAATGATGATATCTTCAATGATATGTTGTTCTTGGGTGGTGAAACTA

TGAACATTGAAGATGAGTTAACAAGTTCTAGTATCAAAGATATGGGTTCAACGTTTAGTG 

ATTTTGATGATTC^TTGATATCAGATCTATTAGTTGCTTAATATGATGATGAGAGTGAAG 

AAGAAACCATCAAGCAAATATCTATGGTGTGACTGAAAAATTTTGGTGTTACTTTTTTTT 

CTTTCATAAGTTCATGAGCTTTTTTGTTT^ 

GGAGCTTGTAAAACAGTTTTGGAGAAATAGTGGAAAAATAGTTTAATTAAAAAAAAAAAA 
AAAAAAA 

>G440 Amino Acid Sequence (domain in AA coordinates: 122-189) 
MDEYIDFRPLKYTEHKTSMTKYTKKSSEKLSGGKSLKKVSICYTDPDATDSSSDEDEEDF 

LFPRRRVKRFVNEITVEPSCNNVVTGVSM

QIKKFRGVRQRPWGKWAAEIRDPEQRRRIWLGTFETAEEAAVVYDNAAIRLRGPDALTNF
SIPPQEEEEEEEPEPVIEEKPVIMTTPTPTTSSSESTEEDLQHLSSPTSVLNHRSEEIQQ
VQQPFKSAKPEPGVSNAPWWHTGFNTGLGESDDSFPLDTPFLDNYFNESPPEMSIFDQPM
DQIFCENDDIFNDMLFLGGETMNIEDELTSSSIKDMGSTFSDFDDSLISDLLVA*

>G5 (417..1421)

TTTTTTTTTTGCAATCTCCCCCTAATCTGCT 

TGTCTTTCAAAAAGAAAGAAAAAAGAAAAATTCGATTTCTGGGTTTGTTTTO 

GAAAAAAATCAAGCTTATGAATTTGTGTTTAATTTTC 

TTTTCAGAACGAGATCGTTTTTTCAAATTTCTTCTGATTTO 

GATTTTAGTGAATCGAGGGTGAAATTTTTGATTCCCTCTTTTCGGATCTACACAGAGGTT 

GCTTATTTCAAACCTTTTAGATCCATTTTTTTTTAATTTTCTC^ 

TTTACTTiriTATAAGTCTCAGGTTCAATTTTTTCGGATTCAAAT^^
CAGCTGCTATGAATTTGTACACTTGTAGCAGATCGTTTCAAGACTCTGGTGGTGAACTCA
TGGACGCGCTTGTACCTTTTATCAAAAGCGTTTCCGATTCTCCTTCTTCTTCTTCTGCAG 
CGTCTGCGTCTGCGTTTCTTCACCCCTCTGCGTTTTCTCTCCCTCCTCTCCCCGGTTATT 
ACCCGGATTCAACGTTCTTGACCCAACCGTTTTCATACGGGTCGGATCTTCAACAAACCG 
GGTCATTAATCGGACTCAACAACCTCTCTTCTTCTCAGATCCACCAGATCCAGTCTCAGA 
TCCATCATCCTCTTCCTCCGACGCATCACAACAACAACAACTCTTTCTCGAATCTTCTCA 
GCCCAAAGCCGTTACTGATGAAGCAATCTGGAGTCGCTGGATCTTGTTTCGCTTACGGTT 

AATGGGTGGCTGAGATCCGTTTGCCGAGAAATCGGACTCGTCTCTGGCTTGGGACTTTTG 
ACACGGCGGAGGAAGCTGCGTTGGCCTATGATAAGGCGGCGTAC^GCTGCGCGGCGATT 

GTGAATATAAACCTCTTCACTCCTCAGTCGACGCTAAGCTTGAAGCTATrrGTAAAAGCA 
TGGCGGAGACTCAGAAACAGGACAAATCGACGAAATCATCGAAGAAACGTGAGAAGAAGG 

tttcgtcgccagatctatcggaga^^ 

GATCTCCACCGGTGACGGAGTTTGAAGAGTCCACCGCTGGATCTTCGCCGTTGTCGGACT 

* GACG " CGCTCACCCGGAGGAGC ^ 

ATCCGTCGTACGAGATCGATTGGGATTCGATTCTAGCTTAGGGGCAAAATAGGAAATTCA 

GCCGCTTGCAATGGAGTTTTrGTGAAATTGCATGACTGGCCCAAGAGTAA^AA^AA^ 
ATGGATTAGTGTTAAATTTCGTATGTTAATATTTGTATTAT^n^^^.^^ 



„ w ™ " * * * ^^"""^ J-^^J- ^AtJTuucccAAGAGTAATTAATTAAAT 

ATGGATTAGTGTTAAATTTCGTATGTTAATATTTGTATTATGGTTTGTATTAGTCTCTCT 

TAATCTTTTTTTCTTTTGTCTTATGTAATTTGTAGCTTCAGTTTCTTCATCTATAATGCA 
ATTTTATTATGATTATGTG

>G5 Amino Acid Sequence (domain in AA coordinates: 149-216) 

MAAAMNLYTCSRSFQDSGGELMDALVPFIKSVSDSPSSSSAASASAFLHPSAFSLPPLPG 

YYPDSTFLTQPFSYGSDLQQTGSLIGLNNLSSSQIHQIQSQIHHPLPPTHHNNNNSFSNL 

SMAETQKQDKSTKSSKKRBKKVSSPDLSEKVKAEENSVSIGGSPPVTEFEESTAGSSPLS 
DLTFADPEEPPQWNETFSLEKYPSYEIDWDSILA*
>G550 (1..1374) 



ATGGCTGATCCGGCGATTAAGCTCTTTGGAAAGACGATTCCTTTACCTGAGCTTGGTGTT 
GTTGATTCTTCTTCTAGCTATACCGGATTTTTAACCGAAACTCAGATTCCTGTTCGCT 

* cA ° A r cGTGT ^^ 

GAAGAAGGTGATGATGTTGGTGATGGTGGAGGAGAGA 

AAAGATAGTGAGTGTCAGGAAGAGTCATTGAGGAATGAATCTAATGATG^?^ 
ACATCGGGTATAACTGAAAAAACfiRAaar-a*™* a »r, , „ „ „ i ! L1ACi 



V^rTm^, urtij i <-AUUAAGAGTCATTGAGGAATGAATCTAATGATGTTACTACTACT 
ACA ™ GTATAACTCAAAAAACGGAAACAACA ^^ 
GGTGGTACTGCTTGCTCTC^UiGAGGGGAAGlTAAAGAAACCTGATAAGATTCTACCGTGT 
CCGCGATGTAACAGCAIGGAAACCAAGTTCTGTTACTACAA^ 

CCTCGC(^TTTCTGC^UVGAAATGTCAGAGATATTGGACAGCTGGTGGAACGATGAGGAAT 

gttccggttggtgctgggagacgtaagaataagagtccagcw^^ 

GTAAGTATAACATCTGCGGAAGCTATGCA^^ 
AATCGTGCAAATCTTCTCACT^^ 



>G550 Amino Acid Sequence (domain in AA coordinates: 134-180)

ggtacsq E gklkkpdkilpcp R c N sme T kfcyy n1to ^ QPR h F ckkcqrywtagg™^
VPVGAGRRKNKSPASHyNRHVSITSAEAMQKV7VRTDLQHPNGANLLTFGSDSVLCES^4AS

GLNLVEKSLLKTQTVLQEPNEGLKITVPLNQTNEEAGTVSPLPKVPCFPGPPPTWPYAWN 

GVSWTILPFYPPPAYWSCPGVSPGAWNSFTWMPQPNSPSGSNPNSPTLGKHSRDENAAEP 

GTAFDETESLGREKSKPERCLWVPKTLRIDDPEEAAKSSIWETLGIKKDENADTFGAFRS 

STKEKSSLSEGRLPGRRPELQANPAALSRSANFHESS* 

>G670 (28..1152)

CACAGCATTGCAGCTGTGAATAACTAAATGGGGAGACATTCTTGCTGTTACAAACAAAAG 
CTGAGGAAAGGGCTTTGGTCTCCTGAAGAAGACGAGAAGCTTCTTACTCACATCACCAAT 
CACGGCCATGGCTGCTGGAGCTCTGTCCCTAAACTCGCTGGTTTGCAGAGATGTGGGAAG 
AGTTGTCGACTCGAGCAGATCTGGTACCGCCGACTAAGATGGATCAATTACTTGAGACCT 
GATTTAAAGAGAGGAGCTTTTTCTCCTGAAGAAGAGAATCTCATCGTCGAACTTCATGCC 
GTCCTTGGAAACAGATGGTCACAGATTGCGTCAAGGCTTCCGGGTAGAACCGACAACGAG 
ATCAAGAATCTATGGAACTCAAGCATCAAGAAGAAACTGAAACAAAGAGGCATTGACCCA 
AACACACACAAGCCCATCTCTGAAGTGGAGAGTTTTAGCGACAAAGACAAACCAACAACA 
AGCAACAACAAAAGAAGCGGTAACGATCACAAGTCTCCTAGTTCCTCTTCTGCGACTAAC
CAAGACTTCTTCCTCGAAAGGCCATCTGATTTATCCGACTACTTCGGATTTCAGAAGCTT 
AACTTCAACTCCAATCTAGGACTCTCTGTTACAACTGATTCTTCACTCTGCTCGATGATT 
CCGCCGCAGTTTAGCCCCGGGAACATGGTTGGTTCTGTCCTTCAGACACCAGTATGCGTA 
AAGCCCTCGATTAGTCTTCCTCCCGACAACAACAGTTCGAGTCCTATCTCCGGAGGAGAT 
CATGTGAAATTGGCTGCACC7^AACTGGGAATTTCAGACAAACAACAATAATACCTC7^AT 
TTCTTCGACAATGGCGGATTCTCATGGTCTATCCCAAATTCTTCTACTTCTTCTTCAC7VA 
GTCAAACCAAATCATAACTTCGAAGAAATAAAATGGTCAGAGTATTTGAACACACCGTTC 
TTCATAGGGAGTACTGTACAGAGTCAAACCTCTCAACCAATCTACATCAAATCAGAAACA 
GATTACTTAGCCAATGTTTCAAACATGACAGATCCTTGGAGCCAAAACGAGAACTTGGGC 
ACAACTGAAACTAGTGACGTGTTCTCCAAGGATCTTCAGAGAATGGCCGTCTCTTTTGGT 
CAGTCCCTTTAGCTTTTTTCTTTCTTTCTTTCTTATTTCTAACAGATGTAGAGAACATAA 
AGATATACAAATACATACAATGTCAATACGTACAGTGGATTTAAGTGTTCTGTATATTTC 
ATGGGCGAGCTGTCTTTATTTTTATGTTTAAAAAAAAAAAAAAAAAA 
>G670 Amino Acid Sequence (domain in AA coordinates: 14-122) 
MGRHSCCYKQKLRKGLWSPEEDEKLLTHITNHGHGCWSSVPKLAGLQRCGKSCRLEQIWY
RRLRWINYLRPDLKRGAFSPEEENLIVELHAVLGNRWSQIASRLPGRTDNEIKNLWNSSI
KKKLKQRGIDPNTHKPISEVESFSDKDKPTTSNNKRSGNDHKSPSSSSATNQDFFLERPS
DLSDYFGFQKLNFNSNLGLSVTTDSSLCSMIPPQFSPGNMVGSVLQTPVCVKPSISLPPD
NNSSSPISGGDHVKLAAPNWEFQTNNNNTSNFFDNGGFSWSIPNS

IKWSEYLNTPFFIGSTVQSQTSQPIYIKSETDYLANVSNMTDPWSQNENLGTTETSDVFS 

KDLQRMAVSFGQSL* 

>G760 (175..1878)

TGCTTAATTCCAATGCCATCGTGATCGATTCATCTCTCTCTCTCTCTTCCAATTTTCCCA 

ATTCTTTTTTAAAACCCTAATTTTTCAGATATCTGATTATCTCTTGTATTTCTTCTACTC 

GATTTGCTCCCATAAAAACCCTTACTTTCTTCAAGTTCTGGTTTTCACCGATO 

CGTGGCTCAGTGACGTCGCTTGCTCCTGGGTTCCGTTTTCACCCGACGGATGAGGAACTT 

GTTCGCTACTACCTTAAGCGTAAGGTCTGCAACAAACCCTTTAAGTTCGATGCTATTTCC 

GTCACCGACATATACAAGTCTGAGCCTTGGGATCTACCAGATAAGTCGAAGCTGAAAAGT 

AGAGACTTGGAATGGTACTTCTTTAGTATGCTGGATAAGAAGTACAGTAATGGTTCCAAG 

ACGAATCGTGCTACGGAGAAAGGGTATTGGAAGACGACTGGGAAAGATCGGGAGATTCGT 

AATGGTTCAAGAGTCGTTGGGATGAAGAAGACACTTGTTTATCACAAGGGTCGAGCTCCT 

CGTGGTGAAAGGACCAATTGGGTTATGCATGAGTATCGGCTTTCTGATGAGGACTTGAAG 

AAAGCTGGTGTGCC^CAAGAAGCATATGTGTTATGTAGGATATTCCAGAAT^AGTGGTACG 

GGTCCTAAGAATGGGGAGCAGTATGGTGCTCCTTATCTTGAGGAGGAGTGGGAAGAAGAT 

GGAATGACTTATGTACCTGCTCAAGATGCTTTCAGTGAAGGATTGGCTTTGAATGATGAT

GTTTATGTCGATATTGATGACATTGACGAGAAGCCCGAAAATCTGGTGGTCTATGATGCC

GTTCCTATTCTACCTAACTATTGTCATGGGGAATCAAGTAACAATGTTGAATCAGGCAAT 

TACTCAGACTCTGGAAATTACATTCAACCAGGAAACAATGTTGTCGACTCTGGTGGGTAC 

TTTGAACAACCAATTGAAACTTTTGAGGAAGATCGGAAGCCTATTATACGGGAGGGTAGC 

ATTCAGCCTTGTTCTCTGTTTCCAGAGGAACAAATTGGCTGTGGTGTGCAAGACGAAAAT

GTGGTGAATCTGGAATCTTCCAACAATAATGTGTTTGTAGCTGATACATGCTACAGTGAC 

ATTCCTATTGATCATAACTATTTACCGGATGAGCCATTCATGGATCCTAATAACAATCTT
CCACTCAACGATGGTCTGTACCTGGAAACGAATGATCTCAGCTGTGCTCAACAAGATGAT

TTTAACTTCGAAGATTATCTCAGCTTCTTTGATGATGAGGGTTTGACTTTTGACGATTCT 

CTATTAATGGGACCTGAAGATTTTCTTCCCAACCAAGAAGCCCTTGACCAGAAACCTGCC 

CCTAAAGAATTGGAGAAGGAGGTCGCAGGAGGCAAAGAGGCAGTGGAGGAAAAGGAAAGT 

GGCGAAGGATCTTCTTCAAAACAAGATACAGATTTCAAGGACTTTGATTCAGCTCCGAAG 

TACCCATTTCTCAAAAAGACGAGCCACATGCTTGGAGCCATTCCTACTCCATCTTCATTT 

GCTTCACAGTTCCAAACAAAGGACGCAATGCGTCTACACGCAGCACAATCTTCTGGTTCA 

GTTCACGTGACTGCAGGTATGATGAGAATATCAAACATGACTCTAGCAGCGGACAGCGGT 

ATGGGCTGGTCATATGACAAGAACGGTAACCTCAACGTAGTCCTTTCATTCGGGGTAGTC 

CAACAGGATGATGCGATGACTGCCTCGGGAAGCAAGACAGGAATTACGGCGACAAGAGCT 

ATGTTAGTCTTCATGTGTTTATGGGTTCTCCTACTCTCTGTTAGCTTCAAAATAGTAACC 

ATGGTGTCTGCTCGGTAATAGGATCAAAGTTGAATCGTCTCAAAGACTTTTTTTGGTGTT 

TGTACCTCTCCAATCATATAGCCTTTAACTTTGGCAGTGCTTTGCTGCTCAATATTTAAA 
TTTTAAAAAAAAAAAAAAAAA 

>G760 Amino Acid Sequence (domain in AA coordinates: 12-156)
MGRGSVTSLAPGFRFHPTDEELVRYYLKRKVCNKPFKFDAISVTDIYKSEPWDLPDKSKL 
KSRDLEWYFFSMLDKKYSNGSKTNRATEKGYWKTTGKDREIRNGSRVVGMKKTLVYHKGR
APRGERTNWVMHEYRLSDEDLKKAGVPQEAYVLCRIFQKSGTGPKNGEQYGAPYLEEEWE
EDGMTYVPAQDAFSEGLALNDDVYVDIDDIDEKPENLVVYDAVPILPNYCHGESSNNVES
GNYSDSGNYIQPGNNVVDSGGYFEQPIETFEEDRKPIIREGSIQPCSLFPEEQIGCGVQD
ENVVNLESSNNNVFVADTCYSDIPIDHNYLPDEPFMDPNNNLPLNDGLYLETNDLSCAQQ
DDFNFEDYLSFFDDEGLTFDDSLLMGPEDFLPNQEALDQKPAPKELEKEVAGGKEAVEEK
ESGEGSSSKQDTDFKDFDSAPKYPFLKKTSHMLGAIPTPSSFASQFQTKDAMRLHAAQSS

GSVHVTAGMMRISNMTLAADSGMGWSYDKNGNLNVVLSFGVVQQDDAMTASGSKTGITAT

RAMLVFMCLWVLLLSVSFKIVTMVSAR* 

>G831 (92..1987)

TTCTTTCATCGTTGTGTCTATTATAAATATATGTCAATTTGGTTTCTAAAAAATTCTACC 

ATTGATTGATTGATTTTTTTTTCTTTAAGAGATGAATTTATTTACAAGAATCTCATCTCG 

GACTAAGAAGGCCAATCTTTACTACGTAACCCTAGTTGCTCTTCTCTGCATCGCTAGCTA 

CCTTCTCGGTATTTGGCAAAACACGGCGGTTAATCCACGCGCCGCCTTCGATGATTCAGA 

CGGTACACCGTGCGAGGGATTCACCAGACCTAATTCTACGAAAGATCTCGACTTCGACGC 

GCATCACAACATTCAAGATCCACCTCCGGTGACGGAAACCGCCGTTAGTTTCCCGTCGTG 

TGCCGCCGCGTTGAGCGAGCACACGCCATGCGAAGACGCGAAGCGATCGTTGAAATTCTC 

GAGGGAGAGATTGGAGTATAGGCAAAGGCATTGTCCCGAGAGAGAAGAAATCTTGAAGTG 

CAGAATTCCGGCGCCGTACGGTTACAAAACGCCGTTCCGATGGCCGGCGAGTCGTGACGT 

GGCGTGGTTCGCTAATGTGCCTCACACGGAGCTTACGGTTGAGAAAAAGAATCAGAATTG 

GGTCCGGTACGAGAATGATCGGTTTTGGTTCCCTGGTGGAGGTACGATGTTTCCACGTGG 

CGCTGATGCTTACATTGATGATATCGGACGGTTGATTGATCTCAGCGACGGCTCTATCCG 

TACAGCCATCGATACCGGTTGCGGGGTGGCTAGCTTCGGTGCATATCTTTTATCAAGAAA 

CATTACAACGATGTCATTTGCACCAAGAGACACACACGAAGCTCAAGTCCAGTTCGCACT 

CGAGCGTGGTGTGCCGGCGATGATCGGAATCATGGCTACAATCCGCCTACCGTACCCTTC 

TAGAGCCTTTGATTTAGCACATTGCTCTCGTTGCCTTATTCCGTGGGGCCAAAACGATGG 

GGCTTACTTGATGGAGGTGGATAGGGTTTTAAGACCAGGAGGGTACTGGATACTTTCTGG 

ACCGCCGATTAATTGGCAGAAACGGTGGAAAGGGTGGGAACGGACCATGGATGATTTGAA 

TGCAGAGCAGACTCAGATCGAGCAGGTCGCGAGAAGCTTGTGTTGGAAGAAAGTTGTTCA 

AAGAGATGATCTTGCTATTTGGCAAAAACCCTTTAACCACATTGACTGTAAGAAAACCAG 

AGAGGTTTTGAAAAATCCGGAGTTTTGTCGTCATGATCAAGATCCCGACATGGCCTGGTA 

TACGAAGATGGATTCTTGTTTGACACCATTACCTGAAGTTGATGACGCTGAGGATCTAAA

GACGGTGGCCGGAGGGAAGGTAGAAAAGTGGCCGGCTAGATTAAACGCGATTCCTCCGAG 

AGTAAACAAAGGCGCTCTCGAGGAAATCACACCTGAAGCTTTCTTGGAGAACACGAAACT 

GTGGAAACAGAGAGTTTCTTATTACAAGAAGTTAGATTACCAGTTGGGTGAAACCGGGAG 

ATACAGAAACTTAGTCGACATGAACGCTTACCTCGGTGGATTCGCGGCGGCTCTAGCGGA 

TGATCCGGTCTGGGTCATGAACGTTGTCCCGGTCGAGGCTAAGCTCAATACGCTCGGTGT 

CATCTACGAGCGTGGTCTAATCGGAACGTATCAAAACTGGTGTGAAGCCATGTCGACGTA 

TCCAAGAACGTATGATTTTATCCATGCTGACTCGGTTTTCACATTGTACCAAGGTCAATG

TGAACCGGAGGAGATATTGTTGGAGATGGACCGAATTCTTAGACCGGGTGGTGGTGTGAT 

TATAAGAGATGACGTGGACGTTTTGATCAAGGTTAAGGAATTAACCAAAGGATTAGAATG
GGAAGGTAGAATTGCTGACCACGAGAAGGGTCCTCATGAAAGAGAGAAGATTTACTATGC
GGTGAAACAGTATTGGACCGTTCCTGCGCCTGATGAAGATAAAAACAACACTAGTGCTCT 

TATACAACAATAAATTCTCAATAATTGTTGTCGCGGCCG 

>G831 Amino Acid Sequence (domain in AA coordinates: 470-591) 

MNLFTRISSRTKKANLYYVTLVALLCIASYLLGIWQNTAVNPRAAFDDSDGTPCEGFTRP 

NSTKDLDFDAHHNIQDPPPVTETAVSFPSCAAALSEHTPCEDAKRSLKFSRERLEYRQRH 

CPEREEILKCRIPAPYGYKTPFRWPASRDVAWFANVPHTELTVEKKNQNWVRYENDRFWF

PGGGTMFPRGADAYIDDIGRLIDLSDGSIRTAIDTGCGVASFGAYLLSRNITTMSFAPRD

THEAQVQFALERGVPAMIGIMATIRLPYPSRAFDLAHCSRCLIPWGQNDGAYLMEVDRVL

RPGGYWILSGPPINWQKRWKGWERTMDDLNAEQTQIEQVARSLCWKKVVQRDDLAIWQKP

FNHIDCKKTREVLKNPEFCRHDQDPDMAWYTKMDSCLTPLPEVDDAEDLKTVAGGKVEKW

PARLNAIPPRVNKGALEEITPEAFLENTKLWKQRVSYYKKLDYQLGETGRYRNLVDMNAY

LGGFAAALADDPVWVMNVVPVEAKLNTLGVIYERGLIGTYQNWCEAMSTYPRTYDFIHAD

SVFTLYQGQCEPEEILLEMDRILRPGGGVIIRDDVDVLIKVKELTKGLEWEGRIADHEKG 

PHEREKIYYAVKQYWTVPAPDEDKNNTSALS* 

>G864 (503..1534)

TGCAAAAACATTTTCTTGTCTCTCCTCTGCCC7^AATTTTTTTTCTTTCCAGGAATATTTC 
CTAGAAAAACCCAAGCAAAGCTTTAACCCCTTCCTCCTCCAAAAGTAGCATCTTCCTCTT 
TTTCTATTTCTCCTTTCCTCTTCTTATCTCTCTCTCGTTTGTGAACGATTCCTTAAGAAT 
ATAACCAAAAGCCCTTTTCTCCTTTCTTCAACTTTCCGGGAAAAATCTTCACGCAGCAAG 
GTTTCTCTCTCGGCTCTCGCAGTGTTTTTCGGGCCTTTTGTTCTTTCTATT^AAAAAAAAA 
TTCGCGTCCTTTAAGAAAACTTTTTCCACCTAGAGAAGAAGAAGAGTATCACTCTTGTTG 
TTCAAGTTTCTCTCTTTAATAAT^AAATCCATCTTTATTCTTTGTCTTCTTTCCTTTTTGC 
TTTCCCTAATCTCTATGTTATAAACACACAGAGAGAAACAAAGTCACAGTCTCGAGTCAA 
AAACAGAGAATACGAAAGAAAAATGGAAGCGGAGAAGAAAATGGTTCTACCGAGAATCAA 
ATTCACAGAGCACAAAACCAACACGACAACAATCGTATCGGAGTTAACCAACACTCACCA 
AACCAGGATTCTTCGTATCTCAGTCACTGACCCAGACGCTACTGATTCCTCCAGTGACGA 
CGAAGAAGAAGAACATCAACGCTTTGTCTCTAAACGCCGTCGTGTTAAGAAGTTTGTCAA 
CGAAGTCTATCTCGATTCCGGTGCTGTTGTTACTGGTAGTTGTGGTCAAATGGAGTCGAA 
GAAGAGAGAAAAGAGAGCGGTTAAATCGGAGTCTACTGTTTCTCCGGTTGTTTCAGCGAC 
GACGACTACGACGGGAGAGAAGAAGTTCCGAGGAGTGAGACAGCGTCCATGGGGAAAATG 
GGCGGCGGAGATAAGAGATCCGTTGAAACGTGTACGGCTCTGGTTAGGTACTTACAACAC 
GGCGGAAGAAGCTGCTATGGTTTACGATAACGCCGCTATTCAGCTTCGTGGTCCCGACGC 
TCTGACTAATTTCTCAGTCACTCCGACAACAGCGACGGAGAAGAAAGCCCCACCACCGTC 
TCCGGTGAAGAAGAAGAAGAAGAAAAACAACAAAAGCAAAAAATCCGTTACTGCTTCTTC 
CTCCATCAGCAGAAGCAGCAGCAACGATTGTCTCTGCTCTCCGGTGTCTGTTCTCCGATC 
TCCTTTCGCCGTCGACGAATTCTCCGGCATTTCTTCATCACCAGTCGCGGCCGTTGTAGT 
CAAGGAAGAGCCATCCATGACAACGGTATCTGAAACTTTCTCTGATTTCTCGGCGCCCTT 
GTTCTCAGATGATGACGTGTTCGATTTCCGGAGCTCAGTGGTTCCCGACTATCTCGGCGG 
CGATTTATTTGGGGAAGATCTATTCACGGCGGATATGTGTACGGATATGAACTTCGGATT 
CGATTTCGGATCCGGATTATCCAGCTGGCACATGGAGGACCATTTTCAAGATATCGGGGA 
TCTATTCGGGTCGGATCCTCTTTTAGCTGTTTAATAATATTTTAAATAAATAAATAGTTA 
TACCGGCCGTTACTAAACGGAACCGGAGAAAGTTTTGTATACCGGTGACATAAAATCTCG 
GTTATGTTCGTAATCTTTTTTTCH^ 

TGTAAGTTAATGGTGATAATTATTAACGTTTTAAGTTTTGAAAAAA7VAAAAAAA 
AAAAAAA 

>G864 Amino Acid Sequence (domain in AA coordinates: 119-186) 
MEAEKKMVLPRIKFTEHKTNTTTIVSELTNTHQTRILRISVTDPDATDSSSDDEEEEHQR

FVSKRRRVKKFVNEVYLDSGAVVTGSCGQMESKKRQKRAVKSESTVSPVVSATTTTTGEK

KFRGVRQRPWGKWAAEIRDPLKRVRLWLGTYNTAEEAAMVYDNAAIQLRGPDALTNFSVT

PTTATEKKAPPPSPVKKKKKKNNKSKKSVTASSSISRSSSNDCLCSPVSVLRSPFAVDEF

SGISSSPVAAVVVKEEPSMTTVSETFSDFSAPLFSDDDVFDFRSSVVPDYLGGDLFGEDL

FTADMCTDMNFGFDFGSGLSSWHMEDHFQDIGDLFGSDPLLAV*

>G884 (31..1575)

TTTTTTTTTGTTTGTTAATTTTGGGGATCGATGTCGGA71AAGGAAGAA 
TCGAAGTCCACCGGAGCTCCGTCGCGTCCGACTTTATCTCTTCCTCCACGGCCGTTTAGT
GAGATGTTCTTTAACGGTGGCGTTGGATTCAGTCCTGGTCCGATGACTCTGGTCTCTAAT
ATGTTCCCTGATTCCGATGAGTTTAGGTCTTTCTCTCAGCTTCTCGCTGGAGCCATGTCT 
TCTCCAGCGACTGCAGCTGCTGCTGCTGCTGCTGCGACGGCTAGTGATTACCAGAGACTT
GGTGAAGGGACTAATAGCTCTAGTGGTGATGTTGACCCGAGATTCAAGCAAAACAGACCA
ACCGGTTTGATGATTTCTCAATCTCAATCGCCGTCGATGTTCACCGTACCGCCTGGTTTA 
AGTCCAGCTATGTTGCTCGATTCACCAAGCTTTTTGGGTCTTTTCTCTCCCGTTCAGGGA 
TCATATGGAATGACACATCAGCAAGCTCTAGCTCAAGTCACTGCTCAAGCAGTTCAAGCC 
AATGCCAATATGCAACCACAAACAGAGTACCCTCCTCCCTCTCAAGTTCAATCATTTTCA 
TCGGGTCAAGCGCAGATCCCGACCTCGGCTCCACTACCAGCTCAAAGAGAAACCTCAGAT 
GTAACCATCATAGAGCACAGGTCACAACAGCCTCTAAATGTTGACAAACCAGCTGATGAT 
GGCTATAACTGGCGAAAATATGGGCAAAAGCAAGTTAAAGGTAGCGAGTTTCCACGAAGC 
TATTACAAGTGTACTAATCCAGGATGTCCTGTCAAGAAGAAGGTTGAGAGATCTCTTGAT 
GGACAAGTAACGGAGATTATCTACAAAGGTCAGCACAATCATGAACCTCCTCAAAACACT 
AAGCGAGGTAACAAAGATAACACCGCGAATATAAATGGGAGTTCGATAAATAACAATCGC 
GGGAGTTCTGAATTGGGGGCATCACAGTTTCAAACTAATAGCTCCAACAAGACTAAGAGA 
GAGCAACATGAAGCAGTAAGTCAAGCTACGACAACAGAGCACTTGTCTGAGGCAAGTGAC 
GGTGAAGAAGTTGGTAATGGAGAAACTGATGTGAGAGAGAAAGATGAGAATGAGCCTGAT 
CCCAAGAGAAGAAGTACAGAAGTTCGGATTTCAGAACCAGCTCCTGCTGCTTCACATAGA 
ACTGTGACAGAGCCTAGAATTATTGTCCAAACGACGAGTGAAGTTGATCTTCTAGATGAT 
GGATATAGGTGGCGTAAATATGGACAGAAAGTTGTCAAAGGGAATCCTTATCCGAGGAGC 
TACTACAAGTGCACAACACCAGGATGTGGTGTGAGGAAACATGTAGAGAGAGCAGCAACA 
GATCCAA7^GCTGTAGTAACAACATATGAAGGAAAACATAACCATGACCTTCCCGCTGCT 
AAATCAAGCAGCCATGCCGCTGCAGCGGCACAGTTAAGGCCAGATAATCGACCTGGCGGT 
TTGGCTAACTTAAATCAACAGCAGCAGCAACAGCCCGTTGCGCGGCTAAGGCTTAAAGAA 
GAGCAAACAACTTGAGAGAAGAAAACTCTTGACCGTTTTTCATTACAAAAGCTTTCAAAT 
TCCACTCACACACTTGTCTGAAAAATCTAGCAGTTTGCAGGAAAGAAACAGCTTCAAGAG 
GTTGTAGTTCTTCTATGTTCTGGTGTAAAACTTAAAAGCTTTTTAGGGTTTTCAGATTTC 
TGTTTACTAATACTGTATGTGAATTCTTTTGTACATGAGGAAGAAAATTACAGGGGGATA 

TTTTGTGTTGTATCTTTTGTGTTATTGTTTCAGTAAAAGATAGGTCTTACATTTTGTGTA
AAAAAAAAAAAAAAAAAAA

>G884 Amino Acid Sequence (conserved domain in AA coordinates: 227-285, 407-465)

MSEKEEAPSTSKSTGAPSRPTLSLPPRPFSEMFFNGGVGFSPGPMTLVSNMFPDSDEFRS 

FSQLLAGAMSSPATAAAAAAAATASDYQRLGEGTNSSSGDVDPRFKQNRPTGLMISQSQS 

PSMFTVPPGLSPAMLLDSPSFLGLFSPVQGSYGMTHQQALAQVTAQAVQANANMQPQTEY

PPPSQVQSFSSGQAQIPTSAPLPAQRETSDVTIIEHRSQQPLNVDKPADDGYNWRKYGQK

QVKGSEFPRSYYKCTNPGCPVKKKVERSLDGQVTEIIYKGQHNHEPPQNTKRGNKDNTAN

INGSSINNNRGSSELGASQFQTNSSNKTKREQHEAVSQATTTEHLSEASDGEEVGNGETD 

VREKDENEPDPKRRSTEVRISEPAPAASHRTVTEPRIIVQTTSEVDLLDDGYRWRKYGQK

VVKGNPYPRSYYKCTTPGCGVRKHVERAATDPKAVVTTYEGKHNHDLPAAKSSSHAAAAA

QLRPDNRPGGLANLNQQQQQQPVARLRLKEEQTT*
>G898 (161..772)

GAAAAAAAGATTCAAAAACCCTAGATTTCACAAAATCGAra^ 

GGCGATTTTCCTCGAGTGAAATTCGGCTCAAGGTGATTATAGCGATCATCGAATCAAATT 

GATTGAAGAGGTAC^U^GGTTAGTTACTTTGAGCTGAAAGATGAACACGTC^GAGGTGAG 

AGTACCTCGAGGAAATCGACGGAGGAAAGCTGTGATTGATCTGAATGCGGTACCTGTTGA 

TCAAGAAGGGACCTCTGCTTCTGTTAGAACTCTTACGGTGCCTATTACACCGTCTCAGCC 

TGCTCCTACGATGATTGATGTCGATGCTATTGAGGATGATGTTATTGAATCATCCGCTAG 

TGCTTTTGCTGAAGCTAAAAGCAAATCAAGAAATGCACGTCGGAGACCTTTGATGGTTGA

TGTAGAGTCAGGAGGTACGACTAGATTCCCT^CCAACATAA 

TCCTTCTAGTGAATCTGTCM^^ 

GTCTTCGAGAGTGTCTAGATCAAAGGCTCCAGCTCCTCCACCAGAAGAGCCAAAGTTTAC
ATGTCCAATCTGCATGTGTCCCTTTACGGAGGAGATGTCAACCAAGTGCGGTCACATCTT
CTGCAAGGGATGTATAAAGATGGCAATATCTCGCC^^ 

AAAGGTTACTGCAAAAGAGCTGATTCGAGTTTTCCTTCCAACCACTAGATGAGTGGTCCG
GCAACATC^CCAGCCACCCTGTCTAATGGT^ 

ACATTGAAGGGACTTCGTTGACTTGGTATTTTTG7VATATTTTGCTTTGTTG 
TATTCAGTGATCAAGT^GCCAGAAGGCCCTATC^TTCGATGGATATCATTGGTAATAACT
CTTTGTTTTTAGTTGTTGTTCTATGTAATTTAGGTCTCTGCAAACCTCTCAGTCGATACT
CTTCTCTCTTGATAGATGATAAGATATATGGAAAAAAAAATTAATATTGAATCTTTACTA 
AAA 

>G898 Amino Acid Sequence (domain in AA coordinates: 148-185) 

MNTSEVRVPRGNRRRKAVIDLNAVPVDQEGTSASVRTLTVPITPSQPAPTMIDVDAIEDD 

VIESSASAFAEAKSKSRNARRRPLMVDVESGGTTRFPANISNKRRRIPSSESVIDCEHAS 

VNDEVNMSSRVSRSKAPAPPPEEPKFTCPICMCPFTEEMSTKCGHIFCKGCIKMAISRQG

KCPTCRKKVTAKELIRVFLPTTR* 

>G900 (1..648) 

ATGGGGAAGAAGAAGTGCGAGTTATGTTGTGGTGTAGCGAGAATGTATTGTGAGTCAGAT 
CAAGCGAGTTTATGTTGGGATTGTGACGGTAAAGTTCACGGAGCTAATTTTCTGGTGGCG 
AAACACATGCGTTGTCTTCTATGTAGCGCGTGTCAGTCACACACGCCTTGGAAAGCTTCT 
GGGCTGAATCTTGGCCCAACTGTTTCTATCTGTGAGTCTTGTTTAGCTCGTAAGAAGAAT 
AACAACAGCTCCCTCGCCGGGAGGGATCAGAATCTTAACCAAGAAGAAGAGATCATTGGT 
TGTAACGACGGAGCTGAGTCTTATGATGAGGAAAGCGATGAGGATGAAGAAGAAGAAGAA 
GTGGAGAATCAGGTTGTTCCGGCTGCGGTGGAGCAAGAACTTCCGGTGGTGAGTTCGTCG 
TCTTCGGTTAGTAGTGGTGAAGGAGATCAGGTGGTGAAAAGGACGAGACTTGATTTGGAT 
CTTAACCTCTCCGATGAGGAGAACCAATCTAGACCATTGAAAAGATTATCGAGAGACGAA 
GGTTTGTCAAGATCAACTGTTGTGATGAATAGCTCAATCGTGAAATTACACGGAGGGAGG 
AGAAAAGCAGAGGGATGTGATACATCATCGTCGTCTTCGTTTTATTGA 

>G900 Amino Acid Sequence (domain in AA coordinates: 6-28, 48-74) 
MGKKKCELCCGVARMYCESDQASLCWDCDGKVHGANFLVAKHMRCLLCSACQSHTPWKAS 
GLNLGPTVSICESCLARKKNNNSSLAGRDQNLNQEEEIIGCNDGAESYDEESDEDEEEEE
VENQVVPAAVEQELPVVSSSSSVSSGEGDQVVKRTRLDLDLNLSDEENQSRPLKRLSRDE
GLSRSTVVMNSSIVKLHGGRRKAEGCDTSSSSSFY*
>G913 (108..806)

CATTCAAAAACATCATATATATACACAAACACACTTTGATACAACAAAAAAAAACAGAAC 

ACAAACAAAAACACATTGTAACATTAGTTTAAGCATTAAGCTTCTTTATGTCGAATAATA 

ATAATTCTCCGACCACCGTGAATCAAGAAACGACGACGTCTCGTGAAGTCTCAATCACAT 

TGCCTACTGATCAATCTCCTCAAACCTCACCAGGATCATCTTCTTCTCCTTCACCGAGAC 

CTTCCGGTGGATCACCGGCGAGAAGAACGGCGACTGGATTATCCGGCAAGCACTCTATTT 

TCAGGGGGATTCGACTACGTAACGGAAAATGGGTATCGGAGATTAGAGAGCCACGTAAAA 

CGACAAGAATTTGGCTCGGGACTTATCCGGTACCGGAGATGGCTGCCGCCGCTTACGACG 

TGGCTGCGTTAGCTTTAAAAGGACCCGACGCCGTTTTGAATTTTCCTGGTTTAGCTTTGA 

CTTACGTGGCTCCGGTTTCAAACTCTGCTGCGGATATAAGAGCGGCTGCTAGTAGAGCAG 

CGGAGATGAAGCAACCGGATCAGGGTGGGGATGAGAAGGTATTGGAACCGGTTCAACCCG 

GCAAAGAGGAAGAATTAGAAGAAGTGTCGTGTAACTCGTGTTCGTTGGAGTTTATGGATG 

AGGAAGCGATGTTGAATATGCCGACTTTGTTGACGGAGATGGCTGAAGGGATGTTGATGA 

GTCCACCGAGAATGATGATACATCCGACGATGGAAGATGATTCGCCGGAGAATCATGAAG 

GAGATAATCTTTGGAGTTATAAATGAATCCATTGAAGCTCCT 

CGGTCGAATGAGATTTTCCCCCTTTTTTTTTTTCTTTTTGGGTCGCTGTT 

>G913 Amino Acid Sequence (domain in AA coordinates: 62-128) 

MSNNNNSPTTVNQETTTSREVSITLPTDQSPQTSPGSSSSPSPRPSGGSPARRTATGLSG

KHSIFRGIRLRNGKWVSEIREPRKTTRIWLGTYPVPEMAAAAYDVAALALKGPDAVLNFP 

GLALTYVAPVSNSAADIRAAASRAAEMKQPDQGGDEKVLEPVQPGKEEELEEVSCNSCSL

EFMDEEAMLNMPTLLTEMAEGMLMSPPRMMIHPTMEDDSPENHEGDNLWSYK*

>G937 (45..1046)

TGGAAAAAGTTTGASTTTTTAATTCGAATCGAGAAAAAATAAAAATGGGTTCTTTAGG 
ATGAGCTTAGTTTGGGATCGATCTTTGGGAGAGGAGTTTCGATGAATGTTGTGGCGGTTG 
AGAAAGTTGATGAACATGTTAAGAAGCTTGAAGAAGAGAAGAGAAAGCTCGAAAGTTGTC 
AACTTGAGCTTCCTCTGTCTTTGCAGATTT^ 

AGAGATGTTCAGAGATGGAGACTCAACCATTGTTGAAAGATTTCATTTCTGTO 

CTATTCAAGGAGAAAGAGGAATAGAATTGCTGAAAAGAGAGGAGCTAATGAGGGAGAAGA 

AGTTTCAGCAATGGAAAGCTAATGATGATCACACTAGTAAGATCAAGAGCAAGCTTGAGA 

TTAAGAGAAATGAGGAGAAATCTCCTATGTTGTTGATTCCAAAGGTGGAT^ACTGGTTTAG 

GCCTCGGTTTAAGTTCGAGTTCGATAAGAAGAAAAGGGATTGTTGCCTCATGTGGCTTTA 

CTTCTAACTCTATGCCACAACCACCAACACCAGCAGTACCACAACAACCAGCATTTCTTA
AGCAGCAAGCTTTACGGAAGCAAAGAAGGTGTTGGAATCCAGAGTTGCATCGCCGATTTG
TCGATGCATTGCAACAGCTAGGTGGACCGGGAGTGGCAACTCCTAAACAAATTAGAGAAC

ATATGCAAGAAGAAGGCTTAACCAATGATGAAGTCAAGAGTCATTTACAGAAATACAGGT 

TACACATCAGGAAGCCAAATTCGAATGCGGAGAAACAATCAGCAGTTGTTTTAGGGTTTA 

ACTTGTGGAATTCTTCAGCACAAGATGAAGAAGAGACATGTGAAGGAGGAGAATCATTGA 

AGAGAAGCAATGCGCAATCAGATTCTCCTCAAGGTCCTTTGCAGTTACCGTCTACAACAA

CAACAACTGGTGGAGATAGTAGCATGGAAGATGTTGAAGATGCTAAGTCTGAGAGCTTTC 

AACTGGAGAGATTGAGATCACCATAAATCTCAAGAAACCAAACTCTTGATCACGGTTTTG 

TTATTTTGGATTCATTACTATATCTATTAGTAGTGAATGAGAACAATAATTATAGAAAGG 

TTTATAGATATATATATAGAGAAAAAGAGAGAGTGAGGATGGTTCAAATTATTTGCAGA

>G937 Amino Acid Sequence (conserved domain in AA coordinates: 197-246)

MGSLGDELSLGSIFGRGVSMNVVAVEKVDEHVKKLEEEKRKLESCQLELPLSLQILNDAI

LYLKDKRCSEMETQPLLKDFISVNKPIQGERGIELLKREELMREKKFQQWKANDDHTSKI

KSKLEIKRNEEKSPMLLIPKVETGLGLGLSSSSIRRKGIVASCGFTSNSMPQPPTPAVPQ

QPAFLKQQALRKQRRCWNPELHRRFVDALQQLGGPGVATPKQIREHMQEEGLTNDEVKSH 

LQKYRLHIRKPNSNAEKQSAVVLGFNLWNSSAQDEEETCEGGESLKRSNAQSDSPQGPLQ

LPSTTTTTGGDSSMEDVEDAKSESFQLERLRSP*

>G960 (63..1538)

TACCGTCGACCCACGCGTCCGAGTGTATTCAAAGTCGGAAAGAAACCCTAAAGAAGAGGA 

TTATGGGTGCTGTATCGATGGAGTCGCTTCCTTTAGGTTTCAGATTCAGACCTACCGATG 

AAGAGCTCGTCAATCACTACCTCCGTCTCAAGATCAACGGACGTCACTCCGATGTCCGTG 

TCATCCCTGATATCGATGTCTGCAAATGGGAACCTTGGGATCTTCCTGCTCTCTCGGTGA 

TTAAGACGGATGATCCAGAGTGGTTCTTTTTCTGCCCTCGTGATCGGAAATACCCTAATG 

GTCATCGCTCTAACAGAGCAACTGACTCTGGCTATTGGAAAGCTACTGGTAAAGATCGTA 

GCATCAAGTCTAAGAAGACTTTAATCGGTATGAAGAAGACTCTTGTCTTCTATCGTGGAC 

GAGCTCCTAAAGGTGAGCGGACTAATTGGATTATGCACGAGTATCGTCCCACTCTTAAGG 

ATCTTGATGGCACTTCCCCTGGCCAAAGCCCTTACGTTCTTTGTCGCCTCTTCCACAAGC 

CTGATGATCGGGTTAATGGTGTCAAGTCCGATGAAGCAGCTTTTACGGCCAGCAACAAAT 

ACTCACCTGATGATACATCATCTGATCTTGTTCAAGAAACACCTTCCTCTGATGCTGCTG 

TTGAGAAACCATCAGATTATTCAGGTGGATGCGGTTATGCTCATAGTAATAGTACCGCAG 

ATGGGACAATGATTGAGGCACCTGAAGAGAATCTTTGGTTATCTTGTGACCTTGAAGATC

AAAAGGCACCACTACCGTGTATGGATTCTATATATGCTGGTGATTTCAGTTACGATGAGA 

TTGGATTCCAATTTCAAGATGGTACCAGCGAACCAGATGTATCACTAACAGAATTGTTGG 

AGGAGGTGTTCAATAACCCTGATGACTTCTCTTGCGAGGAATCGATCAGTCGAGAGAATC 

CAGCAGTCTCACCAAATGGGATATTTTCATCTGCTAAAATGCTGCAGTCTGCAGCACCAG 

AGGATGCTTTCTTCAACGACTTCATGGCTTTCACTGATACAGATGCTGAGATGGCGCAAT 

TGCAGTATGGTTCAGAAGGTGGAGCTTCTGGTTGGCCAAGTGACACTAATTCATACTATA 

GTGATTTGGTTCAGCAAGAGCAAATGATCAATCATAACACAGAGAACAACCTCACAGAAG 

GGAGAGGGATAAAGATCCGGGCTCGACAGCCTCAGAACCGGCAGAGTACAGGATTGATAA 

ACCAGGGTATTGCTCCAAGGAGAATCCGTCTGCAGCTGCAGTCTAACTCTGAAGTAAAAG 

AACGAGAGGAGGTGAATGAAGGACACACTGTTATTCCCGAGGCCAAAGAAGCTGCAGCTA 

AATACTCAGAGAAGAGTGGTTCTTTGGTTAAACCTCAAATAAAGCTCAGGGCGCGGGGAA 

CTATAGGCCAAGTAAAAGGAGAGAGATTTGCAGACGACGAGGTACAGGTGCAGAGCACAA

AGAGAGAGAGAGAGAGAATCAAATGTAGTTTAATGTAATTAGGGATGATGCAATGTTAGC 

ATGTTTGTGTGTTGTAACTTAAAAACTTATTTAGGAATCTGATAAAAGTTACTGTTGAAA 

AAAGAAAAAAAAAAAAAAAAAAAAAAAAAAAA 

>G960 Amino Acid Sequence (domain in AA coordinates: 13-156) 

MGAVSMESLPLGFRFRPTDEELVNHYLRLKINGRHSDVRVIPDIDVCKWEPWDLPALSVI

KTDDPEWFFFCPRDRKYPNGHRSNRATDSGYWKATGKDRSIKSKKTLIGMKKTLVFYRGR

APKGERTNWIMHEYRPTLKDLDGTSPGQSPYVLCRLFHKPDDRVNGVKSDEAAFTASNKY 

SPDDTSSDLVQETPSSDAAVEKPSDYSGGCGYAHSNSTADGTMIEAPEENLWLSCDLEDQ 

KAPLPCMDSIYAGDFSYDEIGFQFQDGTSEPDVSLTELLEEVFNNPDDFSCEESISRENP 

AVSPNGIFSSAKMLQSAAPEDAFFNDFMAFTDTDAEMAQLQYGSEGGASGWPSDTNSYYS

DLVQQEQMINHNTENNLTEGRGIKIRARQPQNRQSTGLINQGIAPRRIRLQLQSNSEVKE 

REEVNEGHTVIPEAKEAAAKYSEKSGSLVKPQIKLRARGTIGQVKGERFADDEVQVQSTK

RERERIKCSLM*

>G991 (6..533)
GAAAAATGGAAGAAGAAAAGAGATTGGAGCTAAGGCTAGCTCCTCCTTGTCACCAATTCA
CTTCCAACAACAACATCAATGGATCTAAACAAAAAAGCTCGACCAAAGAAACATCATTCC 
TTTCCAATAACAGGGTTGAGGTAGCTCCAGTGGTGGGATGGCCGCCGGTGAGATCATCCC
GGAGAAACCTAACGGCACAACTAAAGGAGGAGATGAAGAAGAAGGAGAGTGATGAAGAGA 
AGGAATTGTACGTTAAGATCAACATGGAAGGAGTTCCAATAGGAAGAAAAGTCAACCTTT 
CAGCTTATAACAACTACCAACAGCTTTCACATGCCGTTGACCAACTCTTCTCTAAGAAAG 
ATTCGTGGGATCTAAACAGACAATACACTTTGGTCTACGAAGACACTGAAGGAGATAAAG 
TTCTGGTCGGGGATGTTCCTTGGGAGATGTTTGTATCTACTGTAAAGAGGTTGCATGTTT 
TAAAGACCTCCCACGCCTTCTCACTCTCACCTAGAAAACATGGCAAGGAATAGAGAGAGG 
TTGGCCAAAATCATCAGTTCGATGGTTTGTTTTTAATGTAATTTTTGTGGAAACTAATGG 
GGTTTGGCTTTGATTTACTGGTTTTCTTTTTCACTTATGTACTAGGTTTTTGCTTGCTAT 
GTTATTTCTTGTTTTGGTTGTAAATATGCTGTTCGTTTAAGAAATCGGGGGTTAGTATGT 
TATCGTGTGTATAAAAATAGTGTAAGCACGTAAGTTGATTACAATVAT^TUUVAAAAAAAAAA 

AAAAAAAAA 

>G991 Amino Acid Sequence (domain in AA coordinates: 7-14,48-59,82-115,128-164) 
MEEEKRLELRLAPPCHQFTSNNNINGSKQKSSTKETSFLSNNRVEVAPVVGWPPVRSSRR
NLTAQLKEEMKKKESDEEKELYVKINMEGVPIGRKVNLSAYNNYQQLSHAVDQLFSKKDS
WDLNRQYTLVYEDTEGDKVLVGDVPWEMFVSTVKRLHVLKTSHAFSLSPRKHGKE*
>G748 (98..1444)

CCACGCGTCCGCACTCTCCCAAATCTCTCTTCTTTAACAACAAAAAAAAAATCACAGAGA 

CATAGAGAGAAGAAGACGGAACAGAGGCTCCAAAAAAATGATGATGGAGACTAGAGATCC 

AGCTATTAAGCTTTTCGGTATGAAAATCCCTTTTCCGTCGGTTTTTGAATCGGCAGTTAC 

GGTGGAGGATGACGAAGAAGATGACTGGAGCGGCGGAGATGACAAATCACCAGAGAAGGT 

AACTCCAGAGTTATCAGATAAGAACAACAACAACTGTAACGACAACAGTTTTAACAATTC 

GAAACCCGAAACCTTGGACAAAGAGGAAGCGACATCAACTGATCAGATAGAGAGTAGTGA

CACGCCTGAGGATAATCAGCAGACGACACCTGATGGTAAAACCCTAAAGAAACCGACTAA

GATTCTACCGTGTCCGAGATGCAAAAGCATGGAGACCAAGTTCTGTTATTACAACAACTA 

CAACATAAACCAGCCTCGTCATTTCTGCAAGGCTTGTCAGAGATATTGGACTGCTGGAGG 

GACTATGAGGAATGTTCCTGTGGGGGCAGGACGTCGTAAGAACAAAAGCTCATCTTCTCA 

TTACCGTCACATCACTATTTCCGAGGCTCTTGAGGCTGCGAGGCTTGACCCGGGCTTACA 

GGCAAACACAAGGGTCTTGAGTTTTGGTCTCGAAGCTCAGCAGCAGCACGTTGCTGCTCC 

CATGACACCTGTTATGAAGCTACAAGAAGATCAAAAGGTCTCAAACGGTGCTAGGAACAG

GTTTCACGGGTTAGCGGATCAACGGCTTGTAGCTCGGGTAGAGAATGGAGATGATTGCTC 

AAGCGGATCCTCTGTGACCACCTCTAACAATCACTCAGTGGATGAATCAAGAGCACAAAG 

CGGCAGTGTTGTTGAAGCACAAATGAACAACAACAACAACAATAACATGAATGGTTATGC 

TTGCATCCCAGGTGTTCCATGGCCTTACACGTGGAATCCAGCGATGCCTCCACCAGGTTT 

TTACCCGCCTCCAGGGTATCCAATGCCGTTTTACCCTTACTGGACCATCCCAATGCTACC 

ACCGCATCAATCCTCATCGCCTATAAGCCAAAAGTGTTCAAATACAAACTCTCCGACTCT 

CGGAAAGCATCCGAGAGATGAAGGATCATCGAAAAAGGACAATGAGACAGAGCGAAAACA 

GAAGGCCGGGTGCGTTCTGGTCCCGAAAACGTTGAGAATAGATGATCCTAACGAAGCAGC 

AAAGAGCTCGATATGGACAACATTGGGAATCAAGAACGAGGCGATGTGCAAAGCCGGTGG 

TATGTTCAAAGGGTTTGATCATAAGACAAAGATGTATAACAACGACAAAGCTGAGAACTC 

CCCTGTTCTTTCTGCTAACCCTGCTGCTCTATCAAGATCACACAATTTCCATGAACAGAT 

TTAGAGTTACATATGTATATGTATATATGTATGATTGATTGTATGTATAGATGATACTGG 

AGAATGATGAGTTTTTGAGAATCAAACTOTTTTCTTCTTTCTAGTGATTGCCTTTATO 

TTTACATGTTTTGGTTCTCTGTACACTATTTGATTTACCTTTTTTACTTTCTTTCTTCAT 

TTGTCAGGAAATGTTGGAAGATAACATTAATGGTAAAAAGTTGGTGTGGACCGTTGTTGC 

GTTGGCATTTCAAAAAAAAAAAAAAAA

>G748 Amino Acid Sequence (domain in AA coordinates: 112-140) 
MMMETRDPAIKLFGMKIPFPSVFESAVTVEDDEEDDWSGGDDKSPEKVTPELSDKNNNNC

NDNSFNNSKPETLDKEEATSTDQIESSDTPEDNQQTTPDGKTLKKPTKILPCPRCKSMET
KFCYYNNYNINQPRHFCKACQRYWTAGGTMRNVPVGAGRRKNKSSSSHYRHITISEALEA
ARLDPGLQANTRVLSFGLEAQQQHVAAPMTPVMKLQEDQKVSNGARNRFHGLADQRLVAR
VENGDDCSSGSSVTTSNNHSVDESRAQSGSVVEAQMNNNNNNNMNGYACIPGVPWPYTWN

PAMPPPGFYPPPGYPMPFYPYWTIPMLPPHQSSSPISQKCSNTNSPTLGKHPRDEGSSKK
DNETERKQKAGCVLVPKTLRIDDPNEAAKSSIWTTLGIKNEAMCKAGGMFKGFDHKTKMY
NNDKAENSPVLSANPAALSRSHNFHEQI*
>G247 (1..660)

ATGAGAATGACAAGAGATGGAAAAGAACATGAATACAAGAAAGGTTTATGGACAGTGGAA 

GAAGACAAGATCCTCATGGATTATGTCCGAACTCATGGCCAGGGCCACTGGAACCGCATC 

GCCAAGAAAACTGGGCTCAAGAGATGTGGGAAAAGCTGTAGGTTGAGATGGATGAACTAC 

TTAAGCCCTAATGTTAACAGAGGCAATTTTACTGACCAAGAAGAAGATCTCATCATCAGA

CTCCACAAGCTCCTCGGCAACAGATGGTCGTTGATAGCGAAAAGAGTTCCGGGAAGAACA 

GACAACCAAGTAAAGAATTACTGGAACACACATCTCAGCAAGAAACTTGGTCTCGGAGAT 

CATTCAACTGCCGTCAAAGCCGCATGCGGTGTAGAGTCTCCACCGTCTATGGCCCTTATA 

ACCACAACGTCCTCCTCTCATCAAGAGATCTCCGGTGGAAAAAATTCAACTCTAAGGTTC 

GACACTTTAGTTGACGAATCCAAACTCAAACCAAAATCCAAACTAGTCCACGCAACACCA 

ACTGACGTAGAAGTTGCAGCTACGGTTCCAAATCTGTTCGATACCTTTTGGGTTCTTGAA 

GACGACTTCGAGCTTAGTTCACTCACTATGATGGATTTTACTAATGGGTATTGCCTTTGA 

>G247 Amino Acid Sequence (domain in AA coordinates: 15-116) 

MRMTRDGKEHEYKKGLWTVEEDKILMDYVRTHGQGHWNRIAKKTGLKRCGKSCRLRWMNY

LSPNVNRGNFTDQEEDLIIRLHKLLGNRWSLIAKRVPGRTDNQVKNYWNTHLSKKLGLGD

HSTAVKAACGVESPPSMALITTTSSSHQEISGGKNSTLRFDTLVDESKLKPKSKLVHATP 

TDVEVAATVPNLFDTFWVLEDDFELSSLTMMDFTNGYCL*

>G585 (111..2039)

CTCTCAAACATTTCTCTGTTTGTTCCGGCGAAAACGGCAACTGTTTCATCAAATGACAAA

CACAAAAACCTTAACATCTAGTTTGTATCCTCTCTGATACTTCAAAAAAAATGGATGAAG 

AAACAATGGCTACCGGACAAAACAGAACAACTGTGCCAGAGAATCTGAAGAAACACCTCG 

CAGTTTCAGTTCGAAACATTCAATGGAGTTATGGTATCTTTTGGTCTGTCTCTGCTTCTC 

AGTCTGGAGTTTTAGAATGGGGAGATGGATACTATAATGGAGATATCAAAACGAGGAAGA 

CGATTCAAGCTTCGGAGATCAAAGCTGATCAGCTTGGTCTACGGAGGAGCGAGCAGCTTA 

GCGAGCTTTACGAGTCTCTCTCCGTCGCTGAATCTTCTTCTTCAGGCGTTGCTGCCGGAT 

CTCAAGTCACCAGACGAGCTTCCGCCGCCGCACTTTCACCGGAAGATCTCGCCGACACCG 

AGTGGTACTATTTGGTTTGTATGTCTTTCGTCTTCAACATTGGTGAAGGAATGCCTGGAC 

GGACGTTTGCAAACGGTGAACCGATATGGTTGTGCAACGCTCATACGGCGGATAGTAAAG 

TGTTTAGCCGTTCTCTTCTAGCAAAAAGTGCTGCGGTTAAGACAGTGGTTTGCTTCCCGT 

TCCTTGGAGGAGTCGTTGAGATTGGTACCACAGAACATATTACGGAAGACATGAATGTAA 

TACAATGCGTGAAGACATCATTCCTCGAAGCCCCTGATCCGTACGCTACAATATTACCAG 

CAAGATCCGATTATCACATCGACAACGTTCTTGATCCGCAACAGATTCTAGGCGACGAGA 

TTTACGCGCCTATGTTCAGTACGGAGCCTTTTCCAACAGCTTCTCCGAGCAGAACTACCA 

ACGGTTTCGATCAAGAACATGAACAAGTAGCAGATGATCATGATTCTTTCATGACCGAAA 

GAATCACTGGAGGAGCTTCTCAGGTGCAAAGCTGGCAGCTCATGGACGACGAGCTTAGTA 

ACTGCGTTCACCAGTCGCTAAATTCCAGCGATTGCGTCTCTCAAACGTTTGTTGAAGGGG 

CGGCTGGACGGGTTGCTTACGGTGCAAGAAAGAGTAGAGTTCAAAGACTAGGGCAAATTC 

AAGAGCAACAGAGAAATGTGAAGACATTGTCATTTGATCCAAGAAACGACGACGTTCATT 

ACCAAAGTGTGATCTCAACGATTTTTAAGACCAACCATCAGTTAATTCTCGGACCGCAGT

TTCGAAACTGCGATAAACAGTCAAGCTTCACTAGGTGGAAGAAATCATCGTCATCATCAT 

C^GGAACCGCCACGGTCACGGCACCATCACAAGGAATGTO 

TTCCGCGAGTGCACCAGAAAGAGAAGTTAATGTTGGACTCACCAGAAGCCAGAGATGAAA 

CTGGGAACCATGCGGTTTTAGAGAAGAAGCGCCGCGAGAAATTGAACGAACGGTTCATGA 

CCTTGAGAAAAATCATTCCGTCAATCAACAAGATCGATAAAGTATCGATTCTTGACGATA

CGATAGAGTATCTTCAAGAACTCGAGAGACGGGTTC^GAACTAGAATCTTGCAGAGAAT 

CAACCGATACAGAGACTCGTGGGACGATGACGATGAAGAGGAAGAAACCATGCGACGCAG 

GAGAAAGAACATCAGCTAATTGCGCAAATAATGAAACAGGAAATGGGAAGAAGGTGTCGG 

TTAAGAATGTTGGTGAAGCCGAGCCAGCAGATACCGGTTTTACTGGTTTAACCGATAATT 

TAAGGATCGGTTCGTTTGGTAATGAGGTGGTTATTGAGCTTAGATGTGCTTGGAGAGAAG 

GAGTATTGCTTGAGATAATGGATGTGATTAGTGATCTCCATTTGGATTCTCATTCGGTTC 

AATCCTCGACCGGAGACGGTTTGCTCTGCTTAACCGTCAATTGCAAGCACAAGGGGTC^ 

AAATAGCGACACCAGGAATGATCAAAGAAGCACTTCAAAGGGTTGCATGGATCTGTTGAA

GACTACTTAGTTAAAATTGAGAGCAAAGAAAAAACATTCCCGGTTTGGTT^ 

GGTTTTCTTCTAACCGGGTTTTAGGAATTAATC 

TTTTTTGTGTCTTTTTTTCCGTTGCTTAACGTAGGTGAAGAGGAACATACACTATC 

TTTTGTTTGAGGTAGATTATTTTAAGGGTATTAGTAATAGTAATAGCCAGTTTAGATGAT 
TTTGTGTTCTTTTGTTGTT
>G585 Amino Acid Sequence (domain in AA coordinates: 436-501)

MDEETMATGQNRTTVPENLKKHLAVSVRNIQWSYGIFWSVSASQSGVLEWGDGYYNGDIK 

TRKTIQASEIKADQLGLRRSEQLSELYESLSVAESSSSGVAAGSQVTRRASAAALSPEDL

ADTEWYYLVCMSFVFNIGEGMPGRTFANGEPIWLCNAHTADSKVFSRSLLAKSAAVKTVV

CFPFLGGVVEIGTTEHITEDMNVIQCVKTSFLEAPDPYATILPARSDYHIDNVLDPQQIL

GDEIYAPMFSTEPFPTASPSRTTNGFDQEHEQVADDHDSFMTERITGGASQVQSWQLMDD 

ELSNCVHQSLNSSDCVSQTFVEGAAGRVAYGARKSRVQRLGQIQEQQRNVKTLSFDPRND 

DVHYQSVISTIFKTNHQLILGPQFRNCDKQSSFTRWKKSSSSSSGTATVTAPSQGMLKKI

IFDVPRVHQKEKLMLDSPEARDETGNHAVLEKKRREKLNERFMTLRKIIPSINKIDKVSI

LDDTIEYLQELERRVQELESCRESTDTETRGTMTMKRKKPCDAGERTSANCANNETGNGK

KVSVKNVGEAEPADTGFTGLTDNLRIGSFGNEVVIELRCAWREGVLLEIMDVISDLHLDS

HSVQSSTGDGLLCLTVNCKHKGSKIATPGMIKEALQRVAWIC* 

>G634 (1..798) 

ATGGAGCAAGGAGGAGGTGGTGGTGGTAATGAAGTTGTGGAGGAAGCTTCACCTATTAGT
TCAAGACCTCCTGCTAACAACTTAGAAGAGCTTATGAGATTCTCAGCCGCCGCGGATGAC 
GGTGGATTAGGAGGTGGAGGTGGAGGAGGAGGAGGAGGAAGTGCTTCTTCTTCATCGGGA 
AATCGATGGCCGAGAGAAGAAACTTTAGCTCTTCTTCGGATCCGATCCGATATGGATTCT 
ACTTTTCGTGATGCTACTCTCAAAGCTCCTCTTTGGGAACATGTTTCCAGGAAGCTATTG 
GAGTTAGGTTACAAACGAAGTTCAAAGAAATGCAAAGAGAAATTCGAAAACGTTCAGAAA 
TATTACAAACGTACTAAAGAAACTCGCGGTGGTCGTCATGATGGTAAAGCTTACAAGTTC 
TTCTCTCAGCTTGAAGCTCTCAACACTACTCCTCCTCCTCCTCCTTCTCATCCTCACGCT 
CATCAACCAGAACAGAAACAACAACAACAACCACAACAAGAGATGGTCATGAGCTCGGAA 
CAATCATCATTACCATCATCATCAAGATGGCCAAAGGCAGAGATTCTAGCGCTTATAAAC
CTGAGAAGTGGAATGGAACCAAGGTACCAAGATAATGTACCTAAAGGACTTCTATGGGAA 
GAGATCTCAACTTCAATGAAGAGAATGGGATACAACAGAAACGCTAAGAGATGTAAAGAG 
AAATGGGAAAACATAAACAAATACTACAAGAAAGTTAAAGAAAGCAACAACAGCAACTAC 
AACAACAAGAATCAATGA

>G634 Amino Acid Sequence (domain in AA coordinates: 62-147, 189-245)

MEQGGGGGGNEVVEEASPISSRPPANNLEELMRFSAAADDGGLGGGGGGGGGGSASSSSG

NRWPREETLALLRIRSDMDSTFRDATLKAPLWEHVSRKLLELGYKRSSKKCKEKFENVQK 

YYKRTKETRGGRHDGKAYKFFSQLEALNTTPPPPPSHPHAHQPEQKQQQQPQQEMVMSSE 

QSSLPSSSRWPKAEILALINLRSGMEPRYQDNVPKGLLWEEISTSMKRMGYNRNAKRCKE 

KWENINKYYKKVKESNNSNYNNKNQ*

>G676 (1..612) 

atgagaaagaaagtaagtagtagtggtgacgaaggaaacaatgagtacaagaaaggtttg 
tggacagtagaagaagacaaaatcctcatggattatgtcaaagctcatggcaaaggtcac 
tggaatcgtattgccaaaaagactggtttaaagagatgtggaaagagttgtagattgagg 
tggatgaattatctcagccctaatgtgaaaagaggcaatttcaccgagcaagaagaggat 
cttatcattaggctccacaagttgcttggtaataggtggtctttaattgctaaaagagtg 
ccgggtcgaacggataatcaagtgaagaactattggaacacgcatcttagtaagaaactc 
ggaatcaaagatcagaaaaccaaacagagcaatggtgatattgtttatcaaatcaatctc 
ccgaatcctaccgaaacatcagaagaaacgaaaatctcgaatattgtcgataacaataat 
atcctcggagatgaaattcaagaagatcatcaaggaagtaactacttgagttcactttgg 
gttcatgaggatgagtttgagcttagcacactcaccaacatgatggactttatagatgga 
cactgtttttga 

>G676 Amino Acid Sequence (domain in AA coordinates: 17-119) 

MRKKVSSSGDEGNNEYKKGLWTVEEDKILMDYVKAHGKGHWNRIAKKTGLKRCGKSCRLR

WMNYLSPNVKRGNFTEQEEDLIIRLHKLLGNRWSLIAKRVPGRTDNQVKNYWNTHLSKKL

GIKDQKTKQSNGDIVYQINLPNPTETSEETKISNIVDNNNILGDEIQEDHQGSNYLSSLW 

VHEDEFELSTLTNMMDFIDGHCF* 

>G682 (1..228) 

ATGGATAACCATCGCAGGACTAAGCAACCCAAGACCAACTCCATCGTTACTTCTTCTTCT 

GAAGAAGTGAGTAGTCTTGAGTGGGAAGTTGTGAACATGAGTCAAGAAGAAGAAGATTTG 

GTCTCTCGAATGCATAAGCTTGTCGGTGACAGGTGGGAACTGATAGCTGGGAGGATCCCA 

GGAAGAACCGCTGGAGAAATTGAGAGGTTTTGGGTCATGAAAAATTGA

>G682 Amino Acid Sequence (domain in AA coordinates: 27-63)

MDNHRRTKQPKTNSIVTSSSEEVSSLEWEVVNMSQEEEDLVSRMHKLVGDRWELIAGRIP
GRTAGEIERFWVMKN*
>G635 (1..993) 

ATGGAGATCATGCGTCCAGGGGTCTCAGAAAACACTTTGAAAGGAAAAATAAGAATCACA 

ACGCGGTGCATGTGGCTTGACAAAGGAAGACTTTTAGATGCACTTCACAAAGCAGCTCAT 

GCTGCTCTATCAAGTTGTCCTGTGACATGTCCCTTGTCTCACATGGAAAGAACAGTCTCC 

GAAGTCCTGAGGAAGATTGTAAGGAAGTACAGTGGTAAAAGGCCTGAAGTCATCGCTATA 

GCCACTGAGAATCCAATGGCTGTCCGAGCTGATGAGGTCAGTGCGAGACTGTCTGGTGAT 

CCAAGTGTTGGTTCTGGAGTTGCAGCTTTAAGGAAAGTTGTTGAAGGAAATGACAAAAGA 

AGTCGGGCGAAGAAAGCACCTTCACAAGAAGCTTCCCCCAAAGAAGTAGATCGCACTTTG 

GAAGATGATATCATTGATAGTGCAAGACTACTGGCTGAAGAAGAAACTGCGGCATCAACA 

TACACGGAAGAAGTTGATACGCCCGTTGGGAGTTCTTCAGAAGAGTCAGACGATTTTTGG 

AAATCATTCATCAATCCATCATCGTCACCTTCACCGAGTGARACAGAAAATATGAATAAG 

GTAGCTGATACGGAGCCTAAAGCAGAGGGTAAGGAAAACAGCAGAGACGACGATGAATTA 

GCTGATGCTTCAGATTCTGAAACCAAGTCATCACCAAAACGTGTGAGGAAGAACAAATGG 

AAACCGGAGGAGATAAAGAAGGTAATCAGAATGCGAGGAGAGCTGCACAGTAGATTTCAA 

GTGGTGAAAGGTAGAATGGCATTGTGGGAAGAGATCTCTTCAAATCTATCAGCTGAAGGA 

ATCAATCGAAGCCCGGGACAATGCAAATCTCTCTGGGCATCACTTATTCAGAAATACGAG 

GAGAGCAAGGCTGATGAGAGAAGCAAGACGAGTTGGCCACATTTTGAGGATATGAACAAC 
ATTTTGTCAGAGCTAGGCACACCTGCGTCTTAA

>G635 Amino Acid Sequence (domain in AA coordinates: 239-323)
MEIMRPGVSENTLKGKIRITTRCMWLDKGRLLDALHKAAHAALSSCPVTCPLSHMERTVS
EVLRKIVRKYSGKRPEVIAIATENPMAVRADEVSARLSGDPSVGSGVAALRKVVEGNDKR
SRAKKAPSQEASPKEVDRTLEDDIIDSARLLAEEETAASTYTEEVDTPVGSSSEESDDFW

KSFINPSSSPSPSETENMNKVADTEPKAEGKENSRDDDELADASDSETKSSPKRVRKNKW 

KPEEIKKVIRMRGELHSRFQVVKGRMALWEEISSNLSAEGINRSPGQCKSLWASLIQKYE

ESKADERSKTSWPHFEDMNNILSELGTPAS*

>G1068 (150..1310)

GAGAGTTGTTAGCTAGCTCACACGCTTTCGCTTAAAACTCAAAAACCTGCACTTTCTCGT

CTATTTTCTCGGCATTCGTAAAACAGAAAAGTGGGTCTCCAAGAAAATTACCCTAAATTC 

AaVAAGATTCATACTTTTCTCCACCTCCAATGGATTCCAGAGAGATCCACCACCAACAAC 

AGCAACAACAACAACAACAACAGCAGCAGCAGCAACAACAGCAACATCTACAACAACAGC 

AACAACCACCGCCAGGGATGTTAATGAGTCACCACAATTCCTACAATCGAAACCCTAACG 

CCGCCGCCGCTGTTTTAATGGGTCACAACACCTCCACATCTCAAGCTATGCATCAAAGAT 

TACCTTTTGGTGGTTCTATGTCACCGCATCAGCCTCAACAACATCAGTATCATCATCCTC 

AGCCTCAGCAACAGATAGATCAGAAGACTCTTGAATCTCTTGGATTTCCTACTTCGCCTC 

TTCCTTCTGCTTCTAATTCTTACGGTGGTGGAAATGAAGGAGGTGGTGGTGGTGATAGCG 

CCGGAGCTAATGCTAACTCTTCCGATCCACCTGCTAAACGGAACAGAGGACGTCCTCCTG 

GCTCCGGTAAGAAGCAGCTCGATGCTTTAGGAGGAACAGGAGGAGTTGGGTTCACGCCTC 

ATGTCATTGAGGTTAAAACAGGAGAGGACATAGCTACGAAGATATTGGCGTTTACGAACC 

AAGGGCCACGCGCAATCTGTATTCTCTCAGCTACAGGAGCTGTAACTAATGTGATGCTTC 

GTCAAGCTAACAATAGCAATCCTACTGGAACTGTTAAGTATGAGGGCCGATTTGAAATCA 

TTTCTCTGTCAGGTTCTTTCTTGAATTCTGAGAGTAATGGTACTGTGACCAAAACTGGTA

ACTTGAGTGTGTCGCTGGCTGGACACGAAGGCCGGATTGTGGGTGGATGTGTTGATGGAA 

TGCTAGTAGCTGGATCACAAGTCCAGGTCATTGTGGGAAGCTTTGTACCAGATGGAAGGA 

AGCAGAAACAAAGTGCGGGGCGTGCTCAGAATACTCCGGAGCCAGCTTCAGCACCAGCCA 

ATATGTTGAGCTTTGGTGGTGTTGGTGGACCGGGAAGCCCTCGATCTCAAGGACAACAAC 

ACTCGAGCGAGTCATCAGAGGAAAACGAAAGTAATTCTCCGTTGCACCGTAGAAGCAACA 

ACAACAACAGCAACAATCATGGGATATTTGGAAACTCTACACCTCAACCGCTTCACCAAA

TTCCTATGCAGATGTACCAGAATCTCTGGCCTGGCAACAGTCCTCAATAAACAGATGGTT 

CATGGGTCAAGATTTGACCGGGTTTGCTTCTCTGTTCCTTTTGACACATCTCTCCATCAG 

ATTTATCTCTATAAAGTAGATTGAGCTCTCTTACTCTCTCATCTTCTTCTCCTTTACTAT 

TTCTCTTAAATTTAGCTTTGGTTTTAGATAAATAGAGAGAGAGAGACATGTTAAGTAGGT 

TTCAAATTCAATCTTGTTTAGTTTGTTTCTTAGTAGTTTCTT^ATTGTGATGATCATA 

AAGACTTGTTCTTTTTCTCCTATATTCAACGAATTATCCACTTTAA 

>G1068 Amino Acid Sequence (domain in AA coordinates: 143-150) 

MDSREIHHQQQQQQQQQQQQQQQQQHLQQQQQPPPGMLMSHHNSYNRNPNAAAAVLMGHN

TSTSQAMHQRLPFGGSMSPHQPQQHQYHHPQPQQQIDQKTLESLGFPTSPLPSASNSYGG
GNEGGGGGDSAGANANSSDPPAKRNRGRPPGSGKKQLDALGGTGGVGFTPHVIEVKTGED
IATKILAFTNQGPRAICILSATGAVTNVMLRQANNSNPTGTVKYEGRFEIISLSGSFLNS
ESNGTVTKTGNLSVSLAGHEGRIVGGCVDGMLVAGSQVQVIVGSFVPDGRKQKQSAGRAQ 
NTPEPASAPANMLSFGGVGGPGSPRSQGQQHSSESSEENESNSPLHRRSNNNNSNNHGIF 

GNSTPQPLHQIPMQMYQNLWPGNSPQ* 
>G1225 (1..984) 

ATGACTCTAGAAGCTTTATCATCAAACGGTCTTTTAAACTTTTTGCTCTCTGAAACTCTT 
TCACCAACTCCATTCAAGTCTCTCGTCGATCTCGAGCCATTGCCGGAAAATGATGTCATC 
ATATCGAAGAACACAATTTCGGAGATATCTAATCAAGAACCGCCACCACAGCGACAACCA 
CCAGCTACGAATCGAGGGAAGAAGCGGCGGAGGAGGAAGCCTAGGGTTTGCAAAAACGAG 
GAAGAAGCTGAGAATCAACGAATGACTCACATTGCCGTCGAAAGAAATCGAAGAAGACAA 
ATGAATCAACATCTCTCTGTCTTGCGATCTCTCATGCCTCAACCTTTTGCTCACAAGGGT 
GATCAAGCTTCAATAGTTGGTGGAGCCATAGATTTCATCAAAGAACTTGAACACAAATTA 
CTATCTCTTGAAGCTCAAAAACATCATAATGCTAAATTAAACCAGTCGGTTACTTCTTCA 
ACAAGTCAAGACTCAAATGGTGAACAAGAGAATCCTCATCAACCATCTTCACTATCTCTA 
TCGCAGTTCTTTCTTCATTCATACGATCCGAGCCAAGAGAATAGGAACGGCTCAACAAGC 
TCGGTGAAAACCCCTATGGAAGATCTTGAGGTGACTCTAATCGAAACTCATGCTAACATC 
AGAATCTTGTCGAGAAGAAGAGGTTTCCGGTGGAGCACGTTGGCCACCACCAAACCGCCG 
CAGCTTTCGAAGCTGGTGGCTTCTCTACAATCGCTGTCCCTCTCCATTCTTCACCTTAGT 
GTCACAACATTGGACAATTATGCTATTTACTCCATCAGCGCTAAGGTGGAAGAGAGTTGC 
CAGCTAAGTTCAGTAGATGACATTGCAGGAGCAGTTCACCACATGCTAAGTATCATTGAA 
GAGGAGCCTTTTTGTTGCTCATCAATGTCAGAATTACCATTTGACTTCTCTTTGAATCAC 
TCAAATGTCACTCATTCTCTCTGAGAAATCTCTTTTTTGTTGTTGTTATTCCTTCTTTTA 
ATTTTATCACATAGCACATCTTTAGTTTTTTTTTTT 

>G1225 Amino Acid Sequence (domain in AA coordinates: 78-147) 

MTLEALSSNGLLNFLLSETLSPTPFKSLVDLEPLPENDVIISKNTISEISNQEPPPQRQP 

PATNRGKKRRRRKPRVCKNEEEAENQRMTHIAVERNRRRQMNQHLSVLRSLMPQPFAHKG

DQASIVGGAIDFIKELEHKLLSLEAQKHHNAKLNQSVTSSTSQDSNGEQENPHQPSSLSL

SQFFLHSYDPSQENRNGSTSSVKTPMEDLEVTLIETHANIRILSRRRGFRWSTLATTKPP 

QLSKLVASLQSLSLSILHLSVTTLDNYAIYSISAKVEESCQLSSVDDIAGAVHHMLSIIE

EEPFCCSSMSELPFDFSLNHSNVTHSL*

>G1337 (97..1398)

AATGGATTTGTCATCATTCTTCTCACCGTCCTTAGTCTCTGAAAATAAATTCTGATTTTG 
ATTTCGAATTTTAGGGATTTTGAGAGAGAGTCAGTTATGAGTAGTTCGGAGAGAGTACCG 
TGCGATTTCTGCGGCGAGCGTACGGCGGTTTTGTTTTGTAGAGCCGATACGGCGAAGCTG 
TGTTTGCCTTGTGATCAGCAAGTTCACACGGCGAATCTGTTGTCGAGGAAGCACGTGCGA 
TCTCAGATCTGCGATAATTGCGGTAACGAGCCAGTCTCTGTTCGGTGTTTCACCGATAAT 
CTGATTTTGTGTCAGGAGTGTGATTGGGATGTTCACGGAAGTTGTTCAGTTTCCGATGCT 
CATGTTCGATCCGCCGTGGAAGGTTTTTCCGGTTGTCCATCGGCGTTGGAGCTTGCTGCT 
TTATGGGGACTTGATTTGGAGCAAGGGAGGAAAGATGAAGAGAATCAAGTTCCGATGATG 
GCGATGATGATGGATAATTTCGGGATGCAGTTGGATTCTTGGGTTTTGGGATCTAATGAA 

AGGTATAAGCAGGTATTGTGTAAGCAGCTTGAGGAGTTGCTTAAGAGTGGTGTTGTCGGT 

GGTGATGGCGATGATGGTGATCGTGACCGTGATTGTGACCGTGAGGGTGCTTGTGATGGA 

GATGGAGATGGAGAAGCAGGAGAGGGGCTTATGGTTCCGGAGATGTCAGAGAGATTGAAA 

TGGTCAAGAGATGTTGAGGAGATCAATGGTGGCGGAGGAGGAGGAGTTAACCAGCAGTGG 

AATGCTACTACTACTAATCCTAGTGGTGGCCAGAGTTCTCAGATATGGGATTTTAACTTG 

GGACAGTCACGGGGACCTGAGGATACGAGTCGAGTGGAAGCTGCATATGTAGGGAAAGGT

GCTGCTTCTTCATTCACAATCAACAATTTTGTTGACCATATGAATGAAACTTGTTC 

AATGTGAAAGGTGTCAAAGAGATTAAAAAGGATGACTACAAGCGATCAACTTCAGGCCAG 

GTACAACCAACAAAATCTGAGAGCAACAATCGTCCAATTACCTTTGGCTCTGAGAAAGGT 

TCGAACTCCTCCAGTGACTTGCATTTCACAGAGCATATTGCTGGAACTAGTTGTAAGACC

ACAAGACTAGTTGCAACTAAGGCTGATCTGGAGCGGCTGGCTCAGAACAGAGGAGATGCA 

ATGCAGCGTTACAAGGAAAAGAGGAAGACACGGAGATATGATAAGACCATAAGGTATGAA 

TCGAGGAAGGCAAGAGCTGACACTAGGTTGCGTGTCAGAGGCAGATTTGTGAAAGCTAGT 

GAAGCTCCTTACCCTTAACCTTAAGTTTTTTCACATAGGCTTCCTTTTAGCTACAAACTT 

AGTTACTTTTTTTACTCCACTGCCTCATAAATGTACAGACCGGTCTCGTTTCATCTGGCC
GCCCTTCTTGTTTTATTGCCTTATCTGGCCCTTTTATGTACCTTGGAATCTTATCTAGTT

TAAAAAAGATTGTAACCTTCTAGAAAACCATATTCTGTTGACAGTATATACATGTCTATC 
CAAGCAAAAA 

>G1337 Amino Acid Sequence (domain in AA coordinates: 9-75) 

MSSSERVPCDFCGERTAVLFCRADTAKLCLPCDQQVHTANLLSRKHVRSQICDNCGNEPV 

SVRCFTDNLILCQECDWDVHGSCSVSDAHVRSAVEGFSGCPSALELAALWGLDLEQGRKD 

EENQVPMMAMMMDNFGMQLDSWVLGSNELIVPSDTTFKKRGSCGSSCGRYKQVLCKQLEE 

LLKSGVVGGDGDDGDRDRDCDREGACDGDGDGEAGEGLMVPEMSERLKWSRDVEEINGGG

GGGVNQQWNATTTNPSGGQSSQIWDFNLGQSRGPEDTSRVEAAYVGKGAASSFTINNFVD

HMNETCSTNVKGVKEIKKDDYKRSTSGQVQPTKSESNNRPITFGSEKGSNSSSDLHFTEH 

IAGTSCKTTRLVATKADLERLAQNRGDAMQRYKEKRKTRRYDKTIRYESRKARADTRLRV 
RGRFVKASEAPYP*

>G1759 (110..700)

CGAGAAAAGGAAAAAAAAAAATAGAAAGAGAAAACGCTTAGTATCTCCGGCGACTTGAAC 
CCAAACCTGAGGATCAAATTAGGGCACAAAGCCCTCTCGGAGAGAAGCCATGGGAAGAAA 
AAAACTAGAAATCAAGCGAATTGAGAACAAAAGTAGCCGACAAGTCACCTTCTCCAAACG 
TCGCAACGGTCTCATCGAGAAAGCTCGTCAGCTTTCTGTTCTCTGTGACGCATCCGTCGC 
TCTTCTCGTCGTCTCCGCCTCCGGCAAGCTCTACAGCTTCTCCTCCGGCGATAACCTGGT 
CAAGATCCTTGATCGATATGGGAAACAGCATGCTGATGATCTTAAAGCCTTGGATCATCA 
GTCAAAAGCTCTGAACTATGGTTCACACTATGAGCTACTTGAACTTGTGGATAGCAAGCT 
TGTGGGATCAAATGTCAAAAATGTGAGTATCGATGCTCTTGTTCAACTGGAGGAACACCT 
TGAGACTGCCCTCTCCGTGACTAGAGCCAAGAAGACCGAACTCATGTTGAAGCTTGTTGA 
GAATCTTAAAGAAAAGGAGAAAATGCTGAAAGAAGAGAACCAGGTTTTGGCTAGCCAGAT 
GGAGAATAATCATCATGTGGGAGCAGAAGCTGAGATGGAGATGTCACCTGCTGGACAAAT 
CTCCGACAATCTTCCGGTGACTCTCCCACTACTTAATTAGCCACCTTAAATCGGCGGTTG 
AAATCAAAATCCAAAACATATATAATTATGAAGAAAAAAAAAATAAGATATGTAATTATT 
CCGCTGATAAGGGCGAGCGTTTGTATATCTTAATACTCTCTCTTTGGCCAAGAGACTTTG 
TGTGTGATACTTAAGTAGACGGAACTAAGTCAATACTATCTGTTTTAAGACAAAAGGTTG 
ATGAACTTTGTACCTTATTCGTGTGAGAAAAAAAAAAAAAAAA

>G1759 Amino Acid Sequence (conserved domain in AA coordinates: 2-57)

MGRKKLEIKRIENKSSRQVTFSKRRNGLIEKARQLSVLCDASVALLVVSASGKLYSFSSG 
DNLVKILDRYGKQHADDLKALDHQSKALNYGSHYELLELVDSKLVGSNVKNVSIDALVQL

EEHLETALSVTRAKKTELMLKLVENLKEKEKMLKEENQVLASQMENNHHVGAEAEMEMSP

AGQISDNLPVTLPLLN*

>G1804 (169..1497)

TATCTCTCTCTTTCTCAAAACCTTTCAGTCAAAATTCTCCGGCGGCTTTTAAACTATGTG 
AAGGAGGAGAACCTCCATAACAAGAAGCGGATTCTCTCAGTTTTCCGGCGGCGGAGGAAC 
ACAAAGCCACCGGTTTTTAGACACACAGATTTCATTT^ 

GAAACGAAGTTGACGTCAGAGCGAGAAGTAGAGTCGTCCATGGCGCAAGCGAGACATAAT 

GGAGGAGGTGGTGGTGAGAATCATCCGTTTACTTCTTTGGGAAGACAATCCTCTATCTAC 

TCATTGACCCTTGACGAGTTCCAACATGCTTTATGTGAGAACGGCAAGAACTTTGGGTCC

ATGAACATGGACGAGTTTCTTGTCTCTATTTGGAACGCAGAGGAGAATAATAACAATCAA 

CAACAAGCAGCAGCAGCTGCAGGTTCACATTCTGTTCCGGCTAATCACAATGGTT^ 

AACAACAATAACAATGGAGGCGAGGGTGGTGTTGGTGTCTTTAGTGGTGGTTCTAGAGGC 

AACGAAGATGCTAACAATAAGAGAGGGATAGCGAACGAGTCTAGTCTTCCTCGACAAGGC 

TCTTTGACACTTCCAGCTCCGCTTTGTAGGAAGACTGTTGATGAGGTTTGGTCTGAGATA 

CATAGAGGTGGTGGTAGCGGTAATGGAGGAGACAGCAATGGACGTAGTAGTAGTAGTAAT 

GGACAGAACAATGCTCAGAACGGCGGTGAGACTGCGGCTAGACAACCGACTTTTGGAGAG 

ATGACACTTGAGGATTTCTTGGTGAAGGCTGGTGTGGTTAGAGAACATCCCACTAATCCT 

AAACCTAATCCAAACCCGAACCAAAACCAAAACCCGTCTAGTGTAATACCCGCAGCTGCA 

C^GC^CAGCTTTATGGTGTGTTTCAAGGAACCGGTGATCCTTC^TTCCCGGG 

ATGGGTGTGGGTGACCCATCAGGTTATGCTAAAAGGACAGGAGGAGGAGGGTATCAGCAG 

GCGCCACCAGTTCAGGCAGGTGTTTGCTATGGAGGTGGCGTTGGGTTTGGAGCGGGTGGA 

CAGCAAATGGGAATGGTTGGACCGTTAAGCCCGGTGTCTTCAGATGGATTAGGACATGGA 

CAAGTGGATAACATAGGAGGTCAGTATGGAGTAGATATGGGAGGGCTAAGGGGAAGGAAA 

AGAGTAGTGGATGGTCCAGTGGAGAAAGTAGTGGAGAGAAGACAGAGGAGGATGATCAAG 

AACCGCGAGTCTGCTGCTAGATCTAGAGCAAGAAAACAAGCATATACAGTGGAATTGGAA
GCTGAACTTAACCAGTTGAAAGAAGAGAATGCGCAGCTAAAACATGCATTGGCGGAGTTG
GAGAGGAAGAGGAAGCAACAGTATTTTGAGAGTTTGAAGTCAAGGGCACAACCGAAATTG 
CCGAAATCGAACGGGAGATTGCGGACATTGATGAGGAACCCGAGTTGTCCACTCTAAACA 
AACAATAGGAAGATGGAGAAGAAGTCGGAGACAGAACGAGGGAAAAACTGATGATTTTCT 
ACGTTGTTGTTTTGTCTTTGAGG/^ATGAGGTTATAGAATCTTTATACTTTGATGTTTTCT 
GTGTTGGTAGGAGGAACACCATCTGATCTGCTTTACTAGTGTTCCCTGTGAACAAAGAAA 
GTGATTCTGTGTTTCAACATCATCAATCTTTGGAAA

>G1804 Amino Acid Sequence (domain in AA coordinates: 357-407) 

MVTRETKLTSEREVESSMAQARHNGGGGGENHPFTSLGRQSSIYSLTLDEFQHALCENGK 

NFGSMNMDEFLVSIWNAEENNNNQQQAAAAAG

GSRGNEDANNKRGIANESSLPRQGSLTLPAPLCRKTVDEVWSEIHRGGGSGNGGDSNGRS

SSSNGQNNAQNGGETAARQPTFGEMTLEDFLVKAGVVREHPTNPKPNPNPNQNQNPSSVI

PAAAQQQLYGVFQGTGDPSFPGQAMGVGDPSGYAKRTGGGGYQQAPPVQAGVCYGGGVGF 

GAGGQQMGMVGPLSPVSSDGLGHGQVDNIGGQYGVDMGGLRGRKRVVDGPVEKVVERRQR

RMIKNRESAARSRARKQAYTVELEAELNQLKEENAQLKHALAELERKRKQQYFESLKSRA 

QPKLPKSNGRLRTLMRNPSCPL*

>G207 (16..930)

aaaagatctgtttcaatggcggatcgtgttaaaggtccatggagtcaagaagaagatgag 
cagctacgaaggatggttgagaaatacggaccgaggaattggtctgcgattagcaaatcg 
attccaggtcgatctggtaaatcgtgtagattacgttggtgtaatcagttatctccggag 
gttgagcatcgtcctttctcgccggaggaagatgagactattgtaaccgcccgtgctcag 
tttggtaacaagtgggcgacgattgctcgtcttcttaacggtcgtacggataacgccgtt 
aaaaatcactggaactctacgcttaagaggaaatgcagcggaggtgtggcggttacgacg 
gtgacggagacggaggaagatcaggatcggccgaagaagaggagatctgttagctttgat 
cctgcttttgctccggtggatactggattgtacatgagtcctgagagtcctaacggaatc 
gatgttagtgattctagcacgattccgtcaccgtcgtctcctgttgctcagctgtttaaa 
ccaatgccgatttccggcggttttacggtggttccgcagccgttaccggttgaaatgtct 
tcgtcttcggaggatccacctacttcgttgagtttgtcactacctggagctgagaacacg 
agttcgagccataacaataacaacaacgcgttgatgtttccgagatttgagagtcagatg 
aagattaatgtagaggagagaggaggaggaggagaaggacgtagaggtgagtttatgacg 
gtggtgcaggagatgataaaagctgaagtgaggagttacatggcggaaatgcagaaaaca 
agtggtggattcgtcgtcggaggtttatacgaatccggcggcaatggtggttttagggat
tgtggagtaataacacctaaggttgagtagttttggtttagggttaaaacttgaatcgat 
tggggattttcaagagcattcatttttggggtttatggtaaaattaaaaacaaaaacaaa 
atgtacagaggaattaaaatttctatggaataatcttaaatctcaaatatttgttacttg 
ttttggtgattcataaccaaaatcaaa 

>G207 Amino Acid Sequence (domain in AA coordinates: 6-106) 

MADRVKGPWSQEEDEQLRRMVEKYGPRNWSAISKSIPGRSGKSCRLRWCNQLSPEVEHRP 

FSPEEDETIVTARAQFGNKWATIARLLNGRTDNAVKNHWNSTLKRKCSGGVAVTTVTETE

EDQDRPKKRRSVSFDPAFAPVDTGLYMSPESPNGIDVSDSSTIPSPSSPVAQLFKPMPIS 
GGFTVVPQPLPVEMSSSSEDPPTSLSLSLPGAENTSSSHNNNNNALMFPRFESQMKINVE

ERGGGGEGRRGEFMTVVQEMIKAEVRSYMAEMQKTSGGFVVGGLYESGGNGGFRDCGVIT
PKVE* 

>G218 (1..1182) 

ATGGAGGCAGAGATCGTGAGACGATCGGAGGTAACGGGATTAAGAAGGGAGGTGGAAGAA 
TCGTCAATTGGTAGAGGAGATTGCGATGGTGATGGCGGCGATGTGGGAGAAGATGCGGCA 
GGGTTCGTTGGGACGAGCGGGAGAGGAAGAAGAGATCGAGTTAAAGGGCCGTGGTCGAAG 
GAGGAGGATGATGTGTTGAGTGAGCTCGTTAAGAGGTTGGGAGCGAGGAATTGGAGTTTT 
ATCGCTCGGAGTATTCCTGGTCGTTCAGGCAAGTCTTGTCGTCTTCGTTGGTGTAATCAG 
CTCAATCCAAATCTTATACGCAATTCATTTACTGAGGTAGAGGATCAGGCTATCATCGCA 
GCACATGCCATCCACGGAAACAAATGGGCTGTTATCGCGAAGCTCCTCCCCGGAAGAACA 
GATAATGCTATCAAGAACCACTGGAACTCTGCTTTAAGACGTCGATTCATAGACTTTGAA
AAGGCCAAGAATATAGGAACTGGAAGCTTGGTCGTGGATGATTCTGGATTTGACAGAACG
ACAACAGTAGCCTCATCAGAAGAAACTTTATCTTCAGGCGGTGGTTGCCATGTAACTACT 
CCAATTGTATCTCCAGAAGGCAAAGAAGCTACCACCTCCATGGAAATGTCTGAAGAACAA 
TGCGTAGAGAAAACAAACGGAGAAGGTATTTCTAGGCAAGATGATAAGGATCCTCCAACG
CTTTTCCGCCCAGTGCCTCGGCTCAGTTCTTTTAATGCTTGCAATCACATGGAAGGATCA
CCCTCTCCACATATACAAGACCAAAATCAGCTCCAATCATCTAAACAAGACGCAGCAATG

CTAAGATTGCTTGAAGGAGCTTACAGCGAACGGTTTGTGCCTCAAACATGTGGAGGTGGT 

TGTTGCAGCAACAATCCCGATGGCAGTTTTCAGCAAGAATCATTGTTGGGTCCAGAGTTT 

GTGGATTACTTAGACTCACCAACGTTTCCGAGTTCCGAACTAGCTGCTATAGCAACGGAA 

ATAGGCAGCCTCGCTTGGCTGAGAAGCGGTTTAGAGAGTAGCAGCGTGAGGGTGATGGAA 

GACGCAGTTGGTCGGTTAAGGCCTCAAGGCTCCAGGGGTCATCGAGATCATTATCTTGTA 

TCTGAACAGGGGACGAACATAACCAATGTCCTGTCCACATAA 

>G218 Amino Acid Sequence (domain in AA coordinates: TBD) 

MEAEIVRRSEVTGLRREVEESSIGRGDCDGDGGDVGEDAAGFVGTSGRGRRDRVKGPWSK 

EEDDVLSELVKRLGARNWSFIARSIPGRSGKSCRLRWCNQLNPNLIRNSFTEVEDQAIIA 

AHAIHGNKWAVIAKXLPGRTDNAIKNHWNSALRRRFIDFEKAKNIGTGSLVVDDSGFDRT 

TTVASSEETLSSGGGCHVTTPIVSPEGKEATTSMEMSEEQCVEKTNGEGISRQDDKDPPT 

LFRPVPRLSSFNACNHMEGSPSPHIQDQNQLQSSKQDAAMLRLLEGAYSERFVPQTCGGG 

CCSNNPDGSFQQESLLGPEFVDYLDSPTFPSSELAAIATEIGSLAWLRSGLESSSVRVME 

DAVGRLRPQGSRGHRDHYLVSEQGTNITNVLST* 

>G241 (46..867)

GAAAAACATTTCAACTTCTTTTATCAGCAATCACAAATCAAAGAGATGGGAAGAGCTCCA 
TGCTGTGAGAAGATGGGGTTGAAGAGAGGACCATGGACACCTGAAGAAGATCAAATCTTG 
GTCTCTTTTATCCTCAACCATGGACATAGTAACTGGCGAGCCCTCCCTAAGCAAGCTGGT 
CTTTTGAGATGTGGAAAAAGCTGTAGACTTAGGTGGATGAACTATTTAAAGCCTGATATT 
AAACGTGGCAATTTCACCAAAGAAGAGGAAGATGCTATCATCAGCTTACACCAAATACTT

GGCAATAGATGGTCAGCGATTGCAGCAAAACTGCCTGGAAGAACCGATAACGAGATCAAG 

AACGTATGGCACACTCACTTGAAGAAGAGACTCGAAGATTATCAACCAGCTAAACCTAAG 

ACCAGCAACAAAAAGAAGGGTACTAAACCAAAATCTGAATCCGTAATAACGAGCTCGAAC 

AGTACTAGAAGCGAATCGGAGCTAGCAGATTCATCAAACCCTTCTGGAGAAAGCTTATTT 

TCGACATCGCCTTCGACAAGTGAGGTTTCTTCGATGACACTCATAAGCCACGACGGCTAT 

AGCAACGAGATTAATATGGATAACAAACCGGGAGATATCAGTACTATCGATCAAGAATGT 

GTTTCTTTCGAAACTTTTGGTGCGGATATCGATGAAAGCTTCTGGAAAGAGACACTGTAT 

AGCCAAGATGAACACAACTACGTATCGAATGACCTAGAAGTCGCTGGTTTAGTTGAGATA

CAACAAGAGTTTCAAAACTTGGGCTCCGCTAATAATGAGATGATTTTTGACAGTGAGATG 

GAACTTCTGGTTCGATGTATTGGCTAGAACCGGCGGGGAACAAGATCTCTTAGCCGGGCT 

CTAGTTAACATGTTTGAGGAGTAAAGTGAAATGGTGCAAATTAGTTAAGGCTAAGAAATT 

CAAAAGCTTTTGTTTACCGAGAAAAAAACACACTCTAACTCTTGATGTGATGTAGTTAGT 
GTATTAATTAGAGGCTGCGTTTTCAA 

>G241 Amino Acid Sequence (domain in AA coordinates: 14-114)
MGRAPCCEKMGLKRGPWTPEEDQILVSFILNHGHSNWRALPKQAGLLRCGKSCRLRWMNY
LKPDIKRGNFTKEEEDAIISLHQILGNRWSAIAAKLPGRTDNEIKNVWHTHLKKRLEDYQ 
PAKPKTSNKKKGTKPKSESVITSSNSTRSESELADSSNPSGESLFSTSPSTSEVSSMTLI 

SHDGYSNEINMDNKPGDISTIDQECVSFETFGADIDESFWKETLYSQDEHNYVSNDLEVA 

GLVEIQQEFQNLGSANNEMIFDSEMELLVRCIG*

>G254 (15..923)

CGATTTCGAGCTCTATGGTGTCCGTAAACCCTAGACCTAAGGGTTTTCCAGTTTTCGATT 

CCTCGAATATGAGTTTACCAAGCTCCGATGGATTTGGTTCGATTCCGGCCACGGGACGGA 

CCAGTACGGTGTCGTTTTCTGAGGATCCGACGACGAAGATTCGGAAGCCGTACACAATCA 

AGAAGTCGAGAGAGAATTGGACAGATCAAGAGCACGATAAATTTCTAGAAGCTCTTCACT 

TATTCGATAGGGATTGGAAGAAAATAGAAGCCTTTGTTGGATCAAAAACAGTAGTTCAGA 

TACGAAGCCACGCTCAGAAATACTTTCTCAAAGTTCAGAAGAGTGGTGCTAACGAACATC 

TTCC^CTTCCTCGACCTAAGAGGAAAGCGAGTCATCCTTATCCTATAAAGGCTCCTAAAA 

ATGTTGCTTATACCTCTCTCCCGTCTTCGAGTAC^TTACCGTTGCTTGAGCCTGGTTATT 

TGTATAGCTCTGATTCGAAGTCATTGATGGGAAACCAGGCTGTTTGTGCATCTACCTCTT 

CTTCGTGGAATCATGAATCGACAAATCTGCCAAAACCGGTGATTGAAGAGGAACCGGGAG 

TCTCGGCCACGGCTCCTCTCCCAAATAATCGCTGCAGACAGGAAGATACAGAGAGGGTAC 

GAGCAGTGACAAAGCCAAATAACGAAGAAAGTTGTGAAAAGCCACATAGAGTGATC 

ATTTTGCTGAAGTTTACAGCTTCATTGGAAGTGTCTTCGATCCCAACACATCAGGCCACC 

TCCAGAGATTAAAGCAGATGGATCCAATAAATATGGAAACGGTTCTTTTACTGATGCAAA

ACCTGTCTGTAAATCTGACAAGTCCCGAGTTTGCAGAGCAAAGGAGGTTGATATCATCAT 

ACAGCGCTAAAGCTTTGAAATAGAGATAGAATAAAACAATAATGTACCTTATGTGAGATC 



AAGAGACAATCATCCAAGGTCTGTATGCATTGCTTGGATTTAGGCCTCGTGTTCTCACTA

CAGGAGCAGAACCAATCGCAAAGACTCTTAGATGGCTACTGAGTTGTGGTTTTTATGTCT 

CTGTAAGTCGCGGTGGAGCACACGTGTTTGTCCTGTCTTGTGTATGTGTGTATAGATAAT 

ACAAGGTTTTGCAGAGTAAGGTCACAGTTAGCTGCT^AGTGAGTTTGGATCAATCTTAAGA 

TTAAAACCCTGAGAGTGAGTGTCCAAAGAGACTGTGTAATATTGGTTTGGCGGTCAGCAG 

AAGAGTTTTGAAGTGCACATCCAGTTAGTGATAACACGGTTGAAGAAAAGGTAAGGTTAC 

AAGTTTAGTTTTGAATAATTGTATACTCAT^AAAATATGAATGTATAAAGAATAATCACTT 

GAGTCGCCTTA 

>G254 Amino Acid Sequence (domain in AA coordinates: 62-106) 

MVSVNPRPKGFPVFDSSNMSLPSSDGFGSIPATGRTSTVSFSEDPTTKIRKPYTIKKSRE

NWTDQEHDKFLEALHLFDRDWKKIEAFVGSKTVVQIRSHAQKYFLKVQKSGANEHLPLPR

PKRKASHPYPIKAPKNVAYTSLPSSSTLPLLEPGYLYSSDSKSLMGNQAVCASTSSSWNH 

ESTNLPKPVIEEEPGVSATAPLPNNRCRQEDTERVRAVTKPNNEESCEKPHRVMPNFAEV 

YSFIGSVFDPNTSGHLQRLKQMDPINMETVLLLMQNLSVNLTSPEFAEQRRLISSYSAKA

LK* 

>G26 (73..729)

TTGGCTTGTACCCAAACCCATCTTTGACTTCAAAAATAAAATAAAAATAATCATAATTGA 
CATCATCGGATAATGCATAGCGGGAAGAGACCTCTATCACCAGAATCAATGGCCGGAAAT 
AGAGAAGAGAAAAAAGAGTTGTGTTGTTGCTCAACTTTGTCGGAATCTGATGTGTCTGAT 
TTTGTCTCTGAACTCACTGGTCAACCCATCCCATCATCCATTGATGATCAATCTTCGTCG 
CTTACTCTTCAAGAAAAAAGTAACTCGAGGCAACGAAACTACAGAGGCGTGAGGCAAAGA
CCGTGGGGAAAATGGGCGGCTGAGATTCGTGACCCGAACAAGGCAGCTCGTGTGTGGCTT 
GGGACGTTCGACACTGCAGAAGAAGCCGCCTTAGCGTATGATAAAGCTGCATTTGAGTTT 
AGAGGTCACAAGGCCAAGCTTAACTTCCCCGAGCATATTCGTGTCAACCCTACTCAACTC 
TATCCATCGCCCGCTACTTCCCATGATCGCATTATCGTGACACCACCTAGTCCACCTCCA 
CCAATTGCTCCTGACATACTTCTTGATCAATATGGCCACTTTCAATCTCGAAGTAGTGAT 
TCCAGTGCCAACTTGTCCATGAATATGCTGTCTTCTTCGTCTTCATCTTTGAATCATCAA 
GGGCTAAGACCAAATTTGGAGGATGGTGAAAACGTGAAGAACATTAGTATCCACAAACGA 
CGAAAATAACATGTTAATGGCATAAATATCTCTTCGTCCAAGTTATCAAACGCATTGACC 
TCCGGCTTTGATCATTTTAGGCGCTTAATCTCTTTACGACTTCATTTTGGTAGTCTTTAA 
AGAGTCTATGGAGTGGATTTAGCTAGGAATCAGGCCTTATGGATGAAAAATATATAAATT 
TTGAACATGACTATGCAAGAATGGGATGAAGACTACTTAGCTTGGAAAACGTCCTGATAG 
GTCATGACGACTATATCCACAGAAGATGACCGACGGAGACAACAACATGCCTCACCTGAT 
CGACCGATCAAATGAGATAATGTGTTGACCGGACCGGTCGGATCAGGTTGGGTCGAGTAT 

ATCA 

>G26 Amino Acid Sequence (domain in AA coordinates: 67-134) 

MHSGKRPLSPESMAGNREEKKELCCCSTLSESDVSDFVSELTGQPIPSSIDDQSSSLTLQ

EKSNSRQRNYRGVRQRPWGKWAAEIRDPNKAARVWLGTFDTAEEAALAYDKAAFEFRGHK

AKLNFPEHIRVNPTQLYPSPATSHDRIIVTPPSPPPPIAPDILLDQYGHFQSRSSDSSAN 

LSMNMLSSSSSSLNHQGLRPNLEDGENVKNISIHKRRK*

>G263 (48..902)

TTTTTAGTTTTATTTTTCTGTGGTAAAATAAAAAAAGTTCGCCGGAGATGACGGCTGTGA 

CGGCGGCGCAAAGATCAGTTCCGGCGCCGTTTTTAAGCAAAACGTATCAGCTAGTTGATG 

ATCATAGCACAGACGACGTCGTTTCATGGAACGAAGAAGGAACAGCTTTTGTCGTGTGGA 

AAACAGCAGAGTTTGCTAAAGATCTT.CTTCCTCAATACrTCAAGCATAATAATTT 

GCTTCATTCGTCAGCTCAACACTTACGGATTTCGTAAAACTGTACCGGATAAATGGGAAT 

TTGCAAACGATTATTTCCGGAGAGGCGGGGAGGATCTGTTGACGGACATACGACGGCGTA 

AATCGGTGATTGCTTCAACGGCGGGGAAATGTGTTGTTGTTGGTTCGCCTTCTGAGTCTA 

ATTCTGGTGGTGGTGATGATCACGGTTCAAGCTCCACGTCATCACCCGGTTCGTCGAAGA 

ATCCTGGTTCGGTGGAGAACATGGTTGCTGATTTATCAGGAGAGAACGAGAAGCTTAAAC 

GTGAAAACAATAACTTGAGCTCGGAGCTCGCGGCGGCGAAGAAGCAGCGCGATGAGCTAG 

TGACGTTCTTGACGGGTCATCTGAAAGTAAGACCGGAACAAATCGATAAAATGATCAAAG 

GAGGGAAATTTAAACCGGTGGAGTCTGACGAAGAGAGTGAGTGCGAAGGTTGCGACGGCG 

GCGGAGGAGCAGAGGAGGGGGTAGGTGAAGGATTGAAATTGTTTGGGGTGTGGTTGAAAG 

GAGAGAGAAAAAAGAGGGACCGGGATGAAAAGAATTATGTGGTGAGTGGGTCCCGTATGA 

CGGAAATAAAGAACGTGGACTTTCACGCGCCGTTGTGGAAAAGCAGCAAAGTCTGCAACT 

AAAAAAAGAGTAGAAGACTGTTCAAACCAGCGTGTGACACGTCATCGACGACGACGAAAA 
AAATGATTTAAAAAACTATTTTTTTCCGTAAGGAAGAAAAGTTATTTTTATGTTTTAAAA

AGGTGAAGAAGGTCCAGAAGGATCAACGCAAATATATAAATGGATTTTCATGTATTATAT
AATTTAATTAGTGTATTAAGAAAA 

>G263 Amino Acid Sequence (domain in AA coordinates: TBD) 

MTAVTAAQRSVPAPFLSKTYQLVDDHSTDDVVSWNEEGTAFVVWKTAEFAKDLLPQYFKH

NNFSSFIRQLNTYGFRKTVPDKWEFANDYFRRGGEDLLTDIRRRKSVIASTAGKCVVVGS

PSESNSGGGDDHGSSSTSSPGSSKNPGSVENMVADLSGENEKLKRENNNLSSELAAAKKQ

RDELVTFLTGHLKVRPEQIDKMIKGGKFKPVESDEESECEGCDGGGGAEEGVGEGLKLFG 

VWLKGERKKRDRDEKNYVVSGSRMTEIKNVDFHAPLWKSSKVCN*

>G308 (196..1794)

AGTAATTTAGTTTTTTTTTTTTTTT^ 

AGTGAAAAAACAAATCCTAAGCAGTCCTAACCGATCCCCGAAGCTAAAGATTCTTCACCT 

TCCCAAATAAAGCAAAACCTAGATCCGACATTGAAGGAAAAACCTTTTAGATCCATCTCT 

GAAAAAAACCCAACCATGAAGAGAGATCATCATCATCATCATCAAGATAAGAAGACTATG 

ATGATGAATGAAGAAGACGACGGTAACGGCATGGATGAGCTTCTAGCTGTTCTTGGTTAC 

AAGGTTAGGTCATCGGAAATGGCTGATGTTGCTCAGAAACTCGAGCAGCTTGAAGTTATG 

ATGTCTAATGTTCAAGAAGACGATCTTTCTCAACTCGCTACTGAGACTGTTCACTATAAT 

CCGGCGGAGCTTTACACGTGGCTTGATTCTATGCTCACCGACCTTAATCCTCCGTCGTCT 

AACGCCGAGTACGATCTTAAAGCTATTCCCGGTGACGCGATTCTCAATCAGTTCGCTATC 

GATTCGGCTTCTTCGTCTAACCAAGGCGGCGGAGGAGATACGTATACTACAAACAAGCGG 

TTGAAATGCTCAAACGGCGTCGTGGAAACCACCACAGCGACGGCTGAGTCAACTCGGCAT 

GTTGTCCTGGTTGACTCGCAGGAGAACGGTGTGCGTCTCGTTCACGCGCTTTTGGCTTGC 

GCTGAAGCTGTTCAGAAGGAGAATCTGACTGTGGCGGAAGCTCTGGTGAAGCAAATCGGA 

TTCTTAGCTGTTTCTCAAATCGGAGCTATGAGACAAGTCGCTACTTACTTCGCCGAAGCT 

CTCGCGCGGCGGATTTACCGTCTCTCTCCGTCGCAGAGTCCAATCGACCACTCTCTCTCC 

GATACTCTTCAGATGCACTTCTACGAGACTTGTCCTTATCTCAAGTTCGCTCACTTCACG 

GCGAATCAAGCGATTCTCGAAGCTTTTCAAGGGAAGAAAAGAGTTCATGTCATTGATTTC 

TCTATGAGTCAAGGTCTTCAATGGCCGGCGCTTATGCAGGCTCTTGCGCTTCGACCTGGT 

GGTCCTCCTGTTTTCCGGTTAACCGGAATTGGTCCACCGGCACCGGATAATTTCGATTAT 

CTTCATGAAGTTGGGTGTAAGCTGGCTCATTTAGCTGAGGCGATTCACGTTGAGTTTGAG 

TACAGAGGATTTGTGGCTAACACTTTAGCTGATCTTGATGCTTCGATGCTTGAGCTTAGA 

CC^GTGAGATTGAATCTGTTGCGGTTAACTCTGTTTTCGAGCTTCACAAGCTCTTGGGA 

CGACCTGGTGCGATCGATAAGGTTCTTGGTGTGGTGAATCAGATTAAACCGGAGATTTTC 

ACTGTGGTTGAGCAGGAATCGAACCATAATAGTCCGATTTTCTTAGATCGGTTTACTGAG 

TCGTTGCATTATTACTCGACGTTGTTTGACTCGTTGGAAGGTGTACCGAGTGGTCAAGAC 

AAGGTCATGTCGGAGGTTTACTTGGGTAAACAGATCTGCAACGTTGTGGCTTGTGATGGA 

CCTGACCGAGTTGAGCGTCATGAAACGTTGAGTCAGTGGAGGAACCGGTTCGGGTCTGCT 

GGGTTTGCGGCTGCACATATTGGTTCGAATGCGTTTAAGCAAGCGAGTATGCTTTTGGCT 

CTCTTCAACGGCGGTGAGGGTTATCGGGTGGAGGAGAGTGACGGCTGTCTCATGTTGGGT 

TGGCACACACGACCGCTCATAGCCACCTCGGCTTGGAAACTCTCCACCAATTAGATGGTG 

GCTCAATGAATTGATCTGTTGAACCGGTTATGATGATAGATTTCCGACCGAAGCCAAACT 

AAATCCTACTGTTTTTCCCTTTGTCACTTGTTAAGAT^ 

ATTGAAAAATTTTAATCTCGCCTAAATTACT 

>G308 Amino Acid Sequence (domain in AA coordinates: 270-274) 
MKRDHHHHHQDKKTMMMNEEDDGNGMDELLAVLGYKVRSSEMADVAQKLEQLEVMMSNVQ

EDDLSQLATETVHYNPAELYTWLDSMLTDLNPPSSNAEYDLKAIPGDAILNQFAIDSASS

SNQGGGGDTYTTNKRLKCSNGVVETTTATAESTRHVVLVDSQENGVRLVHALLACAEAVQ

KENLTVAEALVKQIGFLAVSQIGAMRQVATYFAEALARRIYRLSPSQSPIDHSLSDTLQM

HFYETCPYLKFAHFTANQAILEAFQGKKRVHVIDFSMSQGLQWPALMQALALRPGGPPVF

RLTGIGPPAPDNFDYLHEVGCKLAHLAEAIHVEFEYRGFVANTLADLDASMLELRPSEIE

SVAVNSVFELHKLLGRPGAIDKVLGVVNQIKPEIFTVVEQESNHNSPIFLDRFTESLHYY

STLFDSLEGVPSGQDKVMSEVYLGKQICNVVACDGPDRVERHETLSQWRNRFGSAGFAAA

HIGSNAFKQASMLLALFNGGEGYRVEESDGCLMLGWHTRPLIATSAWKLSTN*

>G38 (149..1156)

GAGGAAAACTCGAAAAAGCTACACACAAGAAGAAGAAGAAAAGATACGAGCAAGAAGACT 
AAACACGAAAGCGATTTATCAACTCGAAGGAAGAGACTTTGATTTTC^ 

TATAGATTGTGTTGTTTCTGGGAAGGAGATGGCAGTTTATGATCAGAGTGGAGATAGAAA 
CAGAACACAAATTGATACATCGAGGAAAAGGAAATCTAGAAGTAGAGGTGACGGTACTAC

TGTGGCTGAGAGATTAAAGAGATGGAAAGAGTATAACGAGACCGTAGAAGAAGTTTCTAC 

CAAGAAGAGGAAAGTACCTGCGAAAGGGTCGAAGAAGGGTTGTATGAAAGGTAAAGGAGG 

ACCAGAGAATAGCCGATGTAGTTTCAGAGGAGTTAGGCAAAGGATTTGGGGTAAATGGGT 

TGCTGAGATCAGAGAGCCTAATCGAGGTAGCAGGCTTTGGCTTGGTACTTTCCCTACTGC 

TCAAGAAGCTGCTTCTGCTTATGATGAGGCTGCTAAAGCTATGTATGGTCCTTTGGCTCG 

TCTTAATTTCCCTCGGTCTGATGCGTCTGAGGTTACGAGTACCTCAAGTCAGTCTGAGGT 

GTGTACTGTTGAGACTCCTGGTTGTGTTCATGTGAAAACAGAGGATCCAGATTGTGAATC 

TAAACCCTTCTCCGGTGGAGTGGAGCCGATGTATTGTCTGGAGAATGGTGCGGAAGAGAT 

GAAGAGAGGTGTTAAAGCGGATAAGCATTGGCTGAGCGAGTTTGAACATAACTATTGGAG 

TGATATTCTGAAAGAGAAAGAGAAACAGAAGGAGCAAGGGATTGTAGAAACCTGTCAGCA 

ACAACAGCAGGATTCGCTATCTGTTGCAGACTATGGTTGGCCCAATGATGTGGATCAGAG 

TCACTTGGATTCTTCAGACATGTTTGATGTCGATGAGCTTCTACGTGACCTAAATGGCGA 

CGATGTGTTTGCAGGCTTAAATCAGGACCGGTACCCGGGGAACAGTGTTGCCAACGGTTC 

ATACAGGCCCGAGAGTCAACAAAGTGGTTTTGATCCGCTACAAAGCCTCAACTACGGAAT 

ACCTCCGTTTCAGCTCGAGGGAAAGGATGGTAATGGATTCTTCGACGACTTGAGTTACTT 

GGATCTGGAGAACTAAACAAAACAATATGAAGCTTTTTGGATTTGATATTTGCCTTAATC 

CCACAACGACTGTTGATTCTCTATCCGAGTTTTAGTGATATAGAGAACTACAGAACACGT 

TTTTTCTTGTTATAAAGGTGAACTGTATATATCGAAACAGTGATATGACAATAGAGAAGA 

CAACTATAGTTTGTTAGTCTGCTTCTCTTAAGTTGTTCTTTAGATATGTTTTATGTTTTG 

TAACAACAGGAATGAATAATACACACTTGTGAAGCTTTTAAAAAAAAAAAAAAAAAAAAA 

>G38 Amino Acid Sequence (domain in AA coordinates: 76-143) 

MAVYDQSGDRNRTQIDTSRKRKSRSRGDGTTVAERLKRWKEYNETVEEVSTKKRKVPAKG 

SKKGCMKGKGGPENSRCSFRGVRQRIWGKWVAEIREPNRGSRLWLGTFPTAQEAASAYDE

AAKAMYGPLARLNFPRSDASEVTSTSSQSEVCTVETPGCVHVKTEDPDCESKPFSGGVEP

MYCLENGAEEMKRGVKADKHWLSEFEHNYWSDILKEKEKQKEQGIVETCQQQQQDSLSVA

DYGWPNDVDQSHLDSSDMFDVDELLRDLNGDDVFAGLNQDRYPGNSVANGSYRPESQQSG 

FDPLQSLNYGIPPFQLEGKDGNGFFDDLSYLDLEN* 

>G43 (38..643)

CTCCTGTCTTGTCTAAAGAAAAAAGAGAGAGGAAGAAATGGAGACTTTTGAGGAAAGCTC 

TGATTTGGATGTTATACAGAAACATCTATTTGAAGACTTGATGATCCCTGATGGTTTCAT 

TGAAGATTTTGTCTTTGATGATACTGCTTTTGTCTCCGGACTCTGGTCTCTAGAACCCTT 

TAACCCAGTTCCGAAACTGGAACCTAGTTCACCTGTTCTTGATCCAGATTCCTATGTCCA 

AGAGATTCTGCAAATGGAAGCAGAATCATCATCATCATCATCAACAACAACGTCACCTGA 

GGTTGAGACTGTCTCAAACCGGAAAAAAACAAAGAGGTTTGAAGAAACGAGACATTACAG

AGGCGTGAGAAGGAGGCCATGGGGGAAATTTGCAGCAGAGATTCGAGATCCGGCAAAGAA 

AGGATCCAGGATTTGGTTAGGCACTTTTGAGAGTGATATTGATGCTGCAAGGGCTTACGA 

CTATGCTGCTTTTAAGCTCAGGGGAAGAAAAGCTGTTCTCAACTTTCCTTTGGATGCCGG

AAAGTATGATGCTCCGGTCAATTCATGCCGAAAAAGGAGGAGAACCGATGTACCACAGCC 

TCAAGGAACAACAACAAGTACTTCATCATCGTCATCAAACTAATGGGGGAATAGTGATGT 

TTAATTAGTATATATAGGTTAATATCTTAAGTATGTGAAGCATCATGTATAGAGCCAAGA 

ACCTGTTAGACTAGTGTACTGAAAAGAACTCTTGCAAAATATGTACTAAAGAGTTCCTGT 

AACAATGGAACTTCTGCGTTTTCTCTTGTCTTAAAGAGCTTAAGGTTCTAGAAACAAAGT 

TCTTGTCCTTTCGGTTTAAAAAAAAAAAAAAAAAAAAAAAAA 

AAAAAAAAA

>G43 Amino Acid Sequence (domain in AA coordinates: 104-172) 

METFEESSDLDVIQKHLFEDLMIPDGFIEDFVFDDTAFVSGLWSLEPFNPVPKLEPSSPV 

LDPDSYVQEILQMEAESSSSSSTTTSPEVETVSNRKKTKRFEETRHYRGVRRRPWGKFAA

EIRDPAKKGSRIWLGTFESDIDAARAYDYAAFKLRGRKAVLNFPLDAGKYDAPVNSCRKR 

RRTDVPQPQGTTTSTSSSSSN* 

>G536 (1..768) 

ATGTCGACAAGGGAAGAGAATGTTTACATGGCGAAATTAGCCGAACAAGCTGAACGTTAC 
GAAGAAATGGTTGAATTCATGGAGAAAGTTGCGAAAACTGTTGATGTTGAGGAACTTTCA
GTTGAAGAGAGGAATCTTCTCTCTGTTGCTTACAAGAACGTGATTGGAGCGAGAAGAGCT 
TCGTGGAGAATCATTTCTTCGATTGAGCAGAAAGAAGAGAGCAAAGGGAACGAAGATCAT 
GTTGCTATTATCAAGGATTACAGAGGAGAGATTGAATCCGAGCTTAGCAAAATCTGTGAT 
GGGATTTTGAATGTTCTTGAAGCTCATCTTATTCCTTCTGCTTCACCAGCTGAATCTAAA 
GTGTTTTATCTTAAGATGAAGGGTGATTATCATAGGTATCTTGCTGAGTTTAAGGCTGGT

GCTGAAAGGAAAGAAGCTGCTGAAAGCACTTTGGTTGCTTACAAGTCTGCTTCCGACATT 

GCCACTGCTGAGTTAGCTCCTACTCACCCGATAAGGCTTGGTCTTGCACTCAACTTCTCT 

GTGTTTTACTATGAAATCCTCAACTCGCCTGATCGTGCTTGCAGCCTCGCAAAGCAGGCG 

TTTGATGATGCAATCGCTGAGTTAGATACATTGGGTGAGGAATCATACAAGGACAGTACA 

CTGATTATGCAGCTTCTTAGAGACAATCTCACTCTCTGGACTTCAGATATGACTGACGAA 

GCAGGAGATGAGATTAAGGAGGCATCAAAGCCCGATGGTGCCGAGTAA 

>G536 Amino Acid Sequence (domain in AA coordinates: 226-233)

MSTREENVYMAKLAEQAERYEEMVEFMEKVAKTVDVEELSVEERNLLSVAYKNVIGARRA 

SWRIISSIEQKEESKGNEDHVAIIKDYRGEIESELSKICDGILNVLEAHLIPSASPAESK 

VFYLKMKGDYHRYLAEFKAGAERKEAAESTLVAYKSASDIATAELAPTHPIRLGLALNFS 

VFYYEILNSPDRACSLAKQAFDDAIAELDTLGEESYKDSTLIMQLLRDNLTLWTSDMTDE
AGDEIKEASKPDGAE*

>G567 (38..1273)

AAAAAGAAGAATCAGAAAGTGAAAAAGAGAGCGAGCGATGAACAGTATCTTCTCCATTGA 

CGATTTCTCCGATCCTTTCTGGGAAACTCCTCCGATTCCTCTCAATCCCGACTCTTCTAA 

GCCTGTTACGGCGGATGAAGTTAGCCAGAGTCAACCGGAATGGACTTTCGAGATGTTTCT 

CGAAGAGATTTCTTCGTCGGCGGTGAGCTCTGAGCCACTTGGTAACAACAACAACGCGAT 

CGTCGGTGTTTCTTCGGCGCAATCTCTTCCTTCTGTTTCCGGACAGAATGATTTCGAGGA 

TGATAGTCGATTTCGTGATCGCGATTCGGGAAATTTGGATTGTGCTGCTCCCATGACGAC 

GAAGACGGTGAATGTTGATTCCGATGATTATCGTCGTGTTCTTAAGAACAAGCTTGAGGC 

TGAGTGCGCGACTGGTGTTTCTCTTCGGGTTGGGTCTGTGAAGCCTGAAGATTCGACTAG 

TTCTCCAGAAACTCAACTTCAACCAGTTCAATCCAGTCCTCTTACTCAAGGAGAACTTGG 

TGTTACTTCTTCCTTACCAGCTGAGGTGAAAAAAACTGGTGTATCAATGAAGCAGGTTAC 

TAGTGGATCGTCGAGAGAATATTCTGATGACGAGGACCTTGATGAAGAGAATGAAACCAC 

CGGTTCCTTGAAGCCAGAGGACGTTAAAAAATCTAGAAGGATGCTGTCAAATCGTGAGTC 

AGCTAGGCGATCTAGAAGGAGAAAGCAGGAGCAAACAAGTGACCTCGAAACACAGGTTAA 

TGATCTAAAAGGTGAGCATTCATCACTTCTTAAACAACTGAGCAACATGAATCACAAGTA 

TGACGAGGCTGCTGTTGGCAATAGAATACTAAAGGCTGACATTGAGACATTAAGAGCTAA 

GGTGAAAATGGCGGAAGAAACCGTGAAGAGAGTAACAGGAATGAATCCGATGCTTCTCGG 

AAGATCAAGTGGACATAACAACAACAACAGAATGCCAATAACTGGTAACAACAGGATGGA 

TTCTTCTAGCATTATTCCAGCTTATCAACCACACTCAAACCTAAACCATATGTCAAACCA 

AAACATCGGGATCCCAACCATTCTACCTCCAAGACTCGGAAACAATTTCGCTGCTCCTCC 

ATCCCAAACCAGCTCTCCCTTGCAGAGAATTAGAAATGGGCAAAATCACCATGTTACTCC 

AAGCGCCAACCCGTATGGCTGGAATACCGAACCTCAGAACGATTCAGCATGGCCGAAAAA 

ATGCGTGGACTGATCAAACAAGAAGCGGGTTTCGCACTATATTAATGTCTATGCATCTGT 

AATTTGTAAGTGTTATTAAGTTACGAATCATGAGAAAACATCTTGTGAAAATACAGTCTC 

ATGGCTTATATATATATATAAGCTCTGTCTTATAACATTACAAGATTCTTATTTGAGAAT 

CGTCTTTCTATTTATAGCTAATAAAAAAAAAAAAAAAAA 

>G567 Amino Acid Sequence (domain in AA coordinates: 210-270)

MNSIFSIDDFSDPFWETPPIPLNPDSSKPVTADEVSQSQPEWTFEMFLEEISSSAVSSEP 

LGNNNNAIVGVSSAQSLPSVSGQNDFEDDSRFRDRDSGNLDCAAPMTTKTVNVDSDDYRR

VLKNKLEAECATGVSLRVGSVKPEDSTSSPETQLQPVQSSPLTQGELGVTSSLPAEVKKT 

GVSMKQVTSGSSREYSDDEDLDEENETTGSLKPEDVKKSRRMLSNRESARRSRRRKQEQT

SDLETQVNDLKGEHSSLLKQLSNMNHKYDEAAVGNRILKADIETLRAKVKMAEETVKRVT

GMNPMLLGRSSGHNNNNRMPITGNNRMDSSSIIPAYQPHSNLNHMSNQNIGIPTILPPRL

GNNFAAPPSQTSSPLQRIRNGQNHHVTPSANPYGWNTEPQNDSAWPKKCVD*
>G680 (338..2275)

CAGTTATCTTCTTCCTTCTTCTCTCTGTTTTTTAAATTTATTTTTAGAGAATTTTT 

TrTTGCTTCCGATTTGATTATTTCCGGGAACGATGACTTCTCCGGGGAGTTCCCGGTGAG 

ATGATAAGTCAGATTGCATACTTGTCTCCTCCATGGCTACTCTCAAGGGTrTTCGCTGCG 

GTGGATTCGTTTGGTTTCTCTAGAATCTAAAGAGGTTATCACAACGGCTTTGCAATTTGA 

AAACTTTCATGTTTGGGGAGATCAAAGATGGTTTCTTTTTTATACTTTACTTGTTAGAGA 

GGATTTGAAGCAGCGAATAGCTGCAACCGGTCCTGTTATGGATACTAATACATCTGGAGA 

AGAATTATTAGCTAAGGCAAGAAAGCCATATACAATAACAAAGCAGCGAGAGCGATGGAC 

TGAGGATGAGCATGAGAGGTTTCTAGAAGCCTTGAGGCTTTATGGAAGAGCTTGGCAACG 

AATTGAAGAACATATTGGGACAAAGACTGCTGTTCAGATCAGAAGTCATGCACAAAAGTT 
CTTCACAAAGTTGGAGAAAGAGGCTGAAGTTAAAGGCATCCCTGTTTGCCAAGCTTTGGA
CATAGAAATTCCGCCTCCTCGTCCTAAACGAAAACCCAATACTCCTTATCCTCGAAAACC 
TGGGAACAACGGTACATCTTCCTCTCAAGTATCATCAGCAAAAGATGCAAAACTTGTTTC 
ATCGGCCTCTTCTTCACAGTTGAATCAGGCGTTCTTGGATTTGGAAAAAATGCCGTTCTC 
TGAGAAAACATCAACTGGAAAAGAAAATCAAGATGAGAATTGCTCGGGTGTTTCTACTGT 
GAACAAGTATCCCTTACCAACGAAACAGGTAAGTGGCGACATTGAAACAAGTAAGACCTC 
AACTGTGGACAACGCGGTTCMGATGTTCCCAAGAAGAACAAAGACAAAGATGGTAACGA 
TGGTACTACTGTGCACAGCATGCAAAACTACCCTTGGCATTTCCACGCAGATATTGTGAA 
CGGGAATATAGCAAAATGCCCTCT^AAATCATCCCTCAGGTATGGTATCTCAAGACTTCAT 
GTTTCATCCTATGAGAGAAGAAACTCACGGGCACGCAAATCTTCAAGCTACAACAGCATC 
TGCTACTACTACAGCTTCTCATCAAGCGTTTCCAGCTTGTCATTCACAGGATGATTACCG 
TTCGTTTCTCCAGATATCATCTACTTTCTCCAATCTTATTATGTCAACTCTCCTACAGAA 
TCCTGCAGCTCATGCTGCAGCTACATTCGCTGCTTCGGTCTGGCCTTATGCGAGTGTCGG 
GAATTCTGGTGATTCATCAACCCCAATGAGCTCTTCTCCTCCAAGTATAACTGCCATTGC
CGCTGCTACAGTAGCTGCTGCAACTGCTTGGTGGGCTTCTCATGGACTTCTTCCTGTATG 
CGCTCCAGCTCCAATAACATGTGTTCCATTCTCAACTGTTGCAGTTCCAACTCCAGCAAT 
GACTGAAATGGATACCGTTGAAAATACTCAACCGTTTGAGAAACAAAACACAGCTCTGCA 
AGATCAAACCTTGGCTTCGAAATCTCCAGCTTCATCATCTGATGATTCAGATGAGACTGG 
AGTAACCAAGCTAAATGCCGACTCAAAAACCAATGATGATAAAATTGAGGAGGTTGTTGT 
TACTGCCGCTGTGCATGACTCAAACACTGCCCAGAAGAAAAATCTTGTGGACCGCTCATC 
GTGTGGCTCAAATACACCTTCAGGGAGTGACGCAGAAACTGATGCATTAGATAAAATGGA 
GAAAGATAAAGAGGATGTGAAGGAGACAGATGAGAATCAGCCAGATGTTATTGAGTTAAA 
TAACCGTAAGATTAAAATGAGAGACAACAACAGCAACAACAATGCT^ACTACTGATTCGTG 
GAAGGAAGTCTCCGAAGAGGGTCGTATAGCGTTTCAGGCTCTCTTTGCAAGAGAAAGATT 
GCCTCAAAGCTTTTCGCCTCCTCAAGTGGCAGAGAATGTGAATAGAAAACAAAGTGACAC 
GTCAATGCCATTGGCTCCTAATTTCAAAAGCCAGGATTCTTGTGCTGCAGACCAAGAAGG 
AGTAGTAATGATCGGTGTTGGT^ACATGCAAGAGTCTTAAAACGAGACAGACAGGATTTAA 
GCCATACAAGAGATGTTCAATGGAAGTG7\AAGAGAGCCAAGTTGGGAACATAAACAATCA 
AAGTGATGAAAAAGTCTGCAAAAGGCTTCGATTGGAAGGAGAAGCTTCTACATGACAGAC 
TTGGAGGTAAAAAAAAAACATCCACATTTTTATCAATATCTTTAAATCTAGTGTTAGTAG 
TTTGCTTCTCCAATCTTTATGAAAGAGACTTTTAATTTTCCTTCCGAACATTTCTTTGGT 
CATGTCAGGTTCTGTACCATATTACCCCATGTCTTGTCTCTTGTCTCTGTTTGTGTATGC 
TACTTG'TGGTCTATATGTCATCTGCTACTACTGTTAATTAACCATTAAGCAATGGATTTG 
TCTTTA 

>G680 Amino Acid Sequence (domain in AA coordinates: 24-70) 
MDTNTSGEELLAKARKPYTITKQRERWTEDEHERFLEALRLYGRAWQRIEEHIGTKTAVQ 
IRSHAQKFFTKLEKEAEVKGIPVCQALDIEIPPPRPKRKPNTPYPRKPGNNGTSSSQVSS 
AKDAKLVSSASSSQLNQAFLDLEKMPFSEKTSTGKENQDENCSGVSTVNKYPLPTKQVSG
DIETSKTSTVDNAVQDVPKKNKDKDGNDG 

GMVSQDFMFHPMREETHGHANLQATTASATTTASHQAFPACHSQDDYRSFLQISSTFSNL
IMSTLLQNPAAHAAATFAASVWPYASVGNSGDSSTPMSSSPPSITAIAAATVAAATAWWA
SHGLLPVCAPAPITCVPFSTVAVPTPAMTEMDTVENTQPFEKQNTALQDQTLASKSPASS
SDDSDETGVTKLNADSKTNDDKIEEVVVTAAVHDSNTAQKKNLVDRSSCGSNTPSGSDAE
TDALDKMEKDKEDVKETDENQPDVIELNNRKIKMRDNNSKNNATTDSWKEVSEEGRIAFQ

ALFARERLPQSFSPPQVAENVNRKQSDTSMPLAPNFKSQDSCAADQEGVVMIGVGTCKSL

KTRQTGFKPYKRCSMEVKESQVGNINNQSDEKVCKRLRLEGEAST* 

>G867 (64..1098)

CACAACACAAACA(^TTCTGTTTTCTCCATTGTTTCAAACCATAAAAAAAAACACAGA^ 

TAAATGGAATCGAGTAGCGTTGATGAGAGTACTACAAGTACAGGTTCCATCTGTGAAACC 

CCGGCGATAACTCCGGCGAAAAAGTCGTCGGTAGGTAACTTATACAGGATGGGAAGCGGA 

TCAAGCGTTGTGTTAGATTCAGAGAACGGCGTAGAAGCTGAATCTAGGAAGCTTCCGTCG 

TCAAAATACA7VAGGTGTGGTGCCACAACCAAACGGAAGATGGGGAGCTCAGATO 

AAACACCAGCGCGTGTGGCTCGGGACATTCAACGAAGAAGACGAAGCCGCTCGTGCCTAC 

GACGTCGCGGTTCACAGGTTCCGTCGCCGTGACGCCGTCACAAATTTCAAAGACGTGAAG 

ATGGACGAAGACGAGGTCGATTTCTTGAATTCTCATTCGAAATCTGAGATCGTTGATATG

TTGAGGAAACATACTTATAACGAAGAGTTAGAGCAGAGTAAACGGCGTCGTAATGGTAAC 

GGAAACATGACTAGGACGTTGTTAACGTCGGGGTTGAGTAATGATGGTGTTTCTACGACG 



GGGTTTAGATCGGCGGAGGCACTGTTTGAGAAAGCGGTAACGCCAAGCGACGTTGGGAAG 

CTAAACCGTTTGGTTATACCGAAACATCACGCAGAGAAACATTTTCCGTTACCGTCAAGT 

AACGTTTCCGTGAAAGGAGTGTTGTTGAACTTTGAGGACGTTAACGGGAAAGTGTGGAGG 

TTCCGTTACTCGTATTGGAACAGTAGTCAGAGTTATGTTTTGACTAAAGGTTGGAGCAGG 

TTCGTTAAGGAGAAGAATCTACGTGCTGGTGACGTGGTTAGTTTCAGTAGATCTAACGGT 

CAGGATCAACAGTTGTACATTGGGTGGAAGTCGAGATCCGGGTCAGATTTAGATGCGGGT 

CGGGTTTTGAGATTGTTCGGAGTTAACATTTCACCGGAGAGTTCAAGAAACGACGTCGTA 

GGAAACAAAAGAGTGAACGATACTGAGATGTTATCGTTGGTGTGTAGCAAGAAGCAACGC 

ATCTTTCACGCCTCGTAACAACTCTTCTTCTTTTTTTTTCTTTTGTTGTTTTAATAATTT 

TTAAAAACTCCATTTTCGTTTTCTTTATTTGCATCGGTTTCTTTCTTCTTGTTTACCAM 

GGTTCATGAGTTGTTTTTGTTGTATTGATGAACTGTAAATTTTATTTATAGGATAAATTT 
TAAAAAAAAAAAAAAAAAAAA 

>G867 Amino Acid Sequence (domain in AA coordinates: 59-124) 

MESSSVDESTTSTGSICETPAITPAKKSSVGNLYRMGSGSSVVLDSENGVKAESRKLPSS

KYKGVVPQPNGRWGAQIYEKHQRVWLGTFNEEDEAARAYDVAVHRFRRRDAVTNFKDVKM

DEDEVDFLNSHSKSEIVDMLRKHTYNEELEQSKRRRNGNGNMTRTLLTSGLSNDGVSTTG 

FRSAEALFEKAVTPSDVGKLNRLVIPKHHAEKHFPLPSSNVSVKGVLLNFEDVNGKVWRF

RYSYWNSSQSYVLTKGWSRFVKEKNLRAGDVVSFSRSNGQDQQLYIGWKSRSGSDLDAGR 

VLRLFGVNISPESSRNDVVGNKRVNDTEMLSLVCSKKQRIFHAS*

>G956 (1..840)

ATGGAGGAGACAGAAAAGAATAAGGGCAGCATAAGTATGGTTGAGGCTAATCTACCTCCT 
GGTTTTAGATTCCATCCTAGAGACGACGAGCTCGTCTGTGACTACTTAATGAGAAGAACC 
GTTCGCAGCCTCTATCAACCAGTTGTCTTGATCGACGTCGATCTTAACAAATGCGAGCCT 
TGGGACATTCCTCAAACGGCGAGAGTGGGAGGGAAAGAATGGTACTTTTACAGCCAAAAA 
GACCGTAAATACGCAACAGGCTACAGAACAAACCGGGCTACGGCCACCGGTTATTGGAAA 
GCCACCGGGAAAGATAGAGCAATCCAAAGAAACGGTGGTCTTGTGGGTATGAGAAAGACA
CTTGTGTTTTACCGAGGTCGATCCCCTAAAGGTCGTAAAACTGATTGGGTCATGCATGAG 
TTTCGTCTCCAAGGAAAACTTCTTCACCACTCCCCTAATTCTCTCGAGGAAGAGTGGGTA
TTGTGTAGAGTTTTCCACAAGAACAGCAACGGAGCTGATATAGACGACATCACAAGGAGC 
TGCTCTGATGCAACAGCTTCTGCATTCATGGACTCTTACATCAACTTCGACCATCATCAC 
ATCATCAATCAGCATGTACCCTGCTTCTCGAATAATTTGTCACATAACCAAACCAACCAA 
TCCGGTTTAATCTCCAAGAACTCCAGCCCATTGTTTAATGCTTCCCCTGATCAAATGATT
CTCAGAACTTTGCTAAGTCAACTCACAAAAAAAGTCGAAGAATCACAGAGTCGTGGAGAC 
GGAAGCTCAGAGAGCCAATTGACCGACATTGGCATCCCAAGCCATGCATGGAATTACTGA 
>G956 Amino Acid Sequence (domain in AA coordinates: TBD) 

MEETEKNKGSISMVEANLPPGFRFHPRDDELVCDYLMRRTVRSLYQPVVLIDVDLNKCEP

WDIPQTARVGGKEWYFYSQKDRKYATGYRTNRATATGYWKATGKDRAIQRNGGLVGMRKT

LVFYRGRSPKGRKTDWVMHEFRLQGKLLHHSPNSLEEEWVLCRVFHKNSNGADIDDITRS 

CSDATASAFMDSYINFDHHHIINQHVPCFSNNLSHNQTNQSGLISKNSSPLFNASPDQMI

LRTLLSQLTKKVEESQSRGDGSSESQLTDIGIPSHAWNY* 

>G996 (53..1063)

CGATCGATCTTGAATTGATTCTTTGTAGTATTTTATTTACATATATATATAGATGGGAAG 
A(^TTCATGTTGTTACAAACAGAAACTGAGGAAAGGACTTTGGTCTCCTGAAGAAGAT^ 

GAAGCTTCTTCGTTACATCACTAAGTATGGTCATGGTTGCTGGAGCTCTGTCCCTAAACA
AGCTGGTTTACAGAGATGTGGAAAAAGTTGTAGATT^ 

AGATTTGAAGAGAGGAGCATTTTCTCAAGATGAAGAAAATCTCATTATTGAACTTCATC^ 

CGTTCTTGGCAATAGATGGTCTCAGATAGCTGCACAGCTTCCTGGAAGAACCGACAATGA 

AATCAAGAATCTTTGGAATTCTTGTTTGAAGAAGAAATTGAGGCTGAGAGGAATTGACCC 

^TTACACACAAGCTCTTAACCGAAATCGAAACCGGT^ 

TGAGAAGAGTCAACAGACCTACCTCGTTGAGACTGATGGCT 

TAGTACTAACCAAAACAACAACACTGATCATCTTTAT^ 

GTTAAGTCTAGAAAACGGTTCAAGAATCGCAGCCGGTTCTGACCTCGGTATCTGGATTCC 
CC^AACCGGAAGAAACCATCATCATCATGTCGATGAAACCATCCCTAGTGCAGTGGTACT 
ACCCGGTTCAATGTTCTCATCCGGTTTAACCGGTTATAGATCCTCCT^TCTCGGTTTAAT 
TGAATTGGAAAACTC^TTCTCAACCGGGCCAATGATGACAGAGCATCAGCAAATTCAAGA 
GAGTAACTACAACAATTCMCATTCTTTGGAAATGGGAATCTGAATTGGGGATTAACmT 
GGAGGAAAATCTU^AATCCATTCACAATATCGAATCATTCAAATTCG^ 
TATAAAATCAGAGACCAATTTTTTTGGCACAGAGGCTACAAATGTTGGTATGTGGCCATG
TAACCAGCTTCAGCCTCAGCAACATGCATATGGCCATATATAAATCTTCTTGTATATTAT 

AA 

>G996 Amino Acid Sequence (domain in AA coordinates: 14-114) 

MGRHSCCYKQKLRKGLWSPEEDEKLLRYITKYGHGCWSSVPKQAGLQRCGKSCRLRWINY 

LRPDLKRGAFSQDEENLIIELHAVLGNRWSQIAAQLPGRTDNEIKNLWNSCLKKKLRLRG

IDPVTHKLLTEIETGTDDKTKPVEKSQQTYLVETDGSSSTTTCSTNQNNNTDHLYTGNFG 

FQRLSLENGSRIAAGSDLGIWIPQTGRNHHHHVDETIPSAVVLPGSMFSSGLTGYRSSNL

GLIELENSFSTGPMMTEHQQIQESNYNNSTFFGNGNLNWGLTMEENQNPFTISNHSNSSL

YSDIKSETNFFGTEATNVGMWPCNQLQPQQHAYGHI* 

>G1946 (90..1547)

TCTCACCTATTGTAAAAATCACCAGTTTCGTATATAAAACCCTAATTTTCTCAAAATTCC 

CAAATATTGACTTGGAATCAAAAATCCGAATGGATGTGAGCAAAGTAACCACAAGCGACG 

GCGGAGGAGATTCAATGGAGACTAAGCCATCTCCTCAACCTCAGCCTGCGGCGATTCTAA 

GTTCAAACGCGCCTCCTCCGTTTCTGAGCAAGACCTATGATATGGTTGATGATCACAATA 

CAGATTCGATTGTCTCTTGGAGTGCTAATAACAACAGTTTTATCGTTTGGAAACCACCGG 

AGTTCGCTCGCGATCTTCTTCCTAAGAACTTTAAGCATAATAATTTCTCCAGCTTCGTTA 

GACAGCTTAATACCTATGGTTTCAGGAAGGTTGACCCAGATAGATGGGAATTTGCGAATG 

AAGGTTTTTTAAGAGGTCAGAAGCACTTGCTACAATCAATAACTAGGCGAAAACCTGCCC 

ATGGACAGGGACAGGGACATCAGCGATCTCAGCACTCGAATGGACAGAACTCATCTGTTA 

GCGCATGTGTTGAAGTTGGCAAATTTGGTCTCGAAGAAGAAGTTGAAAGGCTTAAAAGAG 

ATAAGAACGTCGTTATGCAAGAACTCGTCAGATTAAGACAGCAGCAACAGTCCACTGATA

ACCAACTTCAAACGATGGTTCAGCGTCTCCAGGGCATGGAGAATCGGCAACAACAATTAA 

TGTCATTCCTTGCAAAGGCAGTACAAAGCCCTCATTTTCTATCTCAATTCTTACAGCAGC 

AGAATCAGCAAAACGAGAGTAATAGGCGCATCAGTGATACCAGTAAGAAGCGGAGATTCA 

AGCGAGACGGCATTGTCCGTAATAATGATTCTGCTACTCCTGATGGACAGATAGTGAAGT 

ATCAACCTCCAATGCACGAGCAAGCCAAAGCAATGTTTAAACAGCTTATGAAGATGGAAC 

CTTACAAAACCGGCGATGATGGTTTCCTTCTAGGTAATGGTACGTCTACTACCGAGGGAA 

CAGAGATGGAGACTTCATCAAACCAAGTATCGGGTATAACTCTTAAGGAAATGCCTACAG 

CTTCTGAGATACAGTCATCATCACCAATTGAAACAACTCCTGAAAATGTTTCGGCAGCAT 

CAGAAGCAACCGAGAACTGTATTCCTTCACCTGATGATCTAACTCTTCCCGACTTCACTC 

ATATGCTACCGGAAAATAATTCAGAGAAGCCTCCAGAGAGTTTCATGGAACCAAACCTGG 

GAGGTTCTAGTCCATTACTAGATCCAGATCTGTTGATCGATGATTCTTTGTCCTTCGACA 

TTGACGACTTTCCAATGGATTCTGATATAGACCCTGTTGATTACGGTTTACTCGAACGCT 

TACTCATGTCAAGCCCGGTTCCAGATAATATGGATTCAACACCAGTGGACAATGAAACAG 

AGCAGGAACAAAATGGATGGGACAAAACTAAGCATATGGATAATCTGACTCAACAGATGG 

GTCTCCTCTCTCCTGAAACCTTAGATCTCTCAAGGCAAAATCCTTGATTTTGGGAGTTTT 

TAAAGTCTTTTGAGGTAACACAGTCCCTGAGAGCAGCATATTCAT 

>G1946 Amino Acid Sequence (domain in AA coordinates: 32-130) 

MDVSKVTTSDGGGDSMETKPSPQPQPAAILSSNAPPPFLSKTYDMVDDHNTDSIVSWSAN

NNSFIVWKPPEFARDLLPKNFKHNNFSSFVRQLNTYGFRKVDPDRWEFANEGFLRGQKHL

LQSITRRKPAHGQGQGHQRSQHSNGQNSSVSACVEVGKFGLEEEVERLKRDKNVLMQELV

RLRQQQQSTDNQLQTMVQRLQGMENRQQQLMSFLAKAVQSPHFLSQFLQQQNQQNESNRR

ISDTSKKRRFKRDGIVRNNDSATPDGQIVKYQPPMHEQAKAMFKQLMKMEPYKTGDDGFL

LGKGTSTTEGTEMETSSNQVSGITLKEMPTASEIQSSSPIETTPENVSAASEATENCIPS 
PDDLTLPDFTHMLPENNSEKPPESFMEPNLGGSSPLLDPDLLIDDSLSFDIDDFPMDSDI 
DPVDYGLLERLLMSSPVPDNMDSTPVDNETEQEQNGWDKTKHMDNLTQQMGLLSPETLDL
SRQNP*
>G217 (84..2618)

cttcgttcttaccgagfttccacgagcattagcttcagagaccttgaattggagtgcggtt 
ggatcaaaaacagttgagcgaagatgaggattatgattaagggaggtgtttggaagaaca 
ccgaagatgagattctcaaagccgccgtgatgaagtatggtaagaaccaatgggctcgga 
tctcgtctcttctcgttcgtaagtctgctaaacagtgtaaagctcgctggtacgagtggc 
tcgatccatctatcaaaaagactgaatggaccagagaagaagatgagaagcttctacatc 
ttgctaaacttctgcctactcaatggagaactattgctcctattgtgggtcgtacaccat 
ctcaatgtcttgagaggtatgagaagctccttgatgcagcatgcactaaggatgaaaatt 
atgatgcagcggatgatccacgaaaattacgtcctggtgagattgatccgaacccagaag 
caaagcctgctcgtcctgatccggtagacatggacgaagatgagaaagaaatgctttctg
aagcaagagctagattggctaacacgaggggaaagaaggctaaaagaaaagctagagaaa 
aacaacttgaggaagctagaaggcttgcttctctgcaaaaaagaagagaactaaaagcag 
ctgggattgatggaaggcataggaaaagaaagagaaagggaatcgactataatgcagaaa 
ttccttttgaaaagagggcacctgcgggattttatgatactgcggatgaagatcgtcctg 
ctgatcaagtaaaatttccaactaccattgaagaacttgaaggaaaaagaagagctgatg 
tagaagcacatttacgcaaacaagatgttgcaaggaataaaattgctcagagacaggatg 
ctccagcagctatattgcaagcaaacaagctgaatgatccggaagttgttaggaagaggt 
caaagctgatgttaccaccaccgcagatttcagaccacgagctagaagaaattgctaaga 
tgggctatgccagtgaccttcttgccgagaatgaggagctaacagaaggcagtgctgcta 
ctcgtgcacttttggcaaattactcacaaacaccaaggcaaggaatgacacccatgagga 
cacctcaaagaactcctgctggtaaaggtgatgctattatgatggaagcagaaaacctgg 
ccagattaagagactctcagacacctttgctaggaggagaaaatcctgagttgcaccctt 
ctgacttcactggggtcactccgagaaagaaggagattcaaacgcctaatccaatgttga 
ccccttcaatgactcctggtggtgctggtcttactccaagaattggcttgacgccatcaa 
gggatgggtcttctttttctatgacacccaaagggactcccttcagggatgaacttcaca 
ttaacgaagacatggacatgcagcaaagtgcaaaacttgagaggcagagacgagaggaag
ctagaaggagtttacgctctggtttgactgggcttcctcagccaaagaacgagtaccaaa 
tagttgcacaacctcctcctgaggaaagtgaagagccagaagagaaaattgaggaagaca 
tgtcagacaggatagcgagggaaaaggcggaggaagaagcaagacaacaggcattgctta 
agaagagatccaaggtcttgcagagagatcttcctagacccccagctgcttcattggcag 
taattaggaactcgttgctttcagctgatggagacaaaagttctgttgttcctcctactc 



cgattgaggttgcagataaaatggtaagagaggagcttctacagttgctggagcatgata

atgcaaagtatccgcttgatgacaaagctgagaagaagaaaggagccaagaaccg*— 

accgttctgcttctcaagttcttgcaattgacgattttgatgaaaatgagctcca- 

ctgacaaaatgataaaggaggaggggaagtttctgtgtgtgtcaatgggacatga. 

agacacttgatgattttgtagaagctcacaacacatgcgtgaatgatctcatgtatttcc 

ccactcgaagcgcttacgagctctcaagtgttgctgggaacgcggacaaagttgcagctt 

ttcaggaggagatggagaatgtgagaaaaaagatggaggaggatgagaagaaggcagaac 
acatgaaggccaagtacaaaacttatacaaagggtcatgagaggagggcagagaccgtgt 
ggacccaaatagaggcgacattgaagcaggctgagattggtggaacagaagtagagtgct 
ttaaagcattgaagaggcaagaagagatggctgcatcttttaggaaaaagaatttgcaag 
aggaagtgataaagcaaaaggaaacagagagtaaactgcagactcgctatgggaatatgt 
tggcaatggttgaaaaagcagaggagataatggtcggtttccgagcacaggcattgaaga
aacaagaggatgttgaagattctcacaaactgaaagaagctaagctagccactggagagg
aagaggacatagccatagccatggaagcttctgcataaaaacttgagttttgtattgctt 

acaagttttaaggagacgtagcttgactttgtattggtaagtttttttaatatgagtcat 

gactttgtaaaaaggttatgatatattctctgtttgtatgctttgcaagagtcaagaaat 

ttgaatgcttcaggatcaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa 
>G217 Amino Acid Sequence (conserved domain in AA coordinates: 8-67)

MRIMIKGGVWKNTEDEILKAAVMKYGKNQWARISSLLVRKSAKQCKARWYEWLDPSIKKT 

EWTREEDEKLLHLAKLLPTQWRTIAPIVGRTPSQCLERYEKLLDAACTKDENYDAADDPR

KLRPGEIDPNPEAKPARPDPVDMDEDEKEMLSEARARLANTRGKKAKRKAREKQLEEARR

LASLQKRRELKAAGIDGRHRKRKRKGIDYNAEIPFEKRAPAGFYDTADEDRPADQVKFPT 

TIEELEGKRRADVEAHLRKQDVARNKIAQRQDAPAAILQANKLNDPEVVRKRSKLMLPPP 

QISDHELEEIAKMGYASDLLAENEELTEGSAATRALLANYSQTPRQGMTPMRTPQRTPAG 

KGDAIMMEAENLARLRDSQTPLLGGENPELHPSDFTGVTPRKKEIQTPNPMLTPSMTPGG

AGLTPRIGLTPSRDGSSFSMTPKGTPFRDELHINEDMDMQQSAKLERQRREEARRSLRSG

LTGLPQPKNEYQIVAQPPPEESEEPEEKIEEDMSDRIAREKAEEEARQQALLKKRSKVLQ 

RDLPRPPAASLAVIRNSLLSADGDKSSVVPPTPIEVADKMVREELLQLLEHDNAKYPLDD 
KAEKKKGAKNRTNRSASOVLAIDDroEMEIiOE&DTfMTIfWl?f2iri?T.rnTc?M/-iiimTTrrTiT 



AHNTCVNDLMYFPTRSAYELSSVAGNADKVAAFQEEMENVRKKMEEDEKKAEHMKAKYKT

YTKGHERRAETVWTQIEATLKQAEIGGTEVECFKALKRQEEMAASFRKKNLQEEVIKQKE

TESKLQTRYGNMLAMVEKAEEIMVGFRAQALKKQEDVEDSHKLKEAKLATGEEEDIAIAM
EASA* 

>G2192 (92..2971)

CGGAAAGAGATCAACCAACGATAGAGGAGAAGAAGAACTTGCATACGCAAAAAAACTTTC 
CCGGGAAAATTCCAGAAACTGCTTTGGAAAAATGTGCGAGCCCGATGATAATTCCGCTAG

AAACGGCGTCACTACTCAACCTTCGAGGTCAAGGGAGCTTCTAATGGATGTTGACGACTT 

AGATCTTGACGGTTCATGGCCACTAGATCAAATCCCTTACTTATCCTCATCGAATCGCAT 

GATTTCTCCGATTTTTGTCTCCTCTTCCTCTGAGCAGCCTTGCTCGCCTCTCTGGGCTTT 

CTCCGACGGTGGAGGAAATGGTTTTCACCACGCAACCTCCGGTGGCGATGATGAGAAGAT 

CAGCTCTGTCTCCGGTGTTCCTTCTTTCCGTCTCGCCGAGTATCCTCTCTTCCTCCCTTA 

CTCTTCTCCATCAGCAGCTGAGAACACAACAGAGAAGCATAACAGTTTCCAGTTTCCGTC 

TCCATTGATGAGCCTAGTCCCACCAGAGAACACAGACAACTACTGTGTGATCAAAGAGAG 

GATGACTCAGGCGCTTCGATACTTCAAAGAATCAACCGAACAACACGTTTTGGCTCAGGT 

CTGGGCTCCTGTGAGAAAGAATGGTCGTGATTTGCTGACGACTTTGGGTCAACCTTTTGT 

TCTTAATCCTAATGGTAATGGGCTTAATCAATACAGGATGATCTCTCTCACATATATGTT 

TTCTGTGGATAGTGAAAGTGACGTAGAGCTCGGACTCCCGGGTCGAGTTTTCCGTCAGAA 

ATTGCCTGAATGGACTCCAAATGTTCAGTACTATTCCAGCAAAGAATTCTCGCGGCTTGA 

TCACGCCTTGCACTACAACGTGCGTGGTACACTGGCCTTGCCTGTCTTTAATCCCTCTGG 

TCAGTCCTGCATAGGTGTTGTGGAACTTATAATGACCTCAGAGAAGATTCACTATGCACC 

CGAAGTGGACAAAGTTTGCAAAGCCCTTGAGGCGGTAAATCTGAAAAGCTCGGAAATACT 

TGATCACCAAACAACACAGATATGCAATGAGAGTCGCCAAAACGCGCTTGCTGAGATTCT

CGAAGTGCTGACAGTTGTATGTGAGACCCATAACTTGCCTCTCGCTCAGACTTGGGTTCC 

ATGTCAGCATGGGAGCGTTCTTGCCAATGGTGGCGGTCTAAAGAAAAACTGCACCAGCTT 

TGACGGTAGCTGCATGGGTCAAATCTGCATGTCTACAACCGACATGGCCTGCTATGTCGT 

GGATGCTCATGTCTGGGGCTTTAGAGATGCCTGTCTTGAACACCATCTCCAGAAAGGCCA 

GGGAGTCGCTGGACGAGCTTTTCTCAATGGTGGCTCATGTTTCTGCAGAGACATCACCAA 

GTTCTGCAAAACGCAGTACCCACTAGTCCATTATGCGCTCATGTTCAAGTTGACCACTTG 

TTTTGCAATATCTCTCCAGAGCTCTTACACGGGCGACGACAGTTACATTCTTGAATTTTT 

TCTTCCTTCGAGTATAACAGACGACCAAGAGCAAGATTTGCTGTTGGGTTCTATTTTGGT 

GACAATGAAAGAACATTTTCAGAGTCTGAGGGTTGCATCTGGGGTTGACTTTGGTGAAGA 

TGACGACAAATTGTCTTTCGAGATCATCCAAGCATTACCGGACAAGAAGGTTCATTCAAA 

AATAGAATCCATTCGAGTTCCCTTTTCTGGTTTTAAGTCAAATGCAACAGAGACGATGTT 

GATTCCTCAGCCTGTGGTTCAGTCTTCTGATCCAGTAAATGAGAAAATCAACGTGGCCAC 

TGTTAACGGTGTGGTTAAGGAGAAGAAGAAAACAGAGAAAAAGCGTGGGAAGACTGAGAA 

AACAATCAGTCTAGATGTACTTCAGCAGTATTTCACTGGAAGTCTCAAAGACGCTGCAAA 

GAGCCTAGGAGTTTGCCCGACGACAATGAAGCGAATTTGCAGGCAACACGGAATCTCGCG 

GTGGCCATCGAGGAAGATCAAGAAAGTGAATCGTTCAATCACAAAGCTGAAACGAGTCAT 

CGAATCTGTTCAAGGTACTGATGGAGGCCTCGACCTGACTTCCATGGCCGTTAGTTCCAT 

CCCTTGGACACACGGTCAAACATCAGCACAGCCACTAAACTCACCCAATGGTTCCAAACC 

ACCTGAGCTACCAAACACCAATAATTCACCTAACCATTGGTCAAGTGATCACAGTCCGAA 

CGAGCCAAATGGTTCGCCTGAGTTACCACCAAGCAATGGTCACAAGCGATCACGAACGGT 

GGATGAGAGCGCTGGGACTCCAACCTCTCATGGCTCATGTGACGGTAACCAATTAGATGA 

ACCGAAAGTCCCAAATCAAGATCCGCTCTTCACGGTTGGTGGATCACCCGGGCTCCTTTT 

TCCACCTTATTCTAGAGATCATGATGTATCTGCAGCTTCCTTCGCAATGCCGAACAGGCT 

TCTTGGTTCTATAGACCATTTCCGAGGAATGCTCATTGAAGACGCTGGAAGTTCAAAAGA 

TCTGAGAAATCTCTGCCCCACTGCAGCATTTGACGATAAGTTTCAAGACACAAACTGGAT 

GAAC^TGATAATAATAGCAACAACAACTTATACGCTCCCCCAAAGGAAGAGGCCATTGC 

AAATGTTGCATGCGAACCATCAGGCTCAGAAATGAGAACGGTAACAATCAAAGCAAGTTA 

CAAAGACGACATAATACGGTTCAGAATATCCTCGGGTTCAGGTATAATGGAATTGAAGGA 

TGAAGTGGCTAAGAGGCTGAAAGTTGATGCAGGAACGTTCGATATCAAGTATCTTGACGA 

TGATAACGAATGGGTTTTAATAGCTTGTGATGCTGATCTTCAAGAATGTCTCGAGATCCC 

TAGATCCTCCCGCACGAAAATCGTAAGGCTCTTAGTTCATGATGTAACGACAAATCTAGG

GAGCTCCTGCGAGAGCACTGGAGAATTGTGACCTGATAATTCATTCGAACTCTTTTGTAA 

ATAG 

>G2192 Amino Acid Sequence (conserved domain in AA coordinates: 600-700)
MCEPDDNSARNGVTTQPSRSRELLMDVDDLDLDGSWPLDQIPYLSSSNRMISPIFVSSSS
EQPCSPLWAFSDGGGNGFHHATSGGDDEKISSVSGVPSFRLAEYPLFLPYSSPSAAENTT 
EKHNSFQFPSPLMSLVTPPENTDNYCVIKERMTQALRYF^ 

LLTTLGQPFVLNPNGNGLNQYRMISLTYMFSVDSESDVELGLPGRVFRQKLPEWTPNVQY 
YSSKEFSRLDHALHYNWGTLAIjPVFNPSGQSCIGW 

AVNLKSSEILDHQTTQICNESRQNALAEILEVLTVVCETHNLPLAQTWVPCQHGSVLANG
GGLKKNCTSFDGSCMGQICMSTTDMACYVVDAHVWGFRDACLEHHLQKGQGVAGRAFLNG
GSCFCRDITKFCKTQYPLVHYALMFKLTTCFAISLQSSYTGDDSYILEFFLPSSITDDQE 
QDLLLGSILVTMKEHFQSLRVASGVDFGEDDDKLSFEIIQALPDKKVHSKIESIRVPFSG

FKSNATETMLIPQPVVQSSDPVNEKINVATVNGVVKEKKKTEKKRGKTEKTISLDVLQQY
FTGSLKDAAKSLGVCPTTMKRICRQHGISRWPSRKIKKVNRSITKLKRVIESVQGTDGGL
DLTSMAVSSIPWTHGQTSAQPLNSPNGSKPPELPNTNNSPNHWSSDHSPNEPNGSPELPP 
SNGHKRSRTVDESAGTPTSHGSCDGNQLDEPKVPNQDPLFTVGGSPGLLFPPYSRDHDVS 
AASFAMPNRLLGSIDHFRGMLIEDAGSSKDLRNLCPTAAFDDKFQDTNWMNITONNSNNNL
YAPPKEEAIANVACEPSGSEMRTVTIKASYKDDIIRFRISSGSGIMELKDEVAKRLKVDA 

GTFDIKYLDDDNEWVLIACDADLQECLEIPRSSRTKIVRLLVHDVTTNLGSSCESTGEL* 
>G504 (69.. 1040) 

CGTCGACCTCTTGACGATCATGAGACTGATTTCGTGAAAATATCGTCATTATATCAAATT 
AGAAGTTGATGGAAAACATGGGGGATTCGAGCATAGGGCCGGGCCATCCGCATCTCCCTC 
CCGGGTTTCGGTTTCACCCGACTGATGAGGAACTAGTAGTTCATTACCTCAAGAAGAAAG 
CAGATTCTGTTCCACTTCCAGTCTCAATCATCGCAGAGATTGATCTTTACAAGTTTGATC 
CTTGGGAGCTTCCAAGCAAGGCGAGTTTTGGAGAGCACGAGTGGTACTTCTTTAGTCCTC 
GGGATCGGAAGTATCCAAATGGGGTTAGGCCAAACCGGGCAGCAACTTCCGGTTATTGGA 
AAGCAACGGGAACCGATAAACCGATATTTACGTGCAATAGTCACAAGGTTGGTGTCAAGA 
AAGCGCTTGTTTTTTACGGTGGAAAGCCTCCTAAAGGGATAAAAACAGATTGGATCATGC 
ATGAATATCGCCTCACTGATGGTAACCTTAGCACTGCGGCTAAGCCGCCTGACTTAACCA

CGACAAGGAAAAACTCACTACGGCTAGACGATTGGGTTCTATGTAGGATCTATAAGAAGA 

ATAGTTCACAAAGACCAACAATGGAGAGAGTATTACTTAGAGAGGATCTAATGGAAGGCA 

TGCTCTCAAAATCATCTGCTAATTCTTCTTCTACATCAGTACTAGACAACAACGACAACA 

ATAATAACAATAACGAAGAACACTTTTTCGACGGTATGGTCGTTTCTTCAGACAAACGTT 

CCTTGTGTGGTCAATACCGAATGGGCCACGAGGCCTCAGGATCATCTTCATTCGGATCTT 

TCTTATCGAGCAAGAGGTTTCATCATACAGGTGATCTCAACAATGATAACTACAATGTCT 

CTTTTGTTTCGATGCTTAGTGAGATTCCTCAGAGTTCGGGGTTTCATGCAAATGGTGTTA 

TGGATACGACGTCGTCTCTAGCTGATCATGGGGTTTTAAGACAGGCGTTTCAGCTTCCTA 

ACATGAACTGGCACTCATAATCTATATAGATATATATGTGTGTATCATATATGTATCTAT 

GCAGGCCTAATATAGTTTACACATAAATCATCTGGGGCGGCCGCT 

>G504 Amino Acid Sequence (domain in AA coordinates: TBD) 

MENMGDSSIGPGHPHLPPGFRFHPTDEELVVHYLKKKADSVPLPVSIIAEIDLYKFDPWE
LPSKASFGEHEWYFFSPRDRKYPNGVRPNRAATSGYWKATGTDKPIFTCNSHKVGVKKAL
VFYGGKPPKGIKTDWIMHEYRLTDGNLSTAAKPPDLTTTRKNSLRLDDWVLCRIYKKNSS
QRPTMERVLLREDLMEGMLSKSSANSSSTSVLDNNDNNNNNNEEHFFDGMVVSSDKRSLC

GQYRMGHEASGSSSFGSFLSSKRFHHTGDLNNDNYNVSFVSMLSEIPQSSGFHANGVMDT 
TSSLADHGVLRQAFQLPNMNWHS*
>G622 (248.. 2620) 

TCTTTCTTTCTTCAATTCGCCGTCAAAATCTTCTCTTTCTCTTCCCCCGCCGGTCCTTCA 
CCAATCCTCTGATCTCTCTACACACGAACCTTTGATTTTGACCAACGTCGATGCATGTTC 
A^GACTAGTCTCTTCCTCAATCCTTCAATTTCATCAATTCACGTCGATTTCGTATCCGAT 
TCGTTGTTCTAGCTCTTTGTGTGGTGTTAGGGTTTTAAGATTTTGGAATTGGGGTTTGGA 
GTTTGTGATGTTTGAAGTCAAAATGGGGTCAAAGATGTGCATGAACGCTTCATGTGGTAC
GACTTCTACTGTTGAATGGAAGAAAGGTTGGCCTCTTCGATCTGGTCTTCTCGCTGATCT 
CTGTTJVTCX3TTGCGGATCTGCGTATGAGAGTTCTCTATTCT 

CCAATCTGGTTGGAGGGAATGCTATTTGTGTAGCAAGAGACTACATTGTGGATGCATTGC
TTCTAAGGTAACGATTGAGTTAATGGACTATGGTGGTGTTGGTTGTAGTACATGTGCTTG 
CTGC^TCAACTCAATTTGAACACAAGGGGTC^ 

AATGAAAACGTTAGCTGATAGGCAACATGTAAATGGCGAAAGCGGAGGAAGAAACGAAGG 
CGATCTCTTTTCTCAGCCACTAGTCATGGGCGGAGATAAAAGGGAAGAGTTCATGCCTCA 
CCGTGGGTTTGGTAAGCTAATGAGTCCAGAAAGTACAACCACCGGGCATAGGCTGGATGC 
TGCTGGGGAAATGCATGAATCATCACCTTTACAGCCATCTTTAAATATGGGTTTGGCTGT 
GAATCCGTTTAGCCCATCTTTTGCAACCGAGGCTGTCGAGGGAATGAAACACATCAGTCC 
TTCTCAGTCCAACATGGTCCATTGCTCTGCTTCTAATATACTGCAAAAGCCATCAAGACC 
TGCTATTTCAACTCCTCCTGTGGCTAGTAAATCCGCTCAGGCGCGGATTGGAAGGCCTCC 
TGTCGAAGGGCGAGGGAGAGGCCACTTGCTTCCGCGGTATTGGCCAAAATATACGGATAA 
AGAGGTTCAGCAGATCTCTGGAAATTTGAATTTGAACATTGTACCTCTCT^ 
TCTTAGTGCCAGTGATGCTGGTCGCATTGGTCGTCTAGTTCTTCCAAAAGCCTGTGCAGA

GGCATATTTTCCTCCGATTAGTCAATCCGAAGGCATTCCTTTGAAAATCCAAGATGTGAG 

GGGTAGGGAGTGGACGTTCCAGTTCAGATATTGGCCCAATAACAATAGTAGAATGTATGT 

TTTAGAAGGTGTCACTCCATGCATACAGTCCATGATGCTACAGGCTGGTGATACAGTAAC 

TTTCAGTCGGGTTGATCCTGGCGGAAAACTAATCATGGGTTCCAGGAAGGCAGCTAATGC 

TGGAGACATGCAGGGTTGTGGGCTCACCAACGGAACATCAACTGAGGACACATCATCGTC 

TGGTGTAACAGAAAACCCACCCTCCATAAATGGTTCCTCGTGTATTTCACTAATACCGAA 

AGAGTTGAATGGTATGCCTGAGAATTTGAACAGTGAGACTAACGGGGGCAGGATAGGTGA 

TGATCCTACACGAGTTAAAGAGAAGAAGAGAACTCGAACCATTGGTGCAAAAAATAAGAG 

ACTTCTTTTGCATAGTGAAGAATCTATGGAGCTGAGACTCACTTGGGAAGAAGCTCAGGA 

CTTGCTTCGTCCCTCTCCTAGTGTAAAGCCTACCATCGTTGTCATTGAGGAGCAAGAAAT 

TGAAGAATATGACGAACCTCCTGTCTTTGGAAAGAGGACTATAGTCACTACAAAACCTTC 

AGGTGAACAGGAACGATGGGCAACTTGCGACGACTGCTCTAAATGGAGAAGGTTACCTGT 

AGATGCTCTTCTTTCCTTTAAATGGACATGTATAGACAATGTTTGGGATGTGAGTAGGTG 

TTCATGTTCTGCACCGGAGGAGAGTCTGAAGGAACTTGAGAATGTTCTTAAAGTAGGTAG 

AGAGCACAAGAAGAGAAGAACTGGGGAAAGACAGGCAGCACAAAGTCAGCAAGAACCGTG 

TGGTTTGGACGCACTGGCGAGTGCAGCAGTCTTAGGAGACACAATAGGCGAGCCAGAGGT 

AGCGACCACGACCAGACATCCAAGGCACAGGGCTGGATGCTCTTGCATCGTGTGCATTCA 

GCCACCAAGTGGGAAAGGTAGGCACAAGCCTACATGTGGCTGCACTGTGTGTAGCACCGT

GAAGAGAAGGTTCAAGACGCTTATGATGAGGAGGAAGAAGAAGCAGTTGGAGCGCGATGT 

AACAGCAGCAGAAGATAAGAAGAAGAAGGACATGGAACTGGCTGAGTCTGATAAGAGTAA 

GGAGGAGAAGGAAGTGAACACAGCGAGAATAGACCTGAACAGTGATCCATACAATAAAGA 

AGATGTTGAAGCTGTTGCGGTGGAGAAAGAAGAGAGTCGAAAAAGAGCAATAGGACAGTG 

TTCGGGCGTGGTGGCTCAAGACGCCAGTGATGTTTTAGGAGTTACAGAGTTAGAAGGAGA 

GGGTAAGAATGTTCGTGAAGAGCCGAGAGTTTCAAGCTGATATGGAAA 

>G622 Amino Acid Sequence (domain in AA coordinates: TBD) 

MFEVKMGSKMCMNASCGTTSTVEWKKGWPLRSGLLADLCYRCGSAYESSLFCEQFHKDQS 

GWRECYLCSKRLHCGCIASKVTIELMDYGGVGCSTCACCHQLNLNTRGENPGVFSRLPMK 

TLADRQHVNGESGGRNEGDLFSQPLVMGGDKREEFMPHRGFGKLMSPESTTTGHRLDAAG 

EMHESSPLQPSLNMGLAVNPFSPSFATEAVEGMKHISPSQSNMVHCSASNILQKPSRPAI 

STPPVASKSAQARIGRPPVEGRGRGHLLPRYWPKYTDKEVQQISGNLNLNIVPLFEKTLS 

ASDAGRIGRLVLPKACAEAYFPPISQSEGIPLKIQDVRGREWTFQFRYWPNNNSRMYVLE

GVTPCIQSMMLQAGDTVTFSRVDPGGKLIMGSRKAANAGDMQGCGLTNGTSTEDTSSSGV 

TENPPSINGSSCISLIPKELNGMPENLNSETNGGRIGDDPTRVKEKKRTRTIGAKNKRLL

LHSEESMELRLTWEEAQDLLRPSPSVKPTIVVIEEQEIEEYDEPPVFGKRTIVTTKPSGE

QERWATCDDCSKWRRLPVDALLSFKWTCIDKT^^ 

KKRRTGERQAAQSQQEPCGLDALASAAVLGDTIGEPEVATTTRHPRHRAGCSCIVCIQPP 
SGKGRHKPTCGCTVCSTVKRRFKTLMMRRKKKQ 

KEVNTARIDLNSDPYNKEDVEAVAVEKEESRKRAIGQCSGVVAQDASDVLGVTELEGEGK

NVREEPRVSS* 

>G778 (50.. 1249) 

TCTCAATAACACAAAACCTTTTAAACTAGTAAAATACACAGATTTTAGGATGAGCCAATG
TGTTCCAAACTGTCACATCGATGATACTCCGGCAGCAGCCACCACCACCGTCCGCTCCAC 
CACAGCCGCAGACATCCCCATATTAGACTACGAGGTAGCCGAGCTGACGTGGGAGAACGG 
GCAACTAGGCTTGCACGGCTTAGGTCCACCGCGAGTGACGGCTTCGTCGACCAAGTACTC 
CACAGGCGCCGGTGGAACGTTGGAGTCGATAGTGGACCAAGCTACTCGCCTCCCTAACCC 
TAAGCCCACGGATGAGCTCGTCCCGTGGTTCCATCATCGCTCCTCCAGGGCCGCGATGGC 
AATGGACGCGCTTGTCCCTTGCTCCAACCTAGTACACGAGCAGCAGAGCAAGCCTGGTGG
CGTTGGCTCCACCCGGGTGGGGTCATGTAGCGATGGTCGTACCATGGGCGGTGGAAAACG 
AGCAAGAGTGGCACCGGAGTGGAGCGGCGGCGGGAGTCAGCGGCTGACCATGGACACTTA 
CGACGTAGGTTTCACCTCAACATCAATGGGCTCGCACGATAACACAATCGACGATCATGA 
CTCCGTCTGCCACAGCCGCCCACAGATGGAGGACGAAGAAGAGAAGAAAGCCGGAGGAAA
ATCATCAGTTTCAACCAAGAGAAGCAGAGCTGCTGCTATTCATAACCAATCCGAACGTAA 
GAGGAGAGATAAAATCAATCAAAGGATGAAGACTTTGCAAAAACTGGTTCCCAATTCCAG
CAAGACGGATAAAGCATCTATGTTGGATGT^GTGATAGAGTATTTGAAGCAACTTCAAGC 
ACAAGTGAGCATGATGAGCAGAATGAATATGCCTTCTATGATGCTTCCTATGGCCATGCA 
GCAACAACAACAACTACAAATGTCTCTCATGTCCAATCCCATGGGTTTAGGGATGGGCAT 



GGGGATGCCCGGTCTCGGTCTCCTCGACCTTAATTCTATGAACCGAGCTGCTGCAAGCGC 
TCCTAATATCCATGCCAACATGATGCCAAACCCATTTTTGCCCATGAATTGTCCATCGTG 
GGATGCTTCTTCCAATGACTCTCGATTTCAGTCTCCTCTCATCCCCGATCCTATGTCTGC 
CTTTCTTGCATGCTCTACTCAGCCAACGACGATGGAAGCGTATAGCAGGATGGCTACATT 
ATATCAGCAAATGCAACAACAACTTCCTCCTCCTTCGAATCCAAAATGATTATTACTCAA 
ACACCTCTATATAGTTTACGTCTATATATGTGTTAGTCACATACATACATATATATATTC 
CATCATAATTATTTATTTATATGTATAGGCTTCTCATGAATTATGATATTATACGTATTA 
CGTAAAAAA 

>G778 Amino Acid Sequence (domain in AA coordinates: 220-267) 

MSQCVPNCHIDDTPAAATTTVRSTTAADIPILDYEVAELTWENGQLGLHGLGPPRVTASS

TKYSTGAGGTLESIVDQATRLPNPKPTDELVPWFHHRSSRAAMAMDALVPCSNLVHEQQS

KPGGVGSTRVGSCSDGRTMGGGKRARVAPEWSGGGSQRLTMDTYDVGFTSTSMGSHDNTI 

DDHDSVCHSRPQMEDEEEKKAGGKSSVSTKRSRAAAIHNQSERKRRDKINQRMKTLQKLV 

PNSSKTDKASMLDEVIEYLKQLQAQVSMMSRMN^ 

GMGMGMPGLGLLDLNSMNRAAASAPNIHANMMPNPFLPMNCPSWDASSNDSRFQSPLIPD 

PMSAFLACSTQPTTMEAYSRMATLYQQMQQQLPPPSNPK* 

>G791 (173.. 877) 

TTTTCTTTGGGTGTTCCTTCCACCAACGGCAGAAATCGATTCGGCTTAAATCTCCCCCTC 
CTTTCGATCTCTCTGATCGCCGCCGGGAACATTCAATTTCCCGGGAGTTCAACAAAAAAA 
AAACTCTCCGTTTTTATTTTTCCCCCTTTTTCACCGGTGGAAGTTTCCGGAGATGGTGTC 
ACCCGAAAACGCTAATTGGATTTGTGACTTGATCGATGCTGATTACGGAAGTTTCACAAT 
CCAAGGTCCTGGTTTCTCTTGGCCTGTTCAGCAACCTATTGGTGTTTCTTCTAACTCCAG 
TGCTGGAGTTGATGGCTCGGCTGGAAACTCAGAAGCTAGCAAAGAACCTGGATCCAAAAA 
GAGGGGGAGATGTGAATCATCCTCTGCCACTAGCTCGAAAGCATGTAGAGAGAAGCAGCG 
ACGGGACAGGTTGAATGACAAGTTTATGGAATTGGGTGCAATTTTGGAGCCTGGAAATCC 
TCCCAAAACAGACAAGGCTGCTATCTTGGTTGATGCTGTCCGCATGGTGACACAGCTACG 
GGGCGAGGCCCAGAAGCTGAAGGACTCCAATTCAAGTCTTCAGGACAAAATCAAAGAGTT 
AAAGACTGAGAAAAACGAGCTGCGAGATGAGAAACAGAGGCTGAAGACAGAGAAAGAAAA 
GCTGGAGCAGCAGCTGAAAGCCATGAATGCTCCTCAACCAAGTTTTTTCCCAGCCCCACC 
TATGATGCCTACTGCTTTTGCTTCAGCGCAAGGCCAAGCTCCTGGAAACAAGATGGTGCC 
AATCATCAGTTACCCAGGAGTTGCCATGTGGCAGTTCATGCCTCCTGCTTCAGTCGATAC 
TTCTCAGGATCATGTCCTTCGTCCTCCTGTTGCTTAATCAAGAAAAATCATCAACCGGTT 
TGCTTCTTGCTTCCGCTTAAAAGAAAAGTCTCCATTTGTTTTGCTCTCCTCTCTTTCTCG 
GCTTTCTTAGTCTTATCCTTTTGCTTTGTCGTGTTATCATCGTAACTGTTATCTGTTGAA 
CAATGATATGACATTGTAAACTCCAATTGCTTCGCGCAATGTTATCTATTCACATGTAAA 
TTTAAGTAGAGTTTGGCAAAAAAAAAA 

>G791 Amino Acid Sequence (domain in AA coordinates: 75-143) 
MVSPENANWICDLIDADYGSFTIQGPGFSWPVQQPIGVSSNSSAGVDGSAGNSEASKEPG
SKKRGRCESSSATSSKACREKQRRDRLNDKFMELGAILEPGNPPKTDKAAILVDAVRMVT
QLRGEAQKLKDSNSSLQDKIKELKTEKNELRDEKQRLKTEKEKLEQQLKAMNAPQPSFFP 
APPMMPTAFASAQGQAPGNKMVPIISYPGVAMWQFMPPASVDTSQDHVLRPPVA* 
>G861 (158.. 880) 

CTTCTTCCTCCTCCTCCATCTCTTCTCTTTACTCTCTCTTTAATCATCTCTCATTCTTGA 

ATCTTGATCGATCAAAATCAATCCCGTTCTCGAAAGATCC^^ 1 

TCTCTCTCITGCTTCTAGGGTTTTTTTGTTCGTTGTGATGGCGAGAGAAAAGATTCAGAT 

CAGGAAGATCGACAACGCAACGGCGAGACAAGTGACGTTTTCGAAACGAAGAAGAGGGCT

TTTCAAGAAAGCTGAAGAACTCTCCGTTCTCTGCGACGCCGATGTCGCTCTCATCATCTT 

CTCTTCCACCGGAAAACTGTTCGAGTTCTGTAGCTCCAGCATGAAGGAAGTCCTAGAGAG 

GGATAACTTGCAGTCAAAGAACTTGGAGAAGCTTGATC^ 

GGTTGAGAACAGTGATCACGCCCGAATGAGTAAAGAAATTGCGGACAAGAGCCACCGACT 
AAGGCAAATGAGAGGAGAGGAACTTCAAGGACTTGACATTGAAGAGCTTCAGCAGCTAGA 
GAAGGCCCTTGAAACTGGTTTGACGCGTGTGATTGAAACAAAGAGTGACAAGATTATGAG 
TGAGATCAGCGAACTTCAGAAAAAGGGAATGCAATTGATGGATGAGAACAAGCGGTTGAG
GGAGCAAGGAACGCAACTAACGGAAGAGAACGAGCGACTTGGCATGCAAATATGTAACAA 
TGTGCATGCACACGGTGGTGCTGAATCGGAGAACGCTGCTGTGTACGAGGAAGGACAGTC 
GTCGGAGTCTATTACTAACGCCGGAAACTCTACCGGAGCGCCTGTTGACTCCGAGAGCTC 
CGACACTTCCCTTAGGCTCGGCTTACCGTATGGTGGTTAGAGATGGAACAATTCAAAGAA 
GTTGATGGAGTGAGGAGAGTAATGTAAATCTTTTTAACTCGGTAGTAACAAGAGACAATG

TCTAAGTAGTGAATTCTCAAATGTTTGTGTAAGTTTCTGCCTATGGAAGAGGCTTTCATT 

TTTATGATTTTCACTATGTATGATCTCTCTTCACTGCATTTCTGGTTAGT/y\CGGCTTGT 

CACCGAT7U\ACTTTCTCGTTATGGAAAGTTAGAATAAAAAAAAAAAA7UU^AAAA 

>G861 Amino Acid Sequence (domain in AA coordinates: 2-57) 

MAREKIQIRKIDNATARQVTFSKRRRGLFKKAEELSVLCDADVALIIFSSTGKLFEFCSS

SMKEVLERHNLQSKNLEKLDQPSLELQLVENSDHARMSKEIADKSHRLRQMRGEELQGLD

IEELQQLEKALETGLTRVIETKSDKIMSEISELQKKGMQLMDENKRLRQQGTQLTEENER 

LGMQICNNVHAHGGAESENAAVYEEGQSSESITNAGNSTGAPVDSESSDTSLRLGLPYGG 

* 

>G938 (1..1755) 

ATGATGATGTTTAACGAGATGGGAATGTATGGAAACATGGATTTCTTCTCTTCCTCCACA 
TCTCTCGATGTGTGTCCATTACCACAAGCTGAACAAGAACCTGTAGTTGAAGATGTCGAC 
TACACCGATGATGAGATGGATGTGGATGAGCTTGAGAAGAGGATGTGGAGAGACAAAATG 
CGTTTGAAACGTCTCAAGGAGCAACAGAGTAAGTGTAAAGAAGGCGTCGATGGTTCGAAA 
CAGAGGCAGTCGCAAGAGCAAGCTAGGAGGAAGAAAATGTCTAGAGCCCAAGATGGGATC 
TTGAAGTATATGTTGAAGATGATGGAAGTTTGTAAAGCTCAAGGCTTTGTTTATGGTATT
ATTCCTGAGAAGGGTAAGCCTGTGACTGGTGCTTCGGATAATTTGAGGGAATGGTGGAAA
GATAAGGTTAGGTTTGATCGTAATGGTCCAGCTGCTATTGCTAAGTATCAGTCAGAGAAT
AATATTTCTGGAGGGAGTAATGATTGTAACAGCTTGGTTGGTCCAACACCGCATACGCTT 
CAGGAGCTTCAGGACACGACTCTTGGTTCGCTTTTATCGGCTTTGATGCAACATTGTGAT 
CCACCGCAGAGACGGTTTCCTTTGGAGAAAGGAGTTTCTCCACCTTGGTGGCCTAATGGG 
AATGAAGAGTGGTGGCCTCAGCTTGGTTTACCAAATGAGCAAGGTCCTCCTCCTTATAAG
AAGCCTCATGATTTGAAGAAAGCTTGGAAAGTCGGTGTTTTAACTGCGGTGATCAAGCAT 
ATGTCGCCGGATATTGCGAAGATCCGTAAGCTTGTGAGGCAATCAAAATGCTTGCAGGAT 
AAGATGACGGCGAAAGAGAGTGCTACTTGGCTTGCCATTATTAACCAAGAAGAGGTTGTG 
GCTCGGGAGCTTTATCCCGAGTCATGCCCTCCTCTTTCTTCTTCTTCATCATTAGGAAGC 
GGGTCGCTTCTCATTAATGATTGTAGCGAGTATGACGTTGMGGTTTCGAGAAGGAACAA 
CATGGTTTCGATGTGGAAGAGCGGAAACCAGAGATAGTGATGATGCATCCTCTAGCAAGC 
TTTGGGGTTGCTAAAATGCAACATTTTCCCATAAAGGAGGAGGTCGCCACCACGGTAAAC 
TTAGAGTTCACGAGAAAGAGGAAGCAGAACAATGATATGAATGTTATGGTAATGGACAGA 
TCAGCAGGTTACACTTGTGAGAATGGTCAGTGTCCTCACAGCAAAATGAATCTTGGATTT 
CAAGACAGGAGTTCAAGGGACAACCACCAGATGGTTTGTCCATATAGAGACAATCGTTTA 
GCGTATGGAGCATCCAAGTTTCATATGGGTGGAATGAAACTAGTAGTTCCTCAGCAACCA 
GTCCAACCGATCGACCTATCGGGCGTTGGAGTTCCGGAAAACGGGCAGAAGATGATCACC 
GAGCTTATGGCCATGTACGACAGAAATGTCCAAAGCAACCAAACGCCTCCTACTTTGATG 
GAAAACGAAAGCATGGTCATTGATGCAAAAGCAGCTCAGAATCAGCAGCTGAATTTCAAC 
AGTGGCAATCAAATGTTTATGCAACAAGGGACGAACAACGGGGTTAACAATCGGTTCCAG 
ATGGTGTTTGATTCGACACCATTCGATATGGCAGCATTCGATTACAGAGATGATTGGCAA 
ACCGGAGCAATGGAAGGAATGGGGAAGCAGCAGCAGCAGCAGCAGCAGCAGCAAGATGTA 
TCAATATGGTTCTGA 

>G938 Amino Acid Sequence (domain in AA coordinates: 96-104) 
MMMFNEMGMYGNMDFFSSSTSLDVCPLP 

RLKRLKEQQSKCKEGVDGSKQRQSQEQARRKKMSRAQDGILKYMLKMMEVCKAQGFVYGI 

IPEKGKPVTGASDNLREWWKDKVRFDRNGPAAIAKYQSENNISGGSNDCNSLVGPTPHTL 

QELQDTTLGSLLSALMQHCDPPQRRFPLEKGVSPPWWPNGNEEWWPQLGLPNEQGPPPYK 

KPHDLKKAWKVGVLTAVIKHMSPDIAKIRKLVRQSKCLQDKMTAKESATWL^^ 

ARELYPESCPPLSSSSSLGSGSLLINDCSEYDVEGFEKEQHGFDVEERKPEIVMMHPLAS 

FGVAKMQHFPIKEEVATTVNLEFTRKRKQNNDMNVM^ 

QDRSSRDNHQMVCPYRDNRLAYGASKFHMGGMKLVVPQQPVQPIDLSGVGVPENGQKMIT
ELMAMYDRNVQSNQTPPTLMENQSIW^ 

MVFDSTPFDMAAFDYRDDWQTGAMEGMGKQQQQQQQQQDVSIWF* 
>G965 (73.. 1956) 

GATTCTCTGTGTATGTCTGAATCCTTACAGGATCCAAGAGCTTTGGAAAAAAGATATAAT 
GAATAACAAGATATGGGTTTAGCTACTACAACTTCTTCTATGTCACAAGATTATCATCAT 
CACCAAGGAATCTTTTCCTTCTCTAATGGATTCCACCGATCATCATCAACCACTCATCAG 
GAGGAAGTAGATGAATCCGCCGTCGTCTCCGGTGCTCAAATTCCGGTTTATGAAACCGCC 



GGAATGTTGTCTGAAATGTTTGCTTACCCTGGCGGAGGTGGCGGCGGTTCCGGTGGAGAG
ATTCTTGATCAGTCTACTAAACAGTTGCTAGAGCAACAAAACCGTCACAACAACAACAAT 
AACTCAACTCTTCATATGTTATTACCAAATCATCATCAAGGTTTTGCTTTCACCGACGAA 
AACACTATGCAGCCGCAGCAACAACAACACTTTACATGGCCATCTTCCTCCTCCGATCAT 
CATCAAAACCGAGATATGATCGGAACCGTCCACGTGGAAGGAGGAAAGGGTTTGTCTTTA 
TCTCTCTCATCTTCATTAGCCGCAGCTAAAGCCGAGGAATATAGAAGCATTTATTGTGCA
GCCGTTGATGGAACTTCTTCTTCTTCTAACGCATCCGCTCATCATCATCAATTCAATCAG 
TTCAAGAATCTTCTTCTTGAGAATTCTTCTTCTCAACATCATCACCATCAAGTTGTTGGA 
CATTTTGGTTCATCATCATCATCTCCCATGGCGGCTTCTTCATCCATTGGAGGGATCTAC 
ACGTTGAGGAATTCGAAATATACGAAACCGGCTC7UVGAGTTGTTGGAAGAGTTTTGTAGT 
GTTGGAAGAGGACATTTCAAGAAGAACAAACTTAGTAGGAACAACTCAAACCCTAATACT
ACCGGTGGAGGAGGAGGCGGAGGGTCCTCGTCATCGGCCGGAACAGCTAATGATAGTCCT 
CCTTTGTCTCCGGCTGATCGGATTGAACATCAAAGAAGAAAAGTCAAGCTACTATCTATG 
CTTGAAGAGGTGGACCGACGGTACAACCACTACTGCGAACAAATGCAAATGGTAGTGAAC 
TCATTCGACCAAGTAATGGGTTACGGCGCGGCGGTTCCGTACACGACATTAGCTCAAAAG 
GCAATGTCTAGGCATTTCCGGTGTTTGAAAGACGCGGTAGCGGTTCAGCTTAAACGCAGC 
TGTGAGCTTCTAGGGGATAAAGAGGCGGCAGGGGCTGCATCCTCGGGGTTAACCAAAGGG 
GAAACGCCGCGATTGCGTTTGCTAGAGCAGAGTTTGCGTCAGCAACGAGCGTTTCATCAT
ATGGGTATGATGGAGCAAGAGGCATGGAGACCGCAACGTGGTTTGCCTGAACGCTCCGTT 
AATATCCTTAGAGCTTGGCTATTCGAGCATTTTCTTAATCCGTACCCAAGCGATGCTGAT 
AAGCACCTCTTAGCACGACAGACTGGTTTATCCAGAAATCAGGTGTCAAATTGGTTCATA 
AATGCTAGGGTTCGCCTATGGAAACCAATGGTGGAAGAGATGTATCAACAAGAAGCAAAA 
GAAAGAGAAGAAGCAGAAGAAGAAAATGAAAATCAACAACAACAAAGAAGACAGCAACAA 
ACAAACAACAACGACACGAAACCCAACAACAATGAAAACAACTTCACTGTCATAACCGCA 
CAAACTCCAACGACGATGACATCGACACATCACGAAAACGACTCTTCATTCCTCTCTTCC 
GTCGCCGCCGCTTCTCACGGCGGTTCAGACGCGTTCACCGTCGCCACGTGTCAGCAAGAC 
GTCAGTGACTTCCACGTCGACGGAGATGGTGTGAACGTCATAAGATTCGGGACCAAACAG 
ACTGGTGACGTGTCTCTTACGCTTGGTCTACGCCACTCTGGCAATATTCCTGATAAGAAC 
ACTTCTTTCTCGGTTAGAGACTTTGGAGATTTTTAGTCTTCTTTGTTTCTCAATTTATTC 
ATC 

>G965 Amino Acid Sequence (domain in AA coordinates: 423-486) 

MGLATTTSSMSQDYHHHQGIFSFSNGFHRSSSTTHQEEVDESAVVSGAQIPVYETAGMLS 

EMFAYPGGGGGGSGGEILDQSTKQLLEQQNRKMWrKW^ 

PQQQQHFTWPSSSSDHHQNRDMIGTVHVEGGKGLSLSLSSSLAAAKAEEYRSIYCAAVDG 

TSSSSNASAHHHQFNQFKNLLLENSSSQHHHHQVVGHFGSSSSSPMAASSSIGGIYTLRN

SKYTKPAQELLEEFCSVGRGHFKKNKLSRNNSNPNTTGGGGGGGSSSSAGTANDSPPLSP 

ADRIEHQRRKVKLLSMLEEVDRRYNHYCEQMQMVWSFDQVMGYGAAVPYTTI^ 

HFRCLKDAVAVQLKRSCELLGDKEAAGAASSGLTKGETPRLRLLEQSLRQQRAFHHMGMM

EQEAWRPQRGLPERSVNILRAWLFEHFLNPYPSDADKHLLARQTGLSRNQVSNWFINARV

RLWKPMVEEMYQQEAKEREEAEEENENQQQQRRQQQTNNNDT 

TMTSTHHENDSSFLSSVAAASHGGSDAFTVATCQQDVSDFHVDGDGVNVIRFGTKQTGDV

SLTLGLRHSGNIPDKNTSFSVRDFGDF* 

>G1143 (54.. 677) 

AAATAAGAATATAAACACTTTTGTCTGAAAAATTATCAAAGAAGAAGAAATAAATGGGTG 

GAGGAAGCAGATTTCAAGAACCAGTGAGGATGAGCCGTAGGAAACAAGTAACAAAAGAGA 

AGGAAGAAGATGAAAACTTCAAATCTCCAAATCTTGAAGCAGAGAGACGTAGAAGAGAGA 

AGCTTCATTGTCGGCTTATGGCTCTGCGATCTCATGTCCCCATTGTCACCAACATGACTA

AAGCAAGTATTGTTGAAGATGCGATTACTTACATAGGAGAGCTTCAAAACAATGTTAAGA

ATCTCTTAGAGACATTTCATGAAATGGAAGAAGCTCCTCCTGAGATTGATGAAGAACAAA 

CGGATCCAATGATAAAACCTGAAGTTGAAACTAGTGATCTTAACGAAGAGATGAAGAAAC 

TCGGAATCGAGGAGAATGTGCAATTGTGTAAGATTGGGGAGAGGAAGTTTTGGTTAAAGA 

TCATAACAGAGAAGAGAGATGGGATCTTTACTAAATTCATGGAGGTTATGAGATTTCTCG 

GATTCGAGATTATCGATATTAGTCTAAC^ 

CTGTTCAGACACAGGAACTCTGTGATGTTGAACAGA^ 

TGAGAAGCAATCCATAAGTATTAATTATATACATCTTGGAAATTTCTTGATCTAATAACA 

TTTCCATTGGTTTTTATTACATTGTTGT^ 

AAGAGTTTGTGTTACAAGCCAATGA 
>G1143 Amino Acid Sequence (domain in AA coordinates: 33-82)
MGGGSRFQEPVRMSRRKQVTKEKEEDENFKSPNLEAERRRREKLHCRLMALRSHVPIVTN 
MTKASIVEDAITYIGELQNNVKNLLETFHEMEEAPPEIDEEQTDPMIKPEVETSDLNEEM 
KKLGIEENVQLCKIGERKFWLKIITEKRDGIFTKFMEVMRFLGFEIIDISLTTSNGAILI
SASVQTQELCDVEQTKDFLLEVMRSNP*
>G1190 (209.. 2020) 

TCCTGTCCCAAAACCAAAAGACTTGAGAGTGTGTCTTTAGAGAGAGATCTTCTCTCTTTT 

ATCTTACGACTCTCACTTCTTATCTCAAATCTACTTCAACTCTATTTCCAGTCTCCACAT 

TTTCCCACAAATTTCAACTCTTGTTCTCTTCCTCCAAAGTAAAAAACAAATCGTTGCAAG 

TGAGGTTTGGTTTTGGTGTTATAGAATTATGAAGAGCGGGAAGCAATCTTCGCAACCTGA 

AAAGGGTACTTCCAGGATCTTGTCACTGACTGTCCTGTTTATCGCATTTTGCGGTTTCTC 

CTTCTACCTCGGTGGTATATTTTGCTCTGAGAGAGACAAGATTGTAGCCAAGGATGTCAC 

AAGGACGACTACAAAGGCTGTAGCTTCCCCTAAAGAACCTACAGCTACTCCTATTCAAAT 

CAAATCCGTTTCTTTCCCGGAGTGCGGGTCAGAGTTCCAAGATTACACCCCGTGCACCGA 

TCCAAAGAGGTGGAAGAAGTATGGTGTCCATCGCTTAAGTTTCTTGGAGCGTCATTGTCC 

TCCGGTATATGAAAAGAATGAGTGTTTGATTCCACCACCAGACGGGTATAAACCGCCTAT 

AAGATGGCCCAAGAGCCGAGAACAGTGTTGGTACAGGAACGTGCCTTATGATTGGATCAA 

TAAGCAAAAGTCTAACCAGCATTGGCTTAAGAAAGAAGGAGATAAGTTCCATTTCCCTGG 

TGGTGGTACCATGTTCCCTCGTGGAGTTAGTCACTATGTTGATTTGATGCAAGATCTGAT 

TCCTGAAATGAAAGACGGAACAGTCAGGACCGCCATTGATACTGGCTGTGGGGTTGCGAG 

CTGGGGAGGCGATCTTTTGGACCGTGGGATACTATCACTCTCTCTTGCTCCAAGAGATAA 

CCATGAAGCTCAGGTTCAATTTGCTCTTGAACGTGGAATTCCTGCGATTCTCGGGATCAT 

CTCTACGCAACGTCTCCCTTTTCCTTCAAATGCATTTGATATGGCTCATTGTTCAAGATG 

TCTTATTCCCTGGACAGAATTTGGTGGAATCTATTTACTTGAGATTCACCGTATAGTTCG 

ACCTGGAGGTTTTTGGGTTCTTTCTGGTCCACCTGTGAACTATAATAGACGATGGCGTGG 

ATGGAACACAACCATGGAAGATCAGAAATCTGACTACAACAAGCTTCAGTCACTTCTAAC 

CTCCATGTGTTTCAAAAAGTACGCTCAAAAAGATGACATAGCCGTGTGGCAGAAACTCTC 

AGACAAATCTTGCTATGACAAAATCGCTAAGAACATGGAAGCTTACCCTCCCAAATGTGA 

CGACAGTATAGAACCTGATTCTGCTTGGTACACTCCACTCCGTCCTTGCGTGGTTGCCCC 

GACACCTAAAGTCAAGAAGTCTGGTCTCGGATCAATCCCAAAATGGCCCGAGAGGTTACA 

TGTCGCGCCCGAGAGAATCGGTGATGTTCACGGAGGGAGTGCGAACAGTTTGAAACACGA 

TGATGGTAAATGGAAGAACAGAGTTAAGCATTACAAGAAAGTTTTACCAGCTCTTGGGAC 

AGACAAGATAAGAAATGTTATGGATATGAACACTGTTTATGGAGGTTTCTCTGCGGCCCT 

CATTGAGGATCCCATTTGGGTCATGAACGTTGTATCATCGTACAGCGCAAATTCGCTTCC 

TGTTGTCTTTGATCGCGGTCTCATCGGGACTTACCACGACTGGTGCGAAGCTTTCTCAAC 

GTATCCAAGAACATATGATCTTCTTCACCTCGACAGTCTTTTTACCTTGGAGAGTCACAG 

GTGTGAGATGAAGTACATTTTGCTAGAGATGGACAGGATCTTGCGGCCGAGTGGATATGT 

TATAATCCGAGAATCGAGTTATTTCATGGACGCAATCACAACGTTAGCGAAAGGGATAAG 

GTGGAGTTGCCGGAGAGAGGAGACTGAGTATGCAGTCAAAAGTGAGAAGATTCTGGTTTG 

CCAGAAAAAGCTATGGTTTTCGTCAAACCAAACCTCTTGATGAGACCACCTGTATCATAG 

TGTTTATCATCTCCTGTGATGCACACTACAGAGAGAAGGATCTAGTCCTTTGAGTCCAAG 

ATATAGCTCTATAAACAATCTCCTTTTTTTGTTCTCTTTAATTTCTTGGGTATTTCACGG 

TATAGATTGATATTATATATTTTTTAATTATATTTTTAATATATAGATATATTAGTATGT 

GGTTTAAACACTATTATTATCAAGGTCTTAAAGATTTGCTTTGCAAGAGTTAAAAAATGT 

TGGAGTAAGGACCTCTTGATTAATAAATTGACTGACGCAGCAAA 

>G1190 Amino Acid Sequence (domain in AA coordinates: entire protein)

MKSGKQSSQPEKGTSRILSLTVLFIAFCGFSPYLGGIFCSERDKIVAKDVTRTTTKAVAS 

PKEPTATPIQIKSVSFPECGSEFQDYTPCTDPKRWKKYGVHRLSFLERHCPPVYEKNECL 

IPPPDGYKPPIRWPKSREQCWYRNVPYDWINKQKSNQHWLKKEGDKFHFPGGGTMFPRGV 

SHYVDLMQDLIPEMKDGTVRTAIDTGCGVASWGGDLLDRGILSLSLAPRDNHEAQVQFAL 

ERGIPAILGIISTQRLPFPSNAFDMAHCSRCLIPWTEFGGIYLLEIHRIVRPGGFWVLSG

PPVNYNRRWRGWNTTMEDQKSDYNKLQSLLTC 

KNMEAYPPKCDDSIEPDSAWYTPLRPCVVAPTPKVKKSGLGSIPKWPERLHVAPERIGDV
HGGSANSLKHDDGKWKNRVKHYKKVLPALGTDKIRITVMDM^ 

VVSSYSANSLPVVFDRGLIGTYHDWCEAFSTYPRTYDLLHLDSLFTLESHRCEMKYILLE
MDRILRPSGYVIIRESSYFMDAITTLAKGIRWSCRREETEYAVKSEKILVCQKKLWFSSN 
QTS* 
>G1198 (230..1675)

TCTTTTCAAATTCCAATCATTTGATCAACTAATCAAGAATTAATTATAAGACTTTGCAAT 

CTCTCTCCCTCTCCCTCTCCCTAGCTAGTTCTCTCTTGTGTTTCTTAACTCGAGCTTCTC 

TCAATAGTGATTATCATCTTTTTCATCATTTCAAGATTTAATGTGTTTTGCAGAAAAGAG 

ACTAATCAAGAAGAGATATCATCAATTGAAGCTGTTTTCTTGAGTAGAGATGGCGAACCA 

TAGAATGAGCGAAGCTACAAACCATAACCACAATCATCATCTTCCTTATTCACTTATTCA 

TGGTCTCAACAACAATCATCCATCTTCTGGTTTCATTAACCAAGATGGATCGTCCAGTTT 

CGATTTTGGAGAGCTAGAAGAAGCAATTGTTCTGCAAGGTGTCAAGTATAGGAACGAGGA 

AGCCAAGCCACCTTTATTAGGAGGAGGAGGAGGAGCTACGACTCTGGAGATGTTCCCTTC 

GTGGCCAATCAGAACTCACCAAACTCTTCCTACTGAGAGTTCCAAGTCAGGAGGAGAGAG 

CAGCGATTCAGGATCGGCTAATTTCTCCGGCAAAGCTGAAAGTCAACAACCGGAGTCTCC 

TATGAGTAGCAAACATCATCTCATGCTTCAACCTCATCATAATAACATGGCAAACTCAAG 

TTCAACATCTGGACTTCCTTCCACTTCTCGAACTTTAGCTCCTCCTAAACCTTCGGAAGA 

TAAGAGGAAGGCTACAACTTCAGGCAAACAGCTTGATGCTAAGACGTTGAGACGTTTGGC 

CCAAAATAGAGAAGCTGCTCGCAAAAGCCGTCTTAGGAAAAAGGCGTATGTGCAACAGCT

AGAATCAAGTAGGATAAAGCTTTCCCAATTGGAGCAAGAACTTCAGCGAGCTCGTTCTCA 

GGGGCTGTTCATGGGTGGTTGTGGACCACCAGGACCTAACATCACTTCCGGAGCTGCAAT 

ATTTGACATGGAATATGGGAGATGGCTAGAGGATGATAACCGGCATATGTCGGAGATTCG 

AACCGGTCTTCAGGCTCATTTATCTGACAATGATTTAAGGTTGATCGTTGACGGTTACAT

TGCTCATTTTGATGAGATATTCCGATTAAAAGCCGTGGCAGCGAAAGCCGATGTTTTTCA 

CCTCATCATTGGGACATGGATGTCCCCAGCCGAACGTTGTTTTATTTGGATGGCTGGTTT 

CCGTCCATCCGACCTAATCAAGATATTGGTGTCGCAAATGGATCTATTGACGGAGCAACA 

ACTGATGGGAATATATAGCCTACAACACTCGTCGCAACAAGCAGAGGAGGCTCTCTCGCA 

AGGCCTCGAACAACTTCAGCAATCTCTCATCGATACTCTCGCCGCATCTCCAGTCATTGA 

CGGAATGCAACAAATGGCTGTCGCTCTCGGAAAGATCTCTAATCTCGAAGGCTTTATCCG 

CCAGGCTGATAACTTGAGGCAGCAGACCGTTCACCAGCTGAGGCGGATCTTGACCGTCCG 

ACAAGCTGCACGGTGTTTCCTAGTCATCGGAGAGTACTATGGACGGCTCAGAGCTCTTAG 

CTCCCTTTGGTTGTCACGCCCACGAGAGACACTGATGAGTGATGAAACCTCTTGTCAAAC 

GACGACGGATTTGCAGATTGTTCAGTCATCTCGGAACCACTTCTCCAATTTCTGAATGGA 

ATGAAACTTTGTATAACTAAAAGGCCAAGTTTCATTGTCTGTCGTAATTTCACCTATTTC 

CTTTAAAGTTGTACTAGAGAAAAGATAGGATCTTCCTTCG 

>G1198 Amino Acid Sequence (domain in AA coordinates: 173-223)
MANHRMSEAraHttHNHHLPYSLM 

RNEEAKPPLLGGGGGATTLEMFPSWPIRTHQTLPTESSKSGGESSDSGSANFSGKAESQQ 
PESPMSSKHHLMLQPHHNNMANSSSTS^ 

RRLAQNREAARKSRLRKKAYVQQLESSRIKLSQLEQELQRARSQGLFMGGCGPPGPNITS

GAAIFDMEYGRWLEDDNRHMSEIRTGLQAHLSDNDLRLIVDGYIAHFDEIFRLKAVAAKA

DVFHLIIGTWMSPAERCFIWMAGFRPSDLIKILVSQMDLLTEQQLMGIYSLQHSSQQAEE 

ALSQGLEQLQQSLIDTLAASPVIDGMQQMAVALGKISNLEGFIRQADNLRQQTVHQLRRI

LTVRQAARCFLVIGEYYGRLRALSSLWLSRPRETLMSDETSCQTTTDLQIVQSSRNHFSN
F*

>G1226 (212.. 1159) 

CTG(^TTTATTAAGAACAGTTTAGARAGTGTCAACCCOTAAAGGAATGTTTTTAGTTTAG 
AGGAAAGAGAGAGAAGAAGAAGCAGCAGCAGAAGTTGTTAATTTGAAGACTATTTGAGGA 
AAGACACCTATATCTAAATACTCAAAGTTACAAAAATATTACTTCAGAAAACAGTTCCAT 
TAGAGAGACTCATAAAGCTTCTCATCTAATTATGAGTGGATTGATGAGTTTTGGTGAATT 
AGAAGACCAATTTCGTCAGATTTCAGACACTACTATGGARGAGAAGATACCATTTCTGCA 
AATGCTTCAATGCATAGAACACCCTTTTACAACAACAGAACCAAATCAGTTTCTCCAATC 
ACTTCTCCAGATCCAAACCCTAGAATCAAAGAGCTGTCTCACCCTTGAAACAAACATCAA 
AAGAGATCCGGGTCAAACAGATGACCCGGAAAAGGATCCAAGAACAGAAAACGGAGCAGT 
AACGGTCAAAGAAAAAAGAAAACGGAAACGTACAAGAGCTCCAAAGAACAAAGACGAAGT 
TGAAAACCAAAGGATGACTCACATTGCCGTCGAACGTAATCGAAGACGACAAATGAACGA 
ACACTTAAACTCTCTCCGATCTCTCATGCCTCCTTCGTTTCTTCAACGGGGTGACCAAGC 
TTCGATTGTAGGAGGGGCAATAGATTTCATCAAGGAACTAGAGCAACTCTTGCAATCTCT 
AGAAGCTGAGAAACGAAAGGATGGAACTGATGAAACTCCTAAAACGGCGTCGTGTTCTTC 
ATCTTCGTCTCTTGCATGCACTAACTCTTCTATTTCTAGCGTGTCTACGACGTCGGAAAA 
TGGATTTACGGCGAGATTCGGCGGTGGAGATACGACAGAAGTGGAGGCTACGGTGATACA 
GAACCATGTGAGCTTAAAAGTTCGGTGTAAGAGAGGAAAACGACAGATCTTAAAAGCTAT
TGTCTCGATTGAAGAACTAAAGCTTGCGATTCTACATCTCACTATCTCTTCTTCCTTTGA 
CTTTGTCATCTACTCTTTCAATCTCAAGATGGAAGATGGTTGTAAATTAGGATCAGCAGA 
TGAGATAGCGACAGCCGTTCATCAGATCTTCGAGCAAATCAACGGTGAAGTCATGTGGTC 
AAATCTTAGTCGAACTTAGTTGACTTTTGACTCCTAGTAACGTGTGTAAACTTTAGGTTA 
CAAAGAAAAGGGACGTGATATAAATAAGAAAAACCAAAGAGGTGAAATTTTGGGAGTTTT 
AATTATTATCTTATACTTTTTGGATTTTAGATTAGTAGCAAACTCGCAGTGTTCTACGAT 
GACATTATTATTGGTCACATGAAGGTTTAGGTTAAAAAAAAA 

>G1226 Amino Acid Sequence (domain in AA coordinates: 115-174)

MSGLMSFGELEDQFGQISDTTMEEKIPFLQMLQCIEHPFTTTEPNQFLQSLLQIQTLESK 

SCLTLETNIKRDPGQTDDPEKDPRTENGAVTVKEKRKRKRTRAPKNKDEVENQRMTHIAV

ERNRRRQMNEHLNSLRSLMPPSFLQRGDQASIVGGAIDFIKELEQLLQSLEAEKRKDGTD

ETPKTASCSSSSSLACTNSSISSVSTTSENGFTARFGGGDTTEVEATVIQNHVSLKVRCK 

RGKRQILKAIVSIEELKLAILHLTISSSFDFVIYSFNLKMEDGCKLGSADEIATAVHQIF 

EQINGEVMWSNLSRT* 

>G1451 (124.. 2559) 

TTTGTACTTCCGGAGCTAAAGAGTTATAGCTACTGTAGTAGCTGGAAGTGAAGAAGATTT

TTTAATAGATTGTACGGAAAAATTAGGGTTTTCAAAGTTTGGTTTCTTGAAGTTGAATTA 

GACATGAAGCTGTCAACATCTGGATTGGGTCAACAGGGTCATGAAGGAGAGAAGTGTCTG 

AATTCTGAGCTATGGCATGCTTGTGCTGGACCATTAGTCTCTCTTCCATCATCTGGTAGT 

CGAGTTGTTTACTTTCCACAGGGTCACAGTGAACAGGTAGCTGCTACAACTAATAAGGAA 

GTTGATGGTCACATACCCAATTACCCAAGCCTACCACCACAATTGATATGCCAGCTCCAT 

AATGTTACAATGCATGCAGATGTTGAGACGGATGAAGTCTATGCTCAAATGACACTTCAA 

CCATTGACACCGGAGGAGCAGAAGGAAACATTTGTACCGATTGAGTTGGGGATACCGAGT 

AAGCAACCTAGTAATTATTTTTGTAAGACTCTCACAGCTAGTGATACCAGTACACATGGA 

GGGTTTTCTGTTCCTAGACGTGCTGCTGAGAAAGTGTTTCCTCCATTGGATTACACACTG 

CAGCCACCAGCTCAAGAACTGATTGCAAGGGATCTCCATGATGTTGAATGGAAGTTTAGG 

CATATCTTTCGGGGACAGCCCAAACGGCATCTCCTAACTACTGGATGGAGTGTCTTTGTC 

AGTGCCAAGCGACTAGTAGCTGGAGATTCTGTCATTTTCATCAGGAATGAAAAGAATCAA 

CTCTTTTTGGGAATTCGTCATGCCACTCGGCCGCAGACTATTGTACCATCATCTGTTTTA 

TCTAGTGATAGCATGCATATTGGACTCCTTGCTGCTGCTGCACATGCTTCTGCAACTAAT 

AGCTGTTTCACTGTTTTCTTTCATCCAAGGGCTAGCCAATCTGAGTTTGTGATACAACTT 

TCCAAGTACATTAAAGCCGTTTTTCACACGCGTATTTCAGTTGGGATGCGCTTTCGCATG 

CTCTTCGAGACAGAAGAGTCGAGTGTCCGCAGGTACATGGGTACTATAACTGGTATTAGT 

GATCTAGATTCTGTTCGTTGGCCAAACTCTCATTGGCGATCTGTGAAGGTTGGTTGGGAT 

GAATCGACTGCAGGGGAGAGACAGCCAAGGGTTTCTTTATGGGAGATTGAGCCTCTGACT 

ACCTTTCCTATGTATCCATCTCTTTTTCCTCTCAGACTAAAACGTCCATGGCATGCTGGC 

ACATCATCTTTGCCTGATGGAAGGGGTGATTTGGGAAGTGGTCTAACATGGCTAAGAGGG 

GGAGGTGGAGAGCAGCAAGGTTTGCTTCCTCTAAATTATCCATCTGTTGGTTTGTTTCCA 

TGGATGCAACAAAGGCTGGATCTCAGTCAAATGGGGACTGATAATAATCAGCAATACCAA 

GCAATGTTAGCTGCTGGGTTGCAGAACATCGGCGGTGGAGATCCTTTAAGACAGCAGTTT 

GTACAGCTGCAAGAGCCTCACCACCAATATCTT 

TTGATGCTTCAGCAGCAACAGCAGCAACAAGCGTCACGCCATCTCATGCATGCTCAAACA 
CAGATTATGAGTGAGAATCTTCCGCAGCAGAATATGCGACAAGAAGTTAGTAACCAACCA 
GCTGGACAGCAGCAACAGCTACAGCAACCGGACCAAAATGCATATCTTAATGCTTTCAAA
ATGCAAAATGGCCATCTTCAACAGTGGCAGCAGCAATCAGAGATGCCATCTCCCTCGTTC
ATGAAGTCAGATTTTACTGACTCAAGCAACAAATTTGCAACAACTGCTAGTCCGGCTTCT 
GGAGATGGCAATCTTTTGAATTTTTCTATAACCGGTCAGTCTGTACTCCCTGAGCAGTTA 
ACAACAGAGGGCTGGTCTCCAAAAGCATCCAACACTTTTTCTGAACCGTTGTCACTTCCA 
CAAGCCTATCCTGGGAAGAGTCTTGCTCTAGAACCCGGAAATCCGCAGAATCCCTCTCTT 
TTCGGTGTTGATCCCGACTCTGGACTCTTCCTCCCCAGTACGGTTCCCCGCTTTGCTTCT 
TCATC^GGAGATGCTGAAGCTTCCCCTATGTCACTAACAGATTCAGGATTTC^GAATTCC 
TTATATAGCTGC^TGCAAGACACAACTCATGAGrTATTGCATGGAGCTGGACAGATTAAC 
TCGTCCAACCAAACCAAGAACTTTGTAAAGGTTTATAAATCTGGTTCGGTTGGGCGTTCA 
TTAGACATCTCCCGATTCAGCAGCTACCACGAGCTGCGAGAAGAGTTAGGGAAGATGTTT 
GCTATCGAAGGGTTGTTGGAAGACCCCCTTAGATCAGGCTGGCAGCTTGTATTCGTTGAC 
AAGGAAAATGATATTCTTCTCCTTGGTGATGACCCATGGGAGTCATTTGTGAATAACGTT 
TGGTACATAAAGATACTATCACCAGAAGATGTGCATCAAATGGGAGATCATGGAGAAGGC

AGTGGTGGGTTATTCCCGCAAAACCCGACCCATCTCTAGAAGCTGCTTCGGTGTTAGTCT 

CATCATGCTACAACGCGGGAGCCCTTTGTTTCCCATTTGAAGTCGTTTCCACTCATCTTT 

ATATGCCATTCGTTCGCATCTCTCTCGTTTTGACGTTTTTAGAAAGAAACATAATCATAT 

TTGTGAGTATGGGTCCTGAAACTTTAGGACGTACTTTAGCTTGTATTAGACAGACACTCT 

CGTCATAAACATAAGAACCTTTATGTAGCTGTCTCAGGGTAACTAAACTTTTCTAG 

>G1451 Amino Acid Sequence (domain in AA coordinates: 22-357)

MKLSTSGLGQQGHEGEKCLNSELWHACAGPLVSLPSSGSRVVYFPQGHSEQVAATTNKEV
DGHIPNYPSLPPQLICQLHNVTMHADVETDEVYAQMTLQPLTPEEQKETFVPIELGIPSK 
QPSNYFCKTLTASDTSTHGGFSVPRRAAEKVFPPLDYTLQPPAQELIARDLHDVEWKFRH 
IFRGQPKRHLLTTGWSVFVSAKRLVAGDSVIFIRNEKNQLFLGIRHATRPQTIVPSSVLS 
SDSMHIGLLAAAAHASATNSCFTVFFHPRASQSEFVIQLSKYIKAVFHTRISVGMRFRML 
FETEESSVRRYMGTITGISDLDSVRWPNSHWRSVKVGWDESTAGERQPRVSLWEIEPLTT 
FPMYPSLFPLRLKRPWHAGTSSLPDGRGDLGSGLTWLRGGGGEQQGLLPLNYPSVGLFPW 
MQQRLDLSQMGTDNNQQYQAMLAAGLQNIGGGDPLRQQFVQLQEPHHQYLQQSASHNSDL 
MLQQQQQQQASRHLMHAQTQIMSENLPQQNMRQEVSNQPAGQQQQLQQPDQNAYLNAFKM 
QNGHLQQWQQQSEMPSPSFMKSDFTDSSNKFATTASPASGDGNLLNFSITGQSVLPEQLT 
TEGWSPKASNTFSEPLSLPQAYPGKSLALEPGNPQNPSLFGVDPDSGLFLPSTVPRFASS 
SGDAEASPMSLTDSGFQNSLYSCMQDTTHELLHGAGQINSSNQTKNFVKVYKSGSVGRSL

DISRFSSYHELREELGKMFAIEGLLEDPLRSGWQLVFVDKENDILLLGDDPWESFVNNVW 

YIKILSPEDVHQMGDHGEGSGGLFPQNPTHL* 

>G1478 (1..354) 

ATGTGTAGAGGGTTTGAGAAAGAAGAAGAGAGAAGAAGCGACAATGGAGGATGCCAAAGA 
CTATGCACGGAGAGTCACAAAGCTCCGGTAAGCTGTGAGCTTTGCGGCGAGAACGCCACC 

GTGTATTGTGAGGCAGACGCAGCTTTCCTTTGTAGGAAATGCGATCGATGGGTCCATTCT

GCTAATTTTCTAGCTCGGAGACATCTCCGGCGCGTGATCTGCACGACCTGTCGGAAGCTA 
ACTCGTCGATGTCTTGTCGGTGATAATTTTAATGTTGTTTTACCGGAGATAAGGATGATA 
GCAAGGATTGAAGAACATAGTAGTGATCACAAAATTCCCTTTGTGTTTCTCTGA 
>G1478 Amino Acid Sequence (domain in AA coordinates: 32-76)

MCRGFEKEEERRSDNGGCQRLCTESHKAPVSCELCGENATVYCEADAAFLCRKCDRWVHS 

ANFIARRHLRRVICTTCRKLTRRCLVGDNFNVVLPEIRMIARIEEHSSDHKIPFVFL* 
>G1496 (116..1123)

AAACCCACCAAATAACTCAGAGCTTTTTTGCATTTTTTCCCATTCTCTATTTTGTTTTGT 

ACTTTTGGTCTCACTTTAAAAGATGATAAGTTGAAAGATTTCTGGAGAGAACAATATGTT 

GGAAGGTCTTGTCTCTCAAGAAAGCTTGTCCTTAAACTCTATGGACATGTCTGTACTTGA 

AAGGCTTAAATGGGTACAACAGCAACAACAGCAACTGCAACAAGTTGTGTCCCATAGCAG 

TAATAATTCACCTGAACTTCTTCAGATACTTCAGTTCCATGGAAGCAACAATGATGAGTT 

GTTGGAGAGTAGTTTCAGCCAATTTCAAATGCTTGGATCTGGTTTTGGACCAAACTATAA

CATGGGTTTTGGTCCTCC7VCATGAATCCATTTCAAGAACAAGTAGCTGCCATA 

TGTGGATACAATGGAGGTTTTGTTGAAGACCGGTGAAGAAACCAGAGCCGTTGCCTTGAA 

GAACAAGAGAAAACCAGAGGTTAAGACAAGGGAAGAGCAAAAGACAGAGAAGAAGATCAA 

AGTAGAGGCTGAGACAGAGTCAAGCATGAAAGGAAAATCAAACATGGGAAACACTGAAGC 

ATCTTCAGACACTTCAAAGGAGACATCGAAAGGAGCTTCAGAGAATCAGAAATTAGATTA 

TATCCACGTGAGAGCTCGTCGAGGCCAAGCCACTGACAGACACAGCTTAGCAGAAAGGGC 

GAGAAGAGAAAAGATCAGCAAGAAAATGAAATATCTGCAAGATATTGTGCCTGGATGCAA 

TAAGGTCACAGGAAAAGCTGGTATGCTTGATGAGATCATCAATTATGTTCAATGTCTCCA 

AAGACAAGTCGAGTTCCTGTCGATGAAACTTGCTGTCTTGAACCCGGAACTAGAGCTTGC 

CGTGGAAGATGTATeCGTAAAACAGGCTTACTTTACAAA 

AATAATGGTTGATGTGCCATTGTTTCCGTTAGACCAGCAAGGATCTCTAGATTTGTCTGC 

GATAAACCCGAACCAAACGACATCTATCGAAGCTCCATCTGGAAGCTGGGAAACTCAATC 

ACAGAGTCTCTACAACACATCTAGCCTCGGTTTTCATTACTAAGCAAGATTCATTGAAAC 

AACATGGTTGACATCAATCAATCATCAAAATGAGAAGCAAATTCTATTACATTTGCTC^ 

CAAAGTAGTAATTTCGAAATTTGGTTAATGCATTATCCTTTGATCCTTGTTTT^ 

TTAAACCAGAAGAACTGGAGATAGCAATCCAATGATCTTGTCACCA 

>G1496 Amino Acid Sequence (domain in AA coordinates: 184-248) 

MLEGLVSQESLSLNSMDMSVLERLKWVQQQQQQLQQVVSHSSNNSPELLQILQFHGSNND

ELLESSFSQFQMLGSGFGPNYNMGFGPPHESISRTSSCHMEPWMEVLLKTGEETRAVA 
LKNKRKPEVKTREEQKTEKKIKVEAETESSMKGKSNMGNTEASSDTSKETSKGASENQKL

DYIHVRARRGQATDRHSLAERARREKISKKMKYLQDIVPGCNKVTGKAGMLDEIINYVQC

LQRQVEFLSMKLAVLNPELELAVEDVSVKQAYFTNWASKQSIMVDVPLFPLDQQGSLDL 

SAINPNQTTSIEAPSGSWETQSQSLYNTSSLGFHY* 

>G1526 (1..3090)

ATGGGAACGAAAGTCTCAGACGATCTTGTTTCCACCGTCAGATCAGTCGTGGGTTCCGAT 

TACTCAGATATGGATATAATCAGGGCTTTACACATGGCGAATCATGATCCAACGGCTGCT 

ATCAATATAATCTTCGACACTCCAAGTTTCGCCAAACCTGATGTAGCCACTCCTACCCCG 

AGCGGCTCTAATGGAGGGAAGCGAGTTGATAGTGGATTAAAGGGCTGTACTTTTGGTGAC 

AGCGGAAGTGTTGGAGCGAATCATCGCGTGGAGGAAGAAAATGAGAGTGTTAATGGTGGA 

GGAGAAGAGAGTGTTTCAGGGAATGAGTGGTGGTTTGTTGGTTGTTCTGAATTGGCTGGG 

TTATCGACATGTAAAGGAAGGAAATTGAAGTCTGGTGATGAATTGGTGTTCACGTTTCCG 

CATAGTAAAGGATTAAAGCCTGAGACTACGCCTGGGAAGCGCGGTTTTGGGCGGGGAAGG

CCAGCTTTGCGTGGTGCTTCTGATATCGTTAGGTTCTCTACAAAGGATTCAGGAGAGATT 

GGTAGAATACCAAACGAGTGGGCTCGGTGTCTTCTACCACTTGTGAGAGACAAGAAAATT 

AGGATAGAAGGCAGTTGCAAGTCGGCGCCTGAAGCTTTGAGCATCATGGATACAATTCTT 

CTGTCTGTAAGCGTGTACATTAATAGTTCCATGTTTCAAAAGCATAGTGCGACTTCATTT 

AAGACAGCTAGTAATACGGCAGAGGAATCAATGTTCCATCCTCTCCCAAATCTCTTTCGG 

TTACTCGGTTTGATCCCCTTTAAGAAGGCAGAGTTTACTCCAGAGGATTTTTACTCTAAG 

AAGCGACCTTTGAGTTCCAAGGATGGTTCTGCTATTCCTACTTCGTTGCTTCAATT7\AAC 

AAGGTCAAGAATATGAATCAAGATGCAAACGGAGATGAAAATGAGCAGTGTATCAGCGAT 

GGTGATCTTGATAACATTGTTGGTGTTGGGGACAGTTCTGGATTAAAGGAAATGGAAACT 

CCACATACACTTCTGTGTGAGCTTCGTCCATACCAAAAGCAGGCACTTCATTGGATGACC 

CAACTGGAGAAAGGAAATTGCACTGATGAGGCAGCAACAATGCTTCACCCGTGTTGGGAA 

GCATACTGTTTAGCAGACAAGAGGGAACTGGTTGTCTACCTGAATTCTTTTACTGGTGAT 

GCTACAATACACTTCCCTAGCACACTTCAAATGGCAAGAGGAGGAATATTAGCAGACGCA 

ATGGGTCTTGGAAAGACTGTAATGACCATATCCCTTTTGCTTGCCCATTCTTGGAAAGCT 

GCATCAACTGGGTTTCTATGCCCCAACTATGAAGGAGACAAAGTGATCAGCAGTTCTGTA 

GATGATCTCACTAGTCCCCCGGTGAAGGCAACCAAATTTCTAGGCTTTGATAAGAGGCTT 

CTTGAACAAAAAAGTGTACTTCAAAATGGTGGTAACCTGATTGTATGTCCGATGACACTT 

TTAGGACAGTGGAAGACAGAGATTGAAATGCATGCAAAGCCTGGGTCTCTATCTGTCTAT 

GTTCACTATGGGCAAAGCAGGCCGAAGGATGCAAAACTTCTTTCCCAGAGTGATGTGGTA 

ATCACCACATATGGAGTTCTAACATCCGAATTCTCGCAAGAGAACTCAGCAGACCATGAA 

GGAATTTATGCAGTTCGATGGTTTAGGATTGTTCTTGACGAGGCACATACCATCAAAAAC 

TCAAAAAGCCAAATTTCCTTGGCTGCTGCAGCTCTGGTTGCTGATAGGCGTTGGTGTCTT 

ACGGGTACTCCTATTCAGAACAATCTGGAGGATTTATACAGCCTTCTACGGTTTTTGAGG 

ATTGAACCATGGGGAACTTGGGCATGGTGGAATAAACTTGTCCAAAAGCCATTTGAAGAG 

GGTGATGAGAGAGGGTTAAAGCTAGTGCAGTCTATCTTAAAACCTATCATGCTTAGGAGA 

ACAAAGTCTAGCACAGACCGAGAAGGAAGGCCGATTCTTGTTCTACCCCCTGCTGATGCA 

CGGGTCATTTACTGTGAACTTTCGGAGTCTGAGAGGGATTTCTACGACGCGCTATTTAAA 

AGATCCAAGGTCAAATTTGAT(^TTTGTTGAACAAGGCAAAGTTCTTCATAACTATGCT 

TCGATCCTGGAACTGCTTTTGCGTCTTCGACAATGTTGTGATCACCCATTTTTAGTAATG 

AGTCGAGGGGATACAGCGGAATACTCTGATCTGAATAAGCTTTCTAAACGTTTCCTTAGT 

GGAAAGTCTTCTGGCTTAGAAAGGGAAGGAAAAGATGTACCGTCAGAGGCTTTTGTTCAG

GAGGTGGTAGAGGAACTGCGCAAAGGAGAGCAAGGAGAGTGTCCAATATGCCTTGAAGCA 

CTTGAGGATGCTGTATTAACGCCATGTGCTCATAGATTATGTCGTGAGTGTCTCTTGGCA 

AGTTGGAGAAATTCTACTTCTGGGTTATGTCCTGTGTGTAGGAACACTGTAAGCAAACAA 

GAACTCATCACAGCACCAACCGAAAGTAGATTCCAGGTTGACGTGGAAAAGAATTGGGTG 

GAATCATCGAAAATCACTGCTCTTCTGGAAGAGCTTGAAGGTCTTCGTTCTTCAGGCTCT 

AAGAGCATTCTCTTTAGCCAGTGGACCGCTTTCCTCGATCTCCTCCAAATTCCCCTCTCT 

CGGAATAACTTTTCATTTGTCCGTCTTGATGGCACGCTAAGTCAGCAGCAACGAGAGAAG 

GTCCTTAAAGAATTTTCCGAAGATGGCAGTATCCTGGTACTGTTGATGTCTCTAAAAGCT 

GGTGGCGTTGGGATAAATCTAACAGCTGCGTCCAATGCTTTTGTCATGGATCCATGGTGG 

AACCCAGCGGTAGAGGAACAAGCTGTTATGCGTATTCATCGTATAGGGCAAACTAAGGAA 

GTCAAAATCAGAAGATTCATCGTTAAGGGAACGGTTGAAGAGAGAATGGAGGCGGTTCAG 

GCGAGGAAGCAGAGAATGATCTCTGGGGCTTTAACCGATCAAGAAGTACGAAGTGCACGT 

ATAGAGGAACTCAAGATGTTATTTACCTGA 
>G1526 Amino Acid Sequence (domain in AA coordinates: 493-620, 864-1006)

MGTKVSDDLVSTVRSVVGSDYSDMDIIRALHMANHDPTAAINIIFDTPSFAKPDVATPTP

SGSNGGKRVDSGLKGCTFGDSGSVGANHRVEEENESVNGGGEESVSGNEWWFVGCSELAG 

LSTCKGRKLKSGDELVFTFPHSKGLKPETTPGKRGFGRGRPALRGASDIVRFSTKDSGEI 

GRIPNEWARCLLPLVRDKKIRIEGSCKSAPEALSIMDTILLSVSVYINSSMFQKHSATSF

KTASNTAEESMFHPLPNLFRLLGLIPFKKAEFTPEDFYSKKRPLSSKDGSAIPTSLLQLN

KVKNMNQDANGDENEQCISDGDLDNIVGVGDSSGLKEMETPHTLLCELRPYQKQALHWMT

QLEKGNCTDEAATMLHPCWEAYCLADKRELVVYLNSFTGDATIHFPSTLQMARGGILADA

MGLGKTVMTISLLLAHSWKAASTGFLCPNYEGDKVISSSVDDLTSPPVKATKFLGFDKRL

LEQKSVLQNGGNLIVCPMTLLGQWKTEIEMHAKPGSLSVYVHYGQSRPKDAKLLSQSDVV

ITTYGVLTSEFSQENSADHEGIYAVRWFRIVLDEAHTIKNSKSQISLAAAALVADRRWCL 

TGTPIQNNLEDLYSLLRFLRIEPWGTWAWWNKLVQKPFEEGDERGLKLVQSILKPIMLRR

TKSSTDREGRPILVLPPADARVIYCELSESERDFYDALFKRSKVKFDQFVEQGKVLHNYA 

SILELLLRLRQCCDHPFLVMSRGDTAEYSDLNKLSKRFLSGKSSGLEREGKDVPSEAFVQ

EVVEELRKGEQGECPICLEALEDAVLTPCAHRLCRECLLASWRNSTSGLCPVCRNTVSKQ

ELITAPTESRFQVDVEKNWVESSKITALLEELEGLRSSGSKSILFSQWTAFLDLLQIPLS 

RNNFSFVRLDGTLSQQQREKVLKEFSEDGSILVLLMSLKAGGVGINLTAASNAFVMDPWW

NPAVEEQAVMRIHRIGQTKEVKIRRFIVKGTVEERMEAVQARKQRMISGALTDQEVRSAR 
IEELKMLFT* 

>G1543 (1..828) 

ATGATAAAACTACTATTTACGTACATATGCACATACACATATAAACTATATGCTCTATAT 

CATATGGATTACGCATGCGTGTGTATGTATAAATATAAAGGCATCGTCACGCTTCAAGTT 

TGTCTCTTTTATATTAAACTGAGAGTTTTCCTCTCAAACTTTACCTTTTCTTCTTCGATC 

CTAGCTCTTAAGAACCCTAATAATTCATTGATCAAAATAATGGCGATTTTGCCGGAAAAC 

TCTTCAAACTTGGATCTTACTATCTCCGTTCCAGGCTTCTCTTCATCCCCTCTCTCCGAT 

GAAGGAAGTGGCGGAGGAAGAGACCAGCTAAGGCTAGACATGAATCGGTTACCGTCGTCT 

GAAGACGGAGACGATGAAGAATTCAGTCACGATGATGGCTCTGCTCCTCCGCGAAAGAAA 

CTCCGTCTAACCAGAGAACAGTCACGTCTTCTTGAAGATAGTTTCAGACAGAATCATACC 

CTTAATCCCAAACAAAAGGAAGTACTTGCCAAGCATTTGATGCTACGGCCAAGACAAATT 

GAAGTTTGGTTTCAAAACCGTAGAGCAAGGAGCAAATTGAAGCAAACCGAGATGGAATGC 

GAGTATCTCAAAAGGTGGTTTGGTTCATTAACGGAAGAAAACCACAGGCTCCATAGAGAA 

GTAGAAGAGCTTAGAGCCATAAAGGTTGGCCCAACAACGGTGAACTCTGCCTCGAGCCTT 

ACTATGTGTCCTCGCTGCGAGCGAGTTACCCCTGCCGCGAGCCCTTCGAGGGCGGTGGTG 

CCGGTTCCGGCTAAGAAAACGTTTCCGCCGCAAGAGCGTGATCGTTGA 

>G1543 Amino Acid Sequence (domain in AA coordinates: 135-195) 

MIKLLFTYICTYTYKLYALYHMDYACVCMYKYKGIVTLQVCLFYIKLRVFLSNFTFSSSI 

LALKNPNNSLIKIMAILPENSSNLDLTISVPGFSSSPLSDEGSGGGRDQLRLDMNRLPSS

EDGDDEEFSHDDGSAPPRKKLRLTREQSRLLEDSFRQNHTLNPKQKEVLAKHLMLRPRQI 

EVWFQNRRARSKLKQTEMECEYLKRWFGSLTEENHRLHREVEELRAIKVGPTTVNSASSL

TMCPRCERVTPAASPSRAVVPVPAKKTFPPQERDR*

>G162 (101..619)

AGAC^TAC^CACCAAAATCTTCTTCTTCACCAACATATTCACTTTCACAGCAAAAAA^ 
ACGAGAGGTTCTCTCTTATTCGTACCGTTTTAGCAAACAAATGGGTCGGAGAAAGATCAA 
GATGGAGATGGTTCAGGACATGAACACACGACAGGTTACCTTTTCAAAACGGAGGACTGG 
TTTGT TCT^GjU^GGCGAGCGAGTTAGCCACGCTCTGCAACGCTGAGTTGGGCATCGTTGT 
CTTTTC^CCAGGAGGCAAGCCTTTCTCCTACGGGAAACCGAA 

GCGATTCATGAGAGAATATGATGATTCAGACAGTGGCGATGAAGAAAAAAGTGGTAATTA 

CAGGCCTAAACTGAAGAGGCTGAGTGAACGTCTCGATTTGCTCAACCAAGAGGTTGAAGC 

TGAGAAGGAACGAGGCGAGAAGAGTCAGGAGAAGCTTGAATCTGCTGGGGATGAGAGATT 

CAAGGAGTCCATTGAGACGCTTACCCTCGATGAACTCAATGAATACAAAGATAGGCTTCA 

GACAGTCC^TGGTAGGATTGAAGGTCAAGTCAATCACTTGCAGGCTTCGTCTTGCCTCAT 

GCTTOTCTCCAGAAAATAGCTAGACCGACTTGTTAGAGTTACATTCTATTTTTTC 

GCCTACAGAACTTACCAA(^CATGAAAGTTATTGCTGGTGTAGAATTTTCTGTCATCTAT 

GGGGTGTGACTTTCTATTTGACATCAAATGAAAATGTACCTGGAAATTTGTCTGTATTAA 

TCTCAAGTGTACTTGCTAAACTTGATCAGCTTTTTCGCAAAAAAAA 

>G162 Amino Acid Sequence (domain in AA coordinates: 2-57) 

MGRRKIKMEMVQDMNTRQVTFSKRRTGLFKKASELATLCNAELGIVVFSPGGKPFSYGKP
NLDSVAERFMREYDDSDSGDEEKSGNYRPKLKRLSERLDLLNQEVEAEKERGEKSQEKLE
SAGDERFKESIETLTLDELNEYKDRLQTVHGRIEGQVNHLQASSCLMLLSRK*
>G1640 (168..1196)

TTCGCCAGATCCTTCCTCTATATAAGGAAGTTCATTTCATTTGGAGAGGTTTCGCTGACA 
AGCTGCTCTAGCTTATCTGGTACCGTCGACCTCTCACTCAAGGGTCCAAAAGTGTTTTCT 
CTTTTTCAGTTTCTCTTTCTCTTTTTGACAGAAGAGACCGAGAAGCAATGGGAAGGGCTC 
CGTGTTGTGAGAAAATCGGGTTGAAGAGAGGGAGATGGACAGCCGAGGAAGATGAGATCC 
TCACCAAGTATATTCAGACCAATGGTGAAGGTTCTTGGCGATCTTTGCCTAAGAAAGCTG 
GATTGTTGAGATGTGGAAAGAGCTGTAGACTAAGGTGGATAAACTACTTAAGAAGAGACT 
TAAAAAGAGGAAATATTACTTCCGACGAAGAAGAAATAATCGTCAAGTTGCATTCCCTTC 
TCGGCAACAGATGGTCACTTATTGCAACACATCTACCAGGAAGAACAGACAACGAAATTA 
AAAACTATTGGAACTCACATCTCAGCCGCAAAATCTATGCCTTCACTGCCGTTTCCGGAG 
ATGGACACAATCTACTCGTCAACGATGTAGTCTTGAAGAAATCTTGTTCATCGTCTTCTG 
GAGCCAAGAACAATAACAAGACCAAGAAGAAGAAGAAGGGAAGGACTAGTAGGTCATCCA 
TGAAGAAACACAAGCAAATGGTGACGGCCTCACAATGTTTCTCACAACCTAAGGAGCTAG 
AGAGTGATTTCAGTGAGGGAGGGCAAAATGGTAATTTTGAAGGAGAGTCTTTGGGGCCTT 
ATGAGTGGTTGGATGGTGAGTTAGAACGGCTCTTGAGTAGTTGTGTCTGGGAATGCACTA 
GTGAAGAGGCTGTGATTGGAGTAAATGATGAAAAGGTGTGTGAGAGTGGGGACAATAGTA 
GTTGTTGTGTTAATTTGTTTGAAGAAGAACAAGGAAGCGAGACAAAGATTGGTCACGTAG 
GAATCACAGAGGTTGATCATGATATGACGGTGGAAAGAGAAAGAGAGGGAAGTTTTTTAA 
GTTCGAATTCAAATGAAAATAATGATAAAGATTGGTGGGTTGGTCTATGTAATTCTTCAG 
AAGTTGGGTTTGGGGTTGATGAGGAGTTGCTTGATTGGGAGTTTCAAGGTAATGTCACTT 
GTCAAAGTGATGATCTATGGGATCTCTCAGATATTGGAGAGATAACATTGGAGTGATTGT 
ACCGAGCAAGTGGATTGGCGGCCGCTCTAGACAGGCCTCGTACCGGATCTCTAGCTAGAG 
CTTTCGTTCGTATCATCGGTTTCGACAACGTTCGTCAAGT 

>G1640 Amino Acid Sequence (domain in AA coordinates: 14-115) 
MGRAPCCEKIGLKRGRWTAEEDEILTKYIQTNGEGSWRSLPKKAGLLRCGKSCRLRWINY 
LRRDLKRGNITSDEEEIIVKLHSLLGNRWSLIATHLPGRTDNEIKNYWNSHLSRKIYAFT 
AVSGDGHNLLVNDVVLKKSCSSSSGAKNNNKTKKKKKGRTSRSSMKKHKQMVTASQCFSQ
PKELESDFSEGGQNGNFEGESLGPYEWLDGELERLLSSCVWECTSEEAVIGVNDEKVCES
GDNSSCCVNLFEEEQGSETKIGHVGITEVDHDMTVEREREGSFLSSNSNENNDKDWWVGL
CNSSEVGFGVDEELLDWEFQGNVTCQSDDLWDLSDIGEITLE* 
>G1644 (1..348) 

ATGAAATTGATTGATTGGAAAGACTGTGCTTTGATGACTTACACCGAACTCATTTTGGGT 
TTCTGCAATGTTTTAATGTTGATCTGCAGGAGGACTAGTGGACCTATGAGACGAGCAAAA 
GGTGGTTGGACTCCAGAGGAGGATGAGACACTTAGACGAGCAGTTGAAAAGTATAAGGGG 
AAGAGGTGGAAGAAAATAGCGGAATTTTTCCCAGAGAGAACACAAGTCCAATGCTTGCAC 
AGGTGGCAGAAAGTTCTTAATCCAGAGCTTGTTAAAGGACCTTGGACTCAAGAGGTTCTC 
TTATCATTTTCATGTTCTGAAACTTTTTTTGGTTTTCATTTTACGTAA 

>G1644 Amino Acid Sequence (conserved domain in AA coordinates: 39-102)
MKLIDWKDCALMTYTELILGFCNVLMLICRRTSGPMRRAKGGWTPEEDETLRRAVEKYKG

KRWKKIAEFFPERTQVQCLHRWQKVLNPELVKGPWTQEVLLSFSCSETFFGFHFT* 
>G1646 (34..786)

GATCTTTTGATCCAATCACAAGGCAAAGATCCAATGGACAATAACAACAACAACAACT^AC 
CAGCAACCACCACCAACCTCCGTCTATCCACCTGGCTCCGCCGTCACAACCGTAATCCCT 
CCTCCACCATCTGGATCTGCATCAATAGTCACCGGAGGAGGAGCGACATACCACCACCTC 
CTCCAGCAACAACAGC^CAGCTTCAAATGTTCTGGACATACCAGAGACAAGAGATCGAA 
CAGGTAAACGATTTCAAAAACCATCAGCTCCCTCTAGCTCGTATCAAAAAAATCATGAAA 
GCTGATGAAGATGTGCGTATGATCTCCGCCGAAGCACCGATTCTCTTCGCGAAAGCTTGT 
GAGCTTTTCATTCTCGAACTTACGATTAGATCTTGGCTTCACGCTGAAGAGAACAAACGT 
CGTACGCTTCAGAAAAACGATATCGCTGCTGCGATTACTAGAACCGATATCTTCGATTTC 
CTTGTTGATATTGTTCCTAGGGAAGAGATCAAGGAAGAGGAAGATGCAGCATCGGCTCTT 
GGTGGAGGAGGTATGGTTGCTCCCGCCGCGAGCGGTGTTCCTTATTATTATCCACCGATG 
GGACAACCGGCGGTTCCTGGAGGGATGATGATTGGAAGACCGGCGATGGATCCTAGCGGT 
GTTTATGCTCAGCCTCCTTCTCAGGCATGGCAAAGCGTTTGGCAGAATTCAGCTGGTGGT 
GGTGATGATGTGTCTTATGGAAGTGGAGGAAGTAGCGGCCATGGTAATCTCGATAGCCAA 
GGGTAAGTGAATTCTAGTAG 
>G1646 Amino Acid Sequence (domain in AA coordinates: 72-162)

MDNNNNNNNQQPPPTSVYPPGSAVTTVIPPPPSGSASIVTGGGATYHHLLQQQQQQLQMF 

WTYQRQEIEQVNDFKNHQLPLARIKKIMKADEDVRMISAEAPILFAKACELFILELTIRS

WLHAEENKRRTLQKNDIAAAITRTDIFDFLVDIVPREEIKEEEDAASALGGGGMVAPAAS 

GVPYYYPPMGQPAVPGGMMIGRPAMDPSGVYAQPPSQAWQSVWQNSAGGGDDVSYGSGGS 
SGHGNLDSQG* 

>G1672 (239..1399)

CCATTCCTGACGTCCGGGATGACGCACAATCCCACTATCCTTCGCAAGACCCTTCCTCTA 

TATAAGGAAGTTCATTTCATTTGGAGAGGACACGCTGACAAGCTGACTCTAGCAGATCTG 

GTACCGATCACTCCCGTCTTTATCAAATTCTTCTTCCTCTTACATTTTCCCTATCCAATC 

GATCTCACGCAGATCTGATCAATTTCTCATCAAATCATTTAGAGATCAAAAGAAAACTAT

GAAGAATAGTAAATGTAACCTCATAGATTCAAAGCTCGAAGAACATCATCATCTTTGCGG 

ATCAAAACATTGTCCTGGATGTGGTCGCATGATTCAAGCTGCTACTA7U\CCAAATTGGGT 

TGGATTGCCGGCAGGAGTGAAATTCGATCCGACAGATCAAGAACTTATAGAACATTTAGA 

AGCAAAAGTGAAGGGAAAAGAAGAAAATAAGAAATGGTCGTCGTCTCATCCACTTATAGA 

TGAATTTATTCCCACCATTGATGGAGAAGATGGAATATGTTACACTCATCCTCAGAAGCT 

TCCAGGGGTGAC7VAGAGATGGCTTGAGCAAACACTTCTTCCACAAACCATCAAGAGCTTA 

CACAACCGGAACAAGAAAACGACGTAAAATAATTCAAACCGATCACGACTCTGAGTTAAC 

CGGATCATCAGAAACCAGGTGGCACAAAACGGGCAAAACAAGACCGGTTATGATCAACGG 

TCAACAAAGAGGATGCAAGAAGATATTAGTACTCTACACAAACTTCGGCAAGAATCGTCG

ACCGGAGAAAACAAATTGGGTGATGCATCAATATCATTTAGGGATTAATGAGGAAGAGAG 

AGAAGGAGAACTTGTGGTCTCCAAGATATTTTATCAGACACAACCAAGACAGTGTGTTAG 

TAATACTAATTGGTCTGATCACCATGGTTCCAAGGACGTGATCGGAATTGGTGTCGGAGA 

TGAGATTTCCAGCGTAGCTGCCACGTTGCAGAGTCTTGGCTCCGGTGACGTCGTTTCTAG 

GGTTAATATGCATCCCCATACAAGATCCTTTGATGAGGGGACAGCCGAAGCTTCAAAGGG 

AAGAGAGAACCAGCATGTGTCTGGCACGTGCGAGGAAGTACATGATGGGATCATAACATC 

ATCAATGTCATCTCATCATATGATTCATGATCATCATAATCAACATCATCAAATCGGAGA 

TAGAAGAGAATTTCACATGTCATCATCATATCCCATGACCCCTACTATCACATCACAACA 

TGAGTCAATCTTCCATGTTACAAGTACTATGCCCTTTCAGCGGCAGCAATTAAGGGGTCG 

GTCGTCTGGTTCGGGATTAGAAGACCTAATTATGGGTTGTACCACAGCTACGTGTACAGA 

AGACAATAATCACAAATGATTAAATTCGCAGGAGCATTCAGAAGCAAACCCTCAGCGAAA 

TGCAGAGTGGTTAACGTTTCCACAATTCTGGAACCAAGCCGAATCAGATGATCAAAACCG 

AAGATTTTAACAGAACCAAT^AGGAAGCAGAGAAATCTTGCAAAAAGCTCCTGCTTAGCTG 

TTGATCAATGCCGGAAATGCTGAGCTATGACTGACTAGTCTCTGCCATTTAACTTACAAT 

ATCACCAGAGGTTGCGATGAATGTTGATTCGCTCAAAGGAGAGCGGCCGCTCTAGACAGG 
CCTCGTACCG 

>G1672 Amino Acid Sequence (conserved domain in AA coordinates: 41-194) 
MKNSKCNLIDSKLEEHHHLCGSKHCPGCGRMIQAATKPNWVGLPAGVKFDPTDQELIEHL
EAKVKGKEENKKWSSSHPLIDEFIPTIDGEDGICYTHPQKLPGVTRDGLSKHFFHKPSRA 
YTTGTRKRRKIIQTDHDSELTGSSETRWHKTGKTRPVMINGQQRGCKKILVLYTNFGKNR
RPEKTNWVMHQYHLGINEEEREGELVVSKIFYQTQPRQCVSNTNWSDHHGSKDVIGIGVG

DEISSVAATLQSLGSGDVVSRVNMHPHTRSFDEGTAEASKGRENQHVSGTCEEVHDGIIT
SSMSSHHMIHDHHNQHHQIGDRREFHMSSSYPMTPTITSQHESIFHVTSTMPFQRQQLRG 
RSSGSGLEDLIMGCTTATCTEDNNHK*
>G1677 (24..1037)

CAGTACTAATTCTGTGTGTGTTAATGGTTCTAGTTATGGATGATGAAGAGAGTAACAACG 
TTGAAAGATATGACGACGTCGTATTGCCAGGGTTTAGGTTCCATCCCACTGATGAAGAAC 
TCGTAAGTTrCTACTTGAAACGGAAGGTTTTACACAAATCTCTTCCCTTTGATCTCATC^ 
AGAAAGTCGACATTTACAAATACGATCC^TGGGACCTCCCAAAGCTTGCAGCGATGGGGG 
AAAAAGAGTGGTACTTTTATTGTCCTAGAGACAGGAAATACCGCAACAGCACAAGACCTA
ACCGAGTAACTGGAGGTGGCTTCTGGAAAGCAACCGGAACAGACCGGCCTATATACTCAT 
TGGACTCCACTCGATGCATCGGTTTGAAGAAATCACTTGTGTTCTACCGTGGTCGAGCTG 
CTAAAGGAGTCAAAACCGATTGGATGATGCATGAATTTCGTCTCCCTTCTCTCTCTGACT 
CTCATCACTCATCATATCCCAATTACAATAACAAGAAGCAAGACCTTAACAATAACAACA 
ACAGCAAGGAGCTTCCTTCAAACGATGCTTGGGCGATATGTAGAATATTTAAGAAGACAA 
ATGCAGTATCCTCACVU^GATCAATCCCACAATCTTGGGTTTATCCAACGATTCCTGACA 
ACAATCAAC^GTCACACAACAACACCGCAACTCTCTTAGCTTCATCAGACGTTCTCAGCC 
ACATATCAACAAGACAAAACTTTATTCCTTCTCCAGTCAACGAACCCGCAAGCTTCACAG
AATCAGCTGCTTCTTACTTCGCGTCTCAGATGCTCGGAGTCACGTACAATACAGCCAGAA 
ACAACGGAACAGGGGATGCTCTGTTTCTGAGAAACAATGGAACAGGGGATGCTCTGGTTC 
TGAGCAACAATGAGAATAACTACTTCAACAACTTGACTGGAGGGTTGACTCATGAGGTTC 
CGAATGTAAGATCAATGGTGATGGAGGAGACTACGGGGAGTGAGATGTCGGCGACGTCGT 
ATTCCACTAACAATTAAGATCATAGTACTATTAACACTTGAATTAGTGTAGACGTTGATC 
ATCGCTAATATGTATTAATTTTTCTTGTCTTACTATAAACGAAAAAAAAA 

>G1677 Amino Acid Sequence (conserved domain in AA coordinates: 17-181)
MVLVMDDEESNOTERYDDWLPGFRFHPTDEEL^^ 

DPWDLPKLAAMGEKEWYFYCPRDRKYRNSTRPNRVTGGGFWKATGTDRPIYSLDSTRCIG 

LKKSLVFYRGRAAKGVKTDWMMHEFRLPSLSDSHHSSYPNYNNKKQHLNNNNNSKELPSN

DAWAICRIFKKTNAVSSQRSIPQSWVYPTIPDNNQQSHNNTATLLASSDVLSHISTRQNF

IPSPVNEPASFTESAASYFASQMLGVTYNTARNNGTGDALFLRNNGTGDALVLSNNENNY

FNNLTGGLTHEVPNVRSMVMEETTGSEMSATSYSTNN* 

>G1765 (139..966)

TCCTTCGCAAGACCCTTCCTCTATATAAGGAAGTTCATTTCATTTGGAGAGGACACGCTG 
ACAAGCTGACTCTAGCAGATCTGGTACCGTCGACAAGAATGACTTGATTGGTGTTCTAAA 
GAGATCGATGTAGTGAAGATGAGTGGCGAAGGTAACTTAGGTAAGGATCATGAAGAAGAA 
AACGAAGCACCACTTCCTGGGTTCAGGTTTCATCCGACGGATGAAGAGCTTTTAGGATAC 
TATCTTCGAAGAAAAGTAGAGAACAAAACCATCAAACTCGAACTTATCAAACAGATCGAT 
ATCTATAAGTACGATCCTTGGGATCTTCCAAGAGTGAGCAGCGTCGGAGAAAAGGAGTGG 
TACTTCTTCTGCATGAGAGGTAGGAAATACAGGAATAGCGTTCGACCAAACCGAGTGACC 
GGTTCAGGTTTCTGGAAAGCCACTGGTATTGATAAACCGGTTTACTCCAATCTTGACTGT 
GTTGGTCTCAAGAAATCTCTGGTTTACTATCTTGGTTCAGCCGGTAAAGGCACCAAAACC 
GATTGGATGATGCATGAATTCCGCCTCCCCTCCACCACGAAAACCGACTCTCCAGCTCAA 
CAAGCAGAGGTATGGACACTTTGCAGAATCTTCAAACGAGTCACATCTCAAAGAAACCCA 
ACCATCTTACCACCAAACCGAAAACCGGTTATCACTTTAACCGACACTTGTTCTAAGACC 
AGCAGCTTAGATTCCGACCACACGAGCCACCGTACAGTAGATTCCATGTCCCACGAGCCG 
CCGCTTCCACAGCCACAGAATCCTTATTGGAACCAACATATAGTTGGTTTTAATCAACCG 
ACATATACTGGTAATGATAATAACCTCCTGATGAGTTTCTGGAACGGCAACGGTGGAGAT 
TTCATAGGAGACTCAGCAAGTTGGGATGAACTTAGATCTGTTATAGATGGCAACACTAAA 
CCCTAGTAATAAAGTTTCCTTTTTTCAGCTTTGTACAAAAAGATAAAACAAACGGCAACC 
GCTCTAGACAGGCCTCGTACCGGGATCCTCTAGCTAGAGCTTTCGTTTCGTATCATCGGT 
TTCGACAACGTTCGT 

>G1765 Amino Acid Sequence (conserved domain in AA coordinates: 20-140) 

MSGEGNLGKDHEEENEAPLPGFRFHPTDEELLGYYLRRKVENKTIKLELIKQIDIYKYDP

WDLPRVSSVGEKEWYFFCMRGRKYRNSVRPNRVTGSGFWKATGIDKPVYSNLDCVGLKKS

LVYYLGSAGKGTKTDWMMHEFRLPSTTKTDSPAQQAEVWTLCRIFKRVTSQRNPTILPPN 

RKPVITLTDTCSKTSSLDSDHTSHRTVDSMSHEPPLPQPQNPYWNQHIVGFNQPTYTGND 

NNLLMSFWNGNGGDFIGDSASWDELRSVIDGNTKP*

>G1777 (97..1878)

CTCGTACTTTATCACCTCCGTCGTTCTAT7VATACTCTCTTCCGTCAATCATATCATTTGT 
CGACAATTTCATTCTGATCAGTTTAAAAATTGATCCATGGATGATAATTTAAGCGGCGAG 
GAAGAAGATTACTATTACTCCTCCGATCAGGAATCTCTCAACGGGATTGATAATGATGAA 
TCCGTTTCGATACCTGTTTCTTCCCGATCAAATACTGTCAAGGTTATTACGAAGGAATCA 
CTTTTGGCTGCACAGAGGGAGGATTTGCGGAGAGTGATGGAATTGTTATCGGTTAAGGAG 
CACGATGCTCGGACTCTTCTTATACATTACCGATGGGATGTGGAGAAGTTGTTTGCTGTT 
CTTGTTGAGAAAGGGAAAGATAGCTTGTTTTCTGGTGCTGGTGTTACACTTCTTGAAAAC 
CAAAGTTGTGATTCTTCCGTTTCTGGTTCTTCTTCGATGATGAGTTGTGATATCTGCGTA 
GAGGATCTACCGGGTTATCAGCTGACAAGGATGGACTGTGGCCATAGCTTTTGCAATAAC 
TGTTGGACTGGGCATTTTACTGTAAAGATAAATGAAGGTCAGAGCAAAAGGATTATATGC 
ATGGCTCATAAGTGTAATGCTATTTGTGATGAAGATGTTGTCAGGGCTCTAGTTAGTAAA 
AGCCAACCAGATTTAGCTGAGAAGTTTGATCGTTTTCTTCTTGAGTCGTATATCGAAGAT 
AACAAAATGGTGAAGTGGTGTCCGAGTACTCCTCATTGTGGGAATGCCATACGTGTTGAG 
GATGACGAGCTCTGTGAGGTTGAATGCTCTTGTGGTTTGCAGTTCTGTTTCAGTTGTTCA 
TCTCAAGCTCACTCCCCTTGCTCTTGTGTGATGTGGGAACTATGGAGAAAGAAGTGCTTT 
GATGAGTCCGAGACTGTTAATTGGATAACTGTTCACACAAAGCCGTGTCCCAAATGTCAC 
AAGCCTGTTGAAAAGAATGGTGGATGCAATCTCGTGACTTGTCTTTGTCGACAATCTTTT

TGTTGGTTGTGTGGTGAAGCTACTGGAAGGGACCACACTTGGGCTAGAATCTCGGGTCAT 

AGTTGTGGTCGGTTCCAAGAAGATAAAGAGAAACAAATGGAGAGAGCGAAAAGGGATCTC 

AAGCGGTATATGCATTATCATAACCGATACAAAGCACATATCGACTCCTCCAAGCTAGAG 

GCTAAGCTTAGTAATAATATTAGTAAAAAGGTGTCTATTTCAGAAAAGAGGGAGTTACAA 

CTTAAAGACTTCAGCTGGGCTACCAATGGACTCCATCGGTTATTTAGATCAAGACGAGTT 

CTTTCATATTCATACCCTTTCGCATTTTACATGTTTGGAGATGAGCTGTTTAAAGATGAG 

ATGAGCTCTGAGGAAAGAGAAATAAAACAAAATCTGTTTGAGGATCAGCAGCAGCAGCTT 

GAGGCTAATGTTGAGAAACTTTCTAAGTTCTTGGAGGAACCTTTTGATCAATTTGCTGAT 

GATAAGGTCATGCAGATAAGGATTCAAGTCATCAATTTGTCAGTTGCGGTCGATACACTC 

TGCGAAAATATGTATGAATGCATTGAGAATGACTTGTTGGGTTCTCTGCAACTTGGCATC 

CACAACATTACTCCATACAGATCAAACGGCATAGAACGAGCATCTGATTTTTATAGTTCC 

CAGAATTCCAAGGAAGCTGTTGGTCAGTCTTCGGATTGTGGATGGACGTCCAGGCTCGAT 

CAAGCTTTGGAGTCAGGGAAGTCGGAAGACACAAGTTGCTCTTCCGGGAAGCGTGCTAGA 

ATAGACGAAAGTTACAGAAACAGCCAAACCACCTTACTAGATTTAAACTTGCCAGCGGAA 

GCCATTGAGCGGAAATGAACACTTATCCTTCTTCACCTCCCAATAACACCCTTTTTGTCC 

AAATAAAGTGTGTTACCCGGATATTTATAGCTCTAAACCCAATCCCCTCTGCTTAATTTG 
TCAGTGACCTTACCTAACCCTCTTCA 

>G1777 Amino Acid Sequence (domain in AA coordinates: 124-247)

MDDNLSGEEEDYYYSSDQESLNGIDNDESVSIPVSSRSNTVKVITKESLLAAQREDLRRV 

MELLSVKEHHARTLLIHYRWDVEKLFAVLVEKGKDSLFSGAGVTLLENQSCDSSVSGSSS 

MMSCDICVEDVPGYQLTRMDCGHSFCNNCWTGHFTVKINEGQSKRIICMAHKCNAICDED

VVRALVSKSQPDLAEKFDRFLLESYIEDNKMVKWCPSTPHCGNAIRVEDDELCEVECSCG 

LQFCFSCSSQAHSPCSCVMWELWRKKCFDESETVNWITVHTKPCPKCHKPVEKNGGCNLV 

TCLCRQSFCWLCGEATGRDHTWARISGHSCGRFQEDKEKQMERAKRDLKRYMHYHNRYKA

HIDSSKLEAKLSNNISKKVSISEKRELQLKDFSWATNGLHRLFRSRRVLSYSYPFAFYMF 

GDELFKDEMSSEEREIKQNLFEDQQQQLEANVEKLSKFLEEPFDQFADDKVMQIRIQVIN 

LSVAVDTLCENMYECIENDLLGSLQLGIHNITPYRSNGIERASDFYSSQNSKEAVGQSSD 

CGWTSRLDQALESGKSEDTSCSSGKRARIDESYRNSQTTLLDLNLPAEAIERK*
>G1793 (59..1783)

AGTGATTTATTGATTAACCCAZ^ACACAAAATAAACAGATTTGACTCAAAAAGAAGAAAAT 

GAATTCTAACAACTGGCTTGGCTTTCCTCTTTCACCGAACAACTCTTCTTTGCCTCCTCA

TGAATACAACCTTGGCTTGGTCAGCGACCATATGGACAACCCTTTTCAAACACAAGAGTG 

GAATATGATCAATCCACACGGTGGAGGAGGAGATGAAGGAGGAGAGGTTCCAAAAGTGGC 

CGATTTTCTCGGTGTGAGCAAACCGGACGAAAACCAATCCAACCACCTAGTAGCTTACAA 

CGACTCAGACTACTACTTCCATACCAATAGCTTGATGCCTAGCGTCCAATCAAACGATGT 

CGTTGTAGCAGCTTGTGACTCCAATACTCCTAACAACAGTAGCTATCATGAGCTTCAAGA 

GAGTGCTCACAATCTACAGTCACTTACTTTGTCCATGGGGACCACCGCTGGTAATAATGT 

TGTAGACAAAGCTTCACCATCCGAGACCACCGGGGATAACGCTAGCGGTGGAGCACTAGC 

CGTTGTTGAGACGGCCACGCCAAGACGTGCATTGGACACTTTCGGACAACGAACCTCGAT 

CTATCGTGGTGTCACAAGACATCGATGGACTGGTCGATATGAGGCTCATCTATGGGATAA 

TAGTTGTAGAAGGGAAGGCCAGTCTAGGAAAGGAAGACAAGTTTACTTGGGTGGATATGA 

CAAAGAAGATAAAGCAGCAAGATCATATGATCTAGCTGCACTTAAGTACTGGGGTCCTTC 

AACTACTACTAATTTCCCmTTACAAACTACGAGAAAGAAGTAGAGGAAATGAAGCACAT 

GACGAGACAAGAGTTCGTGGCTGCCATTAGAAGGAAAAGTAGTGGATTTTCGAGAGGCGC 

TTCGATGTATCGAGGAGTTACAAGGCATCACCAACATGGAAGATGGCAAGCAAGGATCGG 

CCGAGTCGCCGGAAACAAAGACCTCTACTTGGGAACTTTTAGCACTGAGGAAGAAGCAGC 

AGAAGCTTACGATA^AGCTGCAATAAAGTTTAGAGGACTTAATGCAGTGACCAACTTCGA 

GATCAACCGGTACGACGTGAAAGCCATTCTAGAGAGTAGCACTCTTCCCATCGGAGGAGG 

CGCAGCTAAACGGCTCAAAGAAGCTCAAGCTCTTGAGTCTTCAAGGAAACGCGAGGCGGA 

GATGATAGCCCTTGGTTCAAGTTTCCAGTACGGTGGTGGCTCGAGCACAGGCTCTGGCTC 

CACCTCATCAAGACTTCAGCTTCAACCTTACCCTCTAAGCATTCAACAACC^ 

TTTTCTATCTCTTCAGAACAATGACATCTCTCATTACAACAACAACAATGCTCAGGATTC 

CTCCTCTTTTAATCACCATAGCTATATCCAGACACAACTTCATCTCCACCAACAGACCAA 

CAATTACTTGCAGCAACAGTCGAGCCAGAACTCTCAGCAGCTCTACAATGCGTATCTTCA 

TAGCAATCCGGCTCTGCTTCATGGACTTGTCTCTACCTCTATCGTTGACAACAATAATAA 

CAATGGAGGCTCTAGTGGGAGCTACAACACTGCAGCATTTCTTGGGAACCACGGTATTGG 



TATTGGGTCCAGCTCGACTGTTGGATCGACCGAGGAGTTTCCAACCGTTAAAACAGATTA 
CGATATGCCTTCCAGTGATGGAACCGGAGGGTATAGTGGTTGGACCAGTGAGTCTGTTCA 
GGGGTCAAACCCTGGTGGTGTTTTCACTATGTGGAATGAGTAAACAAGGATCTCTTTCTT 
GCGGCACAAGGAATGGGT 

>G1793 Amino Acid Sequence (conserved domain in AA coordinates: 179-255, 281-349)

MNSNNWLGFPLSPNNSSLPPHEYNLGLVSDHMDNPFQTQEWNMINPHGGGGDEGGEVPKV 

ADFLGVSKPDENQSNHLVAYNDSDYYFHTNSLMPSVQSNDVVVAACDSNTPNNSSYHELQ

ESAHNLQSLTLSMGTTAGNNVVDKASPSETTGDNASGGALAVVETATPRRALDTFGQRTS 

IYRGVTRHRWTGRYEAHLWDNSCRREGQSRKGRQVYLGGYDKEDKAARSYDLAALKYWGP

STTTNFPITNYEKEVEEMKHMTRQEFVAAIRRKSSGFSRGASMYRGVTRHHQHGRWQARI 

GRVAGNKDLYLGTFSTEEEAAEAYDIAAIKFRGLNAVTNFEINRYDVKAILESSTLPIGG

GAAKRLKEAQALESSRKREAEMIALGSSFQYGGGSSTGSGSTSSRLQLQPYPLSIQQPLE 

PFLSLQNNDISHYNNNNAHDSSSFNHHSYIQTQLHLHQQTNNYLQQQSSQNSQQLYNAYL

HSNPALLHGLVSTSIVDNNNNNGGSSGSYNTAAFLGNHGIGIGSSSTVGSTEEFPTVKTD 

YDMPSSDGTGGYSGWTSESVQGSNPGGVFTMWNE* 

>G180 (54..629)

GTAATTACGATCTACAACAAGTGACATCGTCGTCGACGACGATTCAAGAGAATATGAACT 

TCCTCGTTCCTTTTGAAGAAACCAATGTCTTAACCTTTTTCTCTTCTTCTTCTTCCTCTT 

CTCTTTCTTCTCCTTCTTTCCCCATTCACT^ACTCTTCCTCCACTACTACTACTCATGCAC 

CTCTAGGGTTTTCTAATAATCTTCAGGGTGGAGGACCCTTGGGATCAAAGGTGGTTAATG 

ATGATCAGGAGAATTTTGGAGGTGGAACTAACAATGATGCTCATTCTAATTCTTGGTGGA 

GATCAAATAGTGGAAGTGGAGATATGAAGAACAAAGTGAAGATAAGGAGGAAACTAAGAG 

AGCCAAGATTCTGTTTCCAAACCAAAAGCGATGTTGATGTTCTTGACGATGGCTACAAAT

GGCGTAAATATGGTCAGAAAGTCGTCAAGAACAGCCTTCACCCCAGGAGTTATTACAGAT 

GCACACACAACAACTGTAGGGTGAAAAAGAGAGTGGAGCGACTATCGGAAGATTGTAGAA 

TGGTGATTACTACTTACGAAGGTCGTCACAACCACATTCCCTCTGATGACTCCACTTCTC 

CTGACCATGATTGTCTCTCTTCCTTTTAACATCTCTTTCTATATATCTATATATAGACAG 

TTATATGTGCACATATAGATGTGTGATATATTGCATATTTGATATTGCATGTGTTTTTCA 

AGAGTATGTCATCAGATGTTATGCATATATTCTTGACTTGTTGCTTATAGTATACATATG 

TAATAATATATATTGACATTGGTAGTTC^TTTCTGTTCAAACAAAAAAAAAATUUU^A 

>G180 Amino Acid Sequence (domain in AA coordinates: 118-174) 

MNFLVTFEETNVLTFFSSSSSSSLSSPSFPIHNSSSTTTTHAPLGFSNNLQGGGPLGSKV 

VNDDQENFGGGTNNDAHSNSWWRSNSGSGDMKNKVKIRRKLREPRFCFQTKSDVDVLDDG 

YKWRKYGQKVVKNSLHPRSYYRCTHNNCRVKKRVERLSEDCRMVITTYEGRHNHIPSDDS

TSPDHDCLSSF* 

>G192 (63..959)

CTTTTTTCTCTTCTCTCCTCAGAGATTCGAAGCTTTTTGTCTCCCCTGAGTAACCAAATT 
CAATGGCCGACGATTGGGATCTCCACGCCGTAGTCAGAGGCTGCTCAGCCGTAAGCTCAT 
CAGCTACTACCACCGTATATTCCCCCGGCGTTTCATCTCACACAAACCCTATATTCACCG 
TCGGACGACAAAGTAATGCCGTCTCCTTCGGAGAGATTCGAGATCTCTACACACCGTTCA 
CACAAGAATCTGTCGTCTCTTCGTTTTCTTGTAT^ 

CACAGAACCAGAAACGTCCTCTTTCTCTCTCTGCTTCTTCCGGTAGCGTCACTAGCAAAC 
CCAGTGGCTCCAATACCTCTAGATCTAAAAGAAGAAAGATACAGCATAAGAAAGTGTGCC 
ATGTAGCAGCAGAAGCTTTAAACTCCGATGTCTGGGCATGGCGAAAGTACGGACAGAAAC 
CCATCAAAGGTTCACCATATCCAAGAGGATACTACAGATGTAGTACATCAAAAGGTTGTT 
TAGCCCGTAAACAAGTGGAGCGAAATAGATCCGACCCGAAGATGTTTATCGTCACTTACA 
CGGCGGAGCATAATCATCCAGCTCCGACACACCGTAATTCTCTCGCCGGAAGCACACGTC 
AGAAACCATCCGATCAACAGACGAGTAAATCTCCGACGACCACTATTGCTACTTATTCAT 
CGTCTCCGGTGACTTCAGCCGACGAATTTGTTTTGCCTGTTGAGGATCATCTAGCGGTGG 
GAGATCTTGACGGAGAAGAAGATCTGTTATCTTTGTCGGATACGGTGGTTAGCGATGATT 
TCTTCGATGGGTTAGAGGAATTCGCAGCCGGAGATAGCTTTTCCGGGAACTCGGCTCCGG 
CGAGTTTTGATCTCTCTTGGGTTGTGAACAGTGCCGCCACTACCACCGGAGGAATATGAT 
TAGATTACGACGGCTTAGAATACTCTTATTAGGACAGATTTATAGGATTAAGGAATTATT 
CTCGGAGCATATGTAAAMTAGGATAAAAGAAT^TGTTCTTTGTTACTTTTTITCGGGTT 
TTCTTCCTATTGTTTCTAAACATCTTAGAAAAAATTTAATTGTATATTCCTTAAGCTCGA 
TACATCTTGTTTTAAAAAAAAAAAAAAAAAA 

>G192 Amino Acid Sequence (domain in AA coordinates: 128-185) 
MADDWDLHAVVRGCSAVSSSATTTVYSPGVSSHTNPIFTVGRQSNAVSFGEIRDLYTPFT

QESVVSSFSCINYPEEPRKPQNQKRPLSLSASSGSVTSKPSGSNTSRSKRRKIQHKKVCH

VAAEALNSDVWAWRKYGQKPIKGSPYPRGYYRCSTSKGCLARKQVERNRSDPKMFIVTYT 

AEHNHPAPTHRNSLAGSTRQKPSDQQTSKSPTTTIATYSSSPVTSADEFVLPVEDHLAVG 

DLDGEEDLLSLSDTVVSDDFFDGLEEFAAGDSFSGNSAPASFDLSWVVNSAATTTGGI*
>G1948 (18..1118)

AAAAGGTCTTCTTGGCCATGGATACTTGTGCTCTAGTAATCCATCAGTCTCTGTCTCGCA 

TCAAACTTTCTCCTCCCAAATCTTCTTCTTCTTCTTCTTCTGCTTTCTCCCCTGAATCCT 

TACCGATCAGACGGATCGAGCTGTGTTTCCGAGGAGCTATATGTGCCGCCGTACAAAGAA 

ACTACGAAGAAACGACCTCCTCCGTGGAAGAGGCAGAGGAAGATGATGAGTCATCATCAT

CGTACGGAGAAGTGAACAAGATCATTGGAAGCCGAACGGCGGGGGAAGGAGCCATGGAGT 

ACCTTATCGAGTGGAAGGACGGCCATTCTCCGTCGTGGGTTCCATCGAGCTACATCGCAG 

CAGACGTAGTGTCGGAGTACGAGACACCCTGGTGGACGGCAGCTAGAAAAGCCGACGAGC 

AGGCCCTGTCACAGCTCCTGGAGGACCGAGACGTCGATGCCGTGGACGAAAACGGCCGGA 

CGGCTCTGCTTTTCGTGGCAGGTCTGGGGTCGGACAAGTGCGTAAGGCTTCTGGCGGAGG 

CTGGAGCCGATCTCGACCACCGAGACATGAGGGGAGGCTTGACGGCGCTGCACATGGCGG 

CTGGTTACGTGAGGCCGGAGGTGGTGGAGGCGCTGGTGGAGCTGGGAGCTGATATTGAAG 

TGGAAGACGAGAGAGGGTTAACGGCGTTGGAACTAGCGAGGGAGATTCTGAAGACGACGC 

CGAAGGGGAATCCGATGCAGTTCGGGAGGAGAATTGGGTTAGAGAAAGTGATCAATGTCC 

TGGAAGGACAAGTGTTCGAGTACGCCGAGGTGGATGAGATCGTAGAGAAACGAGGGAAAG 

GCAAAGACGTTGAATATCTGGTCAGATGGAAGGACGGTGGAGATTGCGAGTGGGTGAAAG 

GTGTACACGTGGCGGAAGATGTGGCTAAGGACTACGAGGATGGGCTGGAGTACGCTGTAG 

CGGAGAGTGTGATCGGGAAGAGGGTGGGAGACGATGGGAAGACCATCGAGTATCTTGTCA 

AATGGACTGATATGTCTGATGCCACTTGGGAGCCTCAGGACAATGTCGACTCTACTCTTG 

TTCTACTCTACCAACAACAACAACCAATGAATGAATGATTGATTTTGATGATTACATTCT 

TCTCAATTTGCTTCTTTCTCATATGTGTTGGTTCATCTGACCGGTTCGGTTGGTACGTAC 

CGGTACATTrTCATTTTCTTTTAAGATGTGATCTTGATGGTTTTTGGCCTTTTGGGGACA 

CTATTTGATTTTATATCCATGCTTTGAATTTTGCTTCCCTTTTTGGGGAGATTCATGAAA 

>G1948 Amino Acid Sequence (domain in AA coordinates: entire protein)

MDTCALVIHQSLSRIKLSPPKSSSSSSSAFSPESLPIRRIELCFRGAICAAVQRNYEETT

SSVEEAEEDDESSSSYGEVNKIIGSRTAGEGAMEYLIEWKDGHSPSWVPSSYIAADVVSE

YETPWWTAARKADEQALSQLLEDRDVDAVDENGRTALLFVAGLGSDKCVRLLAEAGADLD

HRDMRGGLTALHMAAGYVRPEVVEALVELGADIEVEDERGLTALELAREILKTTPKGNPM

QFGRRIGLEKVINVLEGQVFEYAEVDEIVEKRGKGKDVEYLVRWKDGGDCEWVKGVHVAE

DVAKDYEDGLEYAVAESVIGKRVGDDGKTIEYLVKWTDMSDATWEPQDNVDSTLVLLYQQ 
QQPMNE* 

>G2123 (1..657) 

ATGAGAAAAGTATGTGAGCTTGATATAGAGCTAAGTGAAGAGGAAAGAGACCTACTAACA 

ACTGGATACAAGAATGTCATGGAGGCTAAGAGAGTTTCATTGAGAGTAATATCATCCATT 

GAAAAAATGGAAGACTCGAAAGGAAACGACCAAAATGTGAAACTGATAAAAGGACAACAA 

GAAATGGTTAAATATGAGTTTTTCAATGTTTGTAATGAC^TTTTGTCTC 

CATCTCATACCATCAACTACTACTAATGTCGAATCAATTGTCCTTTTTAACAGAGTGAAA 

GGAGATTATTTTCGATATATGGCAGAGTTTGGTTCTGATGCTGAACGTAAAGAAAATGCA

GATAATTCTCTAGATGCATATAAGGTTGCAATGGAAATGGC^GAGAATAGTTTAGCACCC 

ACCAATATGGTTAGACTTGGATTGGCTTTAAATTTCTCGATATTCAATTATGAGATCCAT 

AAATCTATTGAAAGCGCATGTAAATTGGTTAAGAAAGCTTACGATGAAGCAATCACTGAA 

CTCGATGGCCTTGACAAGAATATATGCGAAGAGAGCATGTATATCATAGAGATGCTTAAA 

TACAATCTTTCTACGTGGACTTCAGGCGATGGTAATGGTAATAAGACAGACGGTTAG 

>G2123 Amino Acid Sequence (domain in AA coordinates: 99-109)

MRKVCELDIELSEEERDLLTTGYKNVMEAKRVSLRVISSIEKMEDSKGNDQNVKLIKGQQ

EMVKYEFFNVCNDILSLIDSHLIPSTTTNVESIVLFNRVKGDYFRYMAEFGSDAERKENA

DNSLDAYKVAMEMAENSLAPTNMVRLGLALNFSIFNYEIHKSIESACKLVKKAYDEAITE
LDGLDKNICEESMYIIEMLKYNLSTWTSGDGNGNKTDG*
>G2138 (27..512)

GGAACCCTAATTTCCGCAAATTCACTATGAAGCGTATTATCAGAATCTCATTCACCGACG 
CAGAAGCCACCGATTCTTCTAGCGACGAAGACACGGAGGAGCGTGGAGGAGCATCCCAGA 
CTCGGCGCCGTGGGAAACGCCTCGTTAAAGAGATCGTAATCGATCCTTCCGATTCCGCCG 
ATAAACTCGATGTCTGCAAAACACGGTTCAAAATCAGGATCCCGGCGGAATTTCTCAAGA

CGGCGAAAACGGAGAAGAAATATCGTGGAGTGAGGCAGAGGCCGTGGGGGAAGTGGGTGG 

CGGAGATCAGATGTGGAAGAGGAGCTTGTAAAGGACGACGTGATCGTCTCTGGCTGGGTA 

CTTTTAACACTGCTGAGGAAGCTGCTCTAGCTTATGATAACGCTTCAATTAAGCTGATTG 

GACCTCACGCGCCGACCAATTTTGGTTTGCCGGCGGAGAATCAAGAGGATAAGACGGTGA 

TTGGAGCTTCTGAGGTTGCTAGAGGCGCGTGAAGTGGGGTTGGTAATTTAGTTGTTAGC 

>G2138 Amino Acid Sequence (domain in AA coordinates: TBD) 

MKRIIRISFTDAEATDSSSDEDTEERGGASQTRRRGKRLVKEIVIDPSDSADKLDVCKTR

FKIRIPAEFLKTAKTEKKYRGVRQRPWGKWVAEIRCGRGACKGRRDRLWLGTFNTAEEAA 

LAYDNASIKLIGPHAPTNFGLPAENQEDKTVIGASEVARGA* 

>G2139 (40..663)

CCTACAAGAAATCAAACACTAGTTCTGGTTTCTGCAAACATGTCATCTACGAAGCAAGCA
AAGGGAAGAAAAACAAAGGGGAAGCAAAAGATCGAGATGAAGAAGGTGGAGAACTATGGA 
GATAGGATGATTACGTTCTCAAAACGTAAAACCGGAATTTTTAAGAAAATGAACGAGCTC
GTAGCAATGTGTGACGTTGAAGTGGCTTTCTTGATTTTCTCTCAACCCAAGAAGCCCTAT
ACATTCGCACATCCGTCTATGAAGAAAGTGGCTGACCGGTTAAAGAACCCTTCGAGACAA 
GAACCATTAGAGAGAGACGATACCAGACCCCTCGTCGAAGCTTATAAGAAACGAAGGCTC 
CACGACCTCGTAAAAAAAATGGAGGCGCTCGAAGAGGAGCTTGCGATGGATCTAGAGAAG 
TTGAAACTGTTGAAGGAATCGAGAAATGAAAAGAAGTTAGATAAAATGTGGTGGAACTTT 
CCTTCGGAAGGTTTGAGCGCGAAGGAGCTGCAGCAAAGGTACCAAGCGATGCTCGAGTTA
CGTGATAACTTATGCGACAATATGGCTCACTTACGATTGGGAAAAGACTGTGGTGGTTCA 
TCTTCTGTTCGTGTGGGACGTCGAGTTTCTGGTGGTGTTCGTCTGTTCGATCGTGAAGCA 
TGATCATACATATTCATACTTGATGATTTAAATTTCTTTGTATTTGAACTGCTGATTTTA 
ATACTGCATGTATCCATTTGACGAAGCTCAATCGTCTCGAGTATATCTCTATTATCTAAC 
AGTATTGAGAAAAAAGGAGTTTCAGTAAAAAAAAAAAAAAAAAAAAAA 

>G2139 Amino Acid Sequence (conserved domain in AA coordinates: 14-69)

MSSTKQAKGRKTKGKQKIEMKKVENYGDRMITFSKRKTGIFKKMTELVAMCDVEVAFLIF

SQPKKPYTFAHPSMKKVADRLKNPSRQEPLERDDTRPLVEAYKKRRLHDLVKKMEALEEE 

LAMDLEKLKLLKESRNEKKLDKMWWNFPSEGLSAKELQQRYQAMLELRDNLCDNMAHLRL

GKDCGGSSSVRVGRRVSGGVRLFDREA* 

>G2343 (1..1113)

ATGGGTCATCACTCATGCTGCAACCAGCAAAAGGTGAAGAGAGGGCTTTGGTCACCGGAA 
GAAGATGAGAAGCTTATTAGATATATCACAACTCATGGCTATGGATGTTGGAGTGAAGTC 
CCTGAAAAAGCAGGGCTTCAAAGATGTGGAAAAAGTTGTAGATTGCGATGGATAAACTAT
CTTCGACCTGATATCAGGAGAGGAAGGTTCTCTCCAGAAGAAGAGAAATTGATCATAAGC 
CTTCATGGAGTTGTGGGTAACAGGTGGGCTCATATAGCTAGTCATTTACCGGGAAGAACA
GATAACGAGATTAAAAACTATTGGAATTCATGGATTAAGAAAAAGATACGAAAACCGCAC 
CATCATTACAGTCGTCATCAACCGTCAGTAACTACTGTGACATTGAATGCGGACACTACA 
TCGATTGCCACTACCATCGAGGCCTCTACCACCACAACATCGACTATCGATAACTTACAT 
TTTGACGGTTTCACTGATTCTCCTAACCAATTAAATTTCACCAATGATCAAGAAACTAAT 
ATAAAGATTCAAGAAACTTTTTTCTCCCATAAACCTCCTCTCTTCATGGTAGACACAACA 
CTTCCTATCCTAGAAGGAATGTTCTCTGAAAACATCATCACAAACAATAACAAGAACAAT 
GATCATGATGACACGCAAAGAGGAGGAAGAGAAAATGTTTGTGAACAAGCATTTCTAACA 
ACTAACACGGAAGAATGGGATATGAATCTTCGTCAGCAAGAGCCGTTTCAAGTTCCTACA 
CTGGCGTCACATGTGTTCAA(^U^CTCTTCCAATTCAAATATTGACACGGTTATAAGTTAT 
AATCTACCGGCGCTAATAGAGGGAAATGTCGATAACATCGTCCATT^ATGAAAACAGCAAT 
GTCCAAGATGGAGAAATGGCGTCCACATTCGAATGTTTAAAGAGGCAAGAACTAAGCTAT 
GATCAATGGGACGAOTCACAACAATGCTCTAACTTTTTCTTTTGGGACAACCTTAATATA 
AACGTGGAAGGTTCATCTCTTGTTGGAAACCAAGACCCATCAATGAATTTGGGATCATCT 
GCCTTATCTTCTTCTTTCCCTTCTTCGTTTTAA 

>G2343 Amino Acid Sequence (domain in AA coordinates: 14-116) 
MGHHSCCNQQKVKRGLWSPEEDEKLIRYITTHGYGCWSEVPEKAGLQRCGKSCRLRWINY 
LRPDIRRGRFSPEEEKLIISLHGVVGNRWAHIASHLPGRTDNEIKNYWNSWIKKKIRKPH
HHYSRHQPSVTTVTLNADTTSIATTIEASTTTTSTIDNLHFDGFTDSPNQLNFTNDQETN

IKIQETFFSHKPPLFMVDTTLPILEGMFSENIITNNNKNNDHDDTQRGGRENVCEQAFLT
TNTEEWDMNLRQQEPFQVPTLASHVFNNSSNSNIDTVISYNLPALIEGNVDNIVHYENSN

VQDGEMASTFECLKRQELSYDQWDDSQQCSNFFFWDNLNINVEGSSLVGNQDPSMNIiGSS 
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ALSSSFPSSF* 
>G265 (280..1317)

CTTTGGTCTTGGAAGCCAAATCAAACCTTTCCTTCAATCCTCAAATTTTCGAAAATTTTC 
TCTTTTGCTTTACGTTCTCTCAATTCTTATTTGTAAGAAAGTTTGTTCCTTTAATCAATC 
AAATCAAAGAGACTTTTGAAGATTGTTTCCCAATTTGCGTCAATCGGGATCGAGTCAAAT 
CTGAAATCTTCTCCACTCATCATCTGACTATAAGACTTAATCAAGGGACTTTTTGTTCGG 
GTTTGGTTTTAAACGTCTTGGATTCGAAGTGGTTAAGGTATGGATGAAAATAATGGAGGT 
TCAAGCTCACTTCCACCTTTCCTTACTAAAACATATGAAATGGTTGATGATTCTTCTTCT 
GACTCGGTCGTTGCTTGGAGCGAAAACAACAAAAGCTTCATCGTCAAGAATCCAGCAGAG 
TTTTCAAGAGACCTTCTTCCGAGATTCTTCAAGCATAAGAATTTCTCAAGTTTCATCCGT 
CAGCTTAATACATATGGTTTTCGAAAAGTAGATCCTGAGAAATGGGAATTCTTGAATGAT 
GATTTTGTTAGAGGTCGACCTTACCTTATGAAGAACATTCATAGACGAAAACCGGTTCAT 
AGCCACTCGTTAGTGAATCTACAAGCGCAAAATCCTTTGACGGAATCAGAAAGACGGAGC 
ATGGAGGATCAGATAGAAAGACTGAAAAATGAGAAAGAAGGCCTTCTTGCGGAGTTACAG 
AACCAAGAGCAAGAACGGAAAGAGTTTGAGCTGCAAGTAACGACATTGAAAGATCGGTTA 
CAACATATGGAACAACATCAGAAATCAATAGTGGCATATGTTTCACAGGTTTTGGGAAAA 
CCAGGACTTTCACTAAACCTCGAAAACCATGAGAGAAGAAAAAGAAGATTTCAAGAGAAC 
TCTCTTCCTCCAAGCAGTTCACACATAGAACAGGTCGAAAAGTTAGAATCTTCGCTAACG 
TTTTGGGAGAATCTTGTATCGGAATCATGCGAGAAGAGCGGTTTGCAGTCATCAAGCATG 
GATCATGATGCAGCTGAGTCAAGTCTAAGTATTGGCGATACACGACCCAAATCATCGAAG 
ATTGATATGAACTCAGAGCCGCCCGTTACCGTTACTGCGCCTGCTCCAAAAACAGGCGTT 
AACGATGACTTTTGGGAACAATGTTTGACAGAGAACCCTGGATCAACCGAGCAACAAGAA 
GTTCAGTCAGAGAGAAGAGATGTCGGTAATGATAATAATGGTAATAAGATTGGAAATCAA
AGGACGTATTGGTGGAATTCAGGGAATGTAAATAACATTACAGAGAAAGCTTCTTGACAT 
GAATGAGGTTTTTGTAAAATAGTTTTCTTTTGGTTCCACTGAGATTATTGTATGTGTTCA 
TTATTTATTACTCTGTTTCTGTAAAAACAAATCTCTCTATTGTTTGAGGCAGGAGTGACA 
TAAATGCATATGCAGAATTGGTTTCAAAAA 

>G265 Amino Acid Sequence (domain in AA coordinates: 11-105) 

MDENNGGSSSLPPFLTKTYEMVDDSSSDSVVAWSENNKSFIVKNPAEFSRDLLPRFFKHK

NFSSFIRQLNTYGFRKVDPEKWEFLNDDFVRGRPYLMKNIHRRKPVHSHSLVNLQAQNPL

TESERRSMEDQIERLKNEKEGLLAELQNQEQERKEFELQVTTLKDRLQHMEQHQKSIVAY 

VSQVLGKPGLSLNLENHERRKRRFQENSLPPSSSHIEQVEKLESSLTFWENLVSESCEKS 

GLQSSSMDHDAAESSLSIGDTRPKSSKIDMNSEPPVTVTAPAPKTGVNDDFWEQCLTENP

GSTEQQEVQSERRDVGNDNNGNKIGNQRTYWWNSGNVNNITEKAS*
>G2792 (1..960) 

ATGGATCATCATCATCACATAGCATCAAGAAATTCATCAACAACATCAGAATTACCATCA 
TTCGAGCCAGCGTGCCATAACGGTAATGGTAACGGTTGGATCTATGACCCAAATCAAGTT 
AGGTACGATCAAAGTAGTGACCAACGGCTGTCAAAGTTGACGGATCTTGTAGGCAAGCAC 
TGGTCAATTGCACCACCGAATAATCCCGA(^TGAACGA^ 

CATGATCATTCTCAAAACGACGACATTTCTATGTACAGACAAGCCTTGGAGGTGAAAAAT 
GAGGAAGATCTTTGTTACAATAATGGCTCAAGTGGTGGTGGTTCCTTGTTCCATGATCCT 
ATAGAAAGTTCTAGAAGTTTCCTTGATATAAGGTTAAGTAGGCCATTAACGGATATTAAT 
CCGTCATTTAAGCCATGCTTTAAGGCCTTAAACGTATCCGAGTTTAACAAGAAAGAACAT 
CAAACGGCATCTCTGGCAGCAGTGAGACTGGGAACAACAAACGCTGGAAAAAAGAAGAGA 
TGTGAAGAAATTTCCGATGAGGTTTCAAAGAAGGCCAAGTGCAGTGAGGGCTCTACACTT 
TCGCCAGAGAAGGAACTACCCAAAGCCAAAOTCGAGACAAGATCACGACTCTAC^GCAA 
ATTGTGTCTCCCTTTGGAAAGACTGATACTGCTTCTGTGCTTCAAGAGGCCATCACTTAC 
ATAAATTTTTATCAAGAGCAAGTTAAGCTGCTAAGCACTC 

ATGAAGGATCCATGGGGGGGATGGGACAGAGAAGATCACAACAAAAGGGGACCGAAGCAT 
CTTGATCTAAGGAGTAGAGGGCTTTGTTTGGTTCCTATTTCATATACCCCAATCGCATAC 
CGCGATAACAGTGCAACTGACTACTGGAATCCCACGTATAGAGGTTCTTTGTATCGTTAG
>G2792 Amino Acid Sequence (domain in AA coordinates: 190-258)

MDHHHHIASRNSSTTSELPSFEPACHNGNGNGWIYDPNQVRYDQSSDQRLSKLTDLVGKH 

WSIAPPNNPDMNHNLHHHFDHDHSQNDDISMYRQALEVKNEEDLCYNNGSSGGGSLFHDP

IESSRSFLDIRLSRPLTDINPSFKPCFKALNVSEFNKKEHQTASLAAVRLGTTNAGKKKR 
CEEISDEVSKKAKCSEGSTLSPEKELPKAKLRDKITTLQQIVSPFGKTDTASVLQEAITY 
INFYQEQVKLLSTPYMKNSSMKDPWGGWDREDHNKRGPKHLDLRSRGLCLVPISYTPIAY
RDNSATDYWNPTYRGSLYR*
>G2830 (1..903) 

ATGTCTTCCATCCCAAATAGGTTCAATATTTATGGTGGTGATACCACAAACCATCGTGAA 
TCGCTTCCCATCGAAATGAATCACAACTCTCGAATGGTTCGATCCATGTTCATTACATCT 
GATCGCATGAATCATAGAGATTTGTTTTCTTCTCCTCCTTCTTTCTCTTCTTATCAAAAT 
TCACATATCTCTTCATCTTCTGTTGGGTTTAATAATTCACATATGACTTATCATATGCTG 
AAAAGAAATTATGATTCTGTTTCCCGTGCTGATTATTTCTCTACTAAAGATCATTCTCAT 
TTTACTCAAGTATCTTTCACTCAAACCATCACAAATAAGTATACTACTATTGTTCCTTCC 
AATATATTTGACACTGTTCACTATGATATTGGTCGTGTCAAACGTGCCATAGATTTTAGA 
AATATTTGGAATCCTAAATCTCATCTTCCAA7VAAMTTTAATAGGCAATGCGAGATTTTG 
AATCCTACCCCTCTTAATATCGTCTTTCCGCACCAGGATTCAGCTGATCGTCAACATTTA
GACATTATTTTCTCGTCATCAAAGCACAACCATGTTTTCCAAGATGGTCGATCCTTGAAG 
AAAATTTCCGAACCAACCAATCTGTTTGAAAAATCTAATTCTTATGATTCTCAAGAAGAT
GAGAAAATCGATGCTTATCAATATGATGGTCGTACACATAGTCTACCGTATACGAAATAC 
GGTCCATATACATGTCCCAGGTGTAACGGTGTGTTTGATACTTCTCAAAAATTTGCTGCA 
CATATGTTATCTCACTACAATAATGAGACGGACAAAGAAAGAGACCAAAGATTTCGTGCA 
AGAAATAAAAAACGATATCGTAAGTTTATGGACAGTCTTAAAATATCAAAACAGAAGATA 
TGA 

>G2830 Amino Acid Sequence (domain in AA coordinates: 245-266)

MSSIPNRFNIYGGDTTNHRESLPIEMNHNSRMVRSMFITSDRMNHRDLFSSPPSFSSYQN

SHISSSSVGFNNSHMTYHMLKRNYDSVSRADYFSTKDHSHFTQVSFTQTITNKYTTIVPS

NIFDTVHYDIGRVKRAIDFRNIWNPKSHLPKKFNRQCEILNPTPLNIVFPHQDSADRQHL 

DIIFSSSKHNHVFQDGRSLKKISEPTNLFEKSNSYDSQEDEKIDAYQYDGRTHSLPYTKY 

GPYTCPRCNGVFDTSQKFAAHMLSHYNNETDKERDQRFRARNKKRYRKFMDSLKISKQKI

* 

>G286 (94..2454)

TGCAATTTCTCTCGACCAAAACCCTAATTTCAGGTTTGGGGTTTTCCTTCTTTCACTGTC 
AATTTTGATGAAACTTGTGATTCAGTGATTAGAATGAATGCTAATGAGCAAACTCGATCC
GCCAATGGCATTGGCAATGGCAATGGTGAGTCTATTCCCGGGATTCCAGATGACTTACGG 
TGCAAGAGATCGGATGGTAAACAGTGGAGATGCACTGCAATGTCCATGGCTGATAAGACT 
GTTTGTGAGAAGCACTACATCCAAGCAAAGAAGCGGGCGGCTAATTCTGCTTTCAGGGCG 
AACCAGAAGAAAGCGAAAAGGCGATCATCGTTAGGCGAAACAGATACGTATTCGGAAGGG 
AAGATGGATGATTTCGAGTTACCAGTCACCAGCATTGACCACTATAATAACGGTCTTGCC 
TCTGCTTCCAAGAGTAATGGTAGACTAGAGAAGAGACATAATAAAAGCCTGATGCGGTAC 
TCGCCCGAGACACCGATGATGAGGAGTTTCTCTCCACGTGTTGCAGTGGATTTGAATGAT 
GACTTGGGTAGAGATGTTGTAATGTTTGAAGAGGGCTACAGATCTTATAGGACACCACCA
TCTGTTGCTGTTATGGATCCGACACGAAACAGATCACACCAAAGCACCAGTCCTATGGAA
TACTCAGCAGCAAGCACAGATGTGTCTGCAGAGTCTTTGGGGGAAATCTGCCATCAATGC 
CAGAGAAAAGATAGAGAGAGAATCATTTCTTGCCTCAAATGCAATCAAAGAGCCTTCTGC
CACAATTGTCTATCGGCAAGGTACTCGGAGATATCACTTGAAGAAGTCGAGAAAGTTTGC 
CCTGCATGTCGTGGCTTGTGTGATTGCAAATCTTGCCTGCGTTCAGATAATACAATAAAG 
GTTCGGATCCGGGAAATACCCGTTTTGGACAAGTTGCAGTATCTTTATCGTCTATTATCA 
GCTGTCCTACCAGTCATAAAGCAGATCCATCTTGAACAATGTATGGAAGTTGAACTAGAG 
AAGAGGCTTCTTGAAGTTGAGATTGATCTTGTCAGGGC7^AGATTGAAAGCAGATGAGCAG 
ATGTGCTGCAACGTGTGTCGGATACCAGTTGTTGACTACTACCGTCACTGTCCGAACTGC 
TCATATGACCTTTGCCTGAGATGCTGTCAAGATCTACGGGAAGAGTCTTCAGTGACGATT 
AGTGGGACTAACCAAAACGTACAAGATAGAAAAGGAGCTCCCAAACTAAAACTAAACTTT 
TCATACAAGTTTCGTGAGTGGGAAGCCAACGGTGATGGGAGCATCCCTTGCCCTCCTAAG 
GAGTATGGAGGCTGCGGTTCACATTCTTTGAATCTTGCCCGCATTTTCAAGATGAATTGG 
GTTGCAAAGCTTGTGAAAAATGCTGAGGAGATTGTTAGTGGCTGCAAATTATCTGATCTT 
CTGAACCCTGATATGTGTGATTCAAGATTCTGCAAATTTGCTGAGAGAGAAGAGAGCGGT 
GACAACTACGTGTACAGCCCGTCGCTTGAAACGATTAAAACTGATGGAGTAGCTAAGTTT 
GAGCAACAATGGGC^GAGGGTCGGCTTGTTACTGTGAAAATGGXACTTGATGACTCATCT 
TGCTCTAGATGGGATCCTGAGACTATTTGGAGGGATATAGACGAGCTTTCGGACGAGAAA 
CTGAGAGAACATGATCCATTCTTGAAGGCCATTAATTGCTTGGATGGTTTAGAGGTTGAT 
GTAAGACTTGGGGAGTTTACAAGAGCATATAAAGATGGAAAGAACCAAGAGACAGGTCTT 
CCGCTATTGTGGAAGTTAAAGGACTGGCCGAGCCCAAGTGCTTCCGAGGAGTTCATTTTC 
TACCAAAGACCTGAGTTTATCAGAAGTTTTCCGTTTCTCGAGTACATTCATCCCCGGTTA
GGCCTTCTGAATGTTGCAGCCAAGTTACCTCATTACTCGCTCCAAAACGATTCAGGTCCA 
AAGATTTATGTGTCTTGTGGGACGTACCAAGAAATCAGTGCTGGCGATTCATTGACTGGT 
ATTCACTACAACATGCGTGACATGGTATACCTATTGGTGCACACGTCTGAAGAAACAACA 
TTCGAAAGGGTGAGAAAAACAAAACCTGTTCCAGAGGAACCTGACCAGAAGATGAGCGAA 
AATGAGTCACTTCTTAGCCCTGAGCAGAAATTAAGGGACGGAGAGTTACATGATCTATCA 
CTTGGTGAAGCCAGTATGGAGAAGAATGAACCTGAGTTGGCGTTGACTGTGAATCCAGAG
AACTTAACGGAAAACGGTGACAACATGGAATCTTCTTGCACATCTTCATGTGCAGGAGGA 
GCCCAGTGGGATGTCTTTCGACGCCAAGACGTCCCAAAGTTGTCCGGGTATTTGCAGAGA 
ACATTCCAGAAGCCTGATAATATCCAGACTGATTTTGTAAGCCGTACCTGCTAATTCAAA 
TAAATGAAGTGTGTAAAGTCTTGTATGTGGAATGATTGAGTTTCCTAGTTTGTTTACTCT 
GGTTTCAGGTGTCACGCCCGTTGTATGAAGGATTGTCTTTAAATGAACACCACAAGAGAC 
AACTAAGAGACGAGTTTGGAGTTGAGCCATGGACATTTGAGCAACATCGTGGTGAGGCTA 
TCTTCATTCCGGCTGGATGTCCGTTCCAAATCACTAATCTTCAGTCGAATATTCAGGTGG 
CACTTGACTTCTTGTGCCCTGAAAGCGTTGGAGAGTCAGCAAGACTAGCTGAAGAAATCC
GGTGTTTACCAAACGACCACGAGGCAAAACTTCAGATTCTAGAGATTGGAAAGATATCAT 
TATACGCAGCTAGCTCAGCCATTAAAGAGGTTCAGAAACTGGTCTTGGATCCAAAGTTTG 

GAGCAGAGCTTGGATTTGAAGACTCTAACTTAACCAAAGCAGTCTCTCACAACTTAGACG 
AGGCAACCAAGCGGCC 

>G286 Amino Acid Sequence (domain in AA coordinates: TBD) 

MNANEQTRSANGIGNGNGESIPGIPDDLRCKRSDGKQWRCTAMSMADKTVCEKHYIQAKK 

RAANSAFRANQKKAKRRSSLGETDTYSEGKMDDFELPVTSIDHYNNGLASASKSNGRLEK 

RHNKSLMRYSPETPMMRSFSPRVAVDLNDDLGRDVVMFEEGYRSYRTPPSVAVMDPTRNR 

SHQSTSPMEYSAASTDVSAESLGEICHQCQRKDRERIISCLKCNQRAFCHNCLSARYSEI

SLEEVEKVCPACRGLCDCKSCLRSDNTIKVRIREIPVLDKLQYLYRLLSAVLPVIKQIHL 

EQCMEVELEKRLLEVEIDLVRARLKADEQMCCNVCRIPVVDYYRHCPNCSYDLCLRCCQD

LREESSVTISGTNQNVQDRKGAPKLKLNFSYKFREWEANGDGSIPCPPKEYGGCGSHSLN

LARIFKMNWVAKLVKNAEEIVSGCKLSDLLNPDMCDSRFCKFAEREESGDNYVYSPSLET

IKTDGVAKFEQQWAEGRLVWKIWLDDSSCSRWDPETIWRDIDELSDEKLREHDPFLKAI 

NCLDGLEVDVRLGEFTRAYKDGKNQETGLPLLWKLKDWPSPSASEEFIFYQRPEFIRSFP 

FLEYIHPRLGLLNVAAKLPHYSLQNDSGPKIYVSCGTYQEISAGDSLTGIHYNMRDMVYL 

LVHTSEETTFERVRKTKPVPEEPDQKMSENESLLSPEQKLRDGELHDLSLGEASMEKNEP 

ELALTVNPENLTENGDNMESSCTSSCAGGAQWDVFRRQDVPKLSGYLQRTFQKPDNIQTD

>G291 (124..1197)

CAAGAACCCAAAGATCTCTCTCTATTTGTTTGCCTTCTTCTTTCTTTCTGACTCAAACCC 
TCAAATC^TTCTCGCGATTAAGCAAAACCCTAGATTTATTCTACTCTTCGAAGTCGATT 
TCAATGGAAGGTTCCTCGTCAGCCATCGCGAGGAAGAGATGGGAGCTAGAGAACAACATT 
CTCCCAGTGGAACCAACCGATTCAGCCTCCGACAGTATATTCCACTACGACGACGCTTCA 
CAAGCCAAAATCCAGCAGGAGAAGCCATGGGCCTCCGATCCTAACTACTTCAAGCGCGTT 
CACATCTCAGCCCTTGCTCTTCTCAAGATGGTGGTTCACGCTCGCTCCGGTGGCACAATC 
GAGATCATGGGTCTTATGCAGGGTAAAACCGAGGGTGATACAATCATCGTTATGGATGCT 
TTTGCTTTGCCTGTTGAAGGTACTGAGACTAGGGTTAATGCTCAGTCTGATGCCTATGAG 
TATATGGTTGAATACTCTCAGACCAGCAAGCTGGCTGGGAGGTTGGAGAACGTTGTTGGA 
TGGTATCACTCTCACCCTGGGTATGGATGTTGGCTCTCGGGTATTGATGTTTCGACACAG
ATGCTTAACCAACAGTATCAGGAGCCATTCTTAGCTGTTGTTATTGATCCAACAAGGACT 
GTTTCGGCTGGTAAGGTTGAGATTGGGGCATTCAGAACATATCCAGAGGGACATAAGATC 
TCGGATGATCATGTTTCTGAGTATCAGACTATCCCTCTTAACAAGATTGAGGACTTTGGT 
GTACATTGCAAACAGTACTACTCATTGGACATCACT 

CACCTTCTGGATCTCCTTTGGAACAAGTACTGGGTGAACACTCTTTCTTCTTCCCCA

TTGGGCAATGGAGACTATGTTGCCGGGCAAATATCAGACTTGGCTGAGAAGCTCGAGCAA 

GCGGAGAGTCAGCTCGCTAACTCCCGGTATGGAGGAATTGCGCCAGCCGGTCACCAAAGG 

AGGAAAGAGGATGAGCCTCAACTCGCGAAGATAACTCGGGATAGTGCAAAGATAACTGTC 

GAGCAGGTCCATGGACTAATGTCACAGGTTATCAAAGACATCTTGTTCAATTCCGCTCGT 

CAGTCCAAGAAGTCTGCTGACGACTCATCAGATCCAGAGCCCATGATTACATCGTGAAGT 

TGGTCTATTCTTTTGTTTTTTGGCTGCGGAAATTGACTATCGGTTTGACCCGGTTTATGA 

GGCAATGCCCATTGTTCCCTATATCTCTAGTGTAGTATCTGCTTCAGACAAAGATCTTTG 
GGTTATTAAATGACATTAACATAAAAAAAA

>G291 Amino Acid Sequence (domain in AA coordinates: 132-160) 

MEGSSSAIARKRWELENNILPVEPTDSASDSIFHYDDASQAKIQQEKPWASDPNYFKRVH

ISALALLKMVVHARSGGTIEIMGLMQGKTEGDTIIVMDAFALPVEGTETRVNAQSDAYEY

MVEYSQTSKLAGRLENVVGWYHSHPGYGCWLSGIDVSTQMLNQQYQEPFLAVVIDPTRTV 

SAGKVEIGAFRTYPEGHKISDDHVSEYQTIPLNKIEDFGVHCKQYYSLDITYFKSSLDSH

LLDLLWNKYWVNTLSSSPLLGNGDYVAGQISDLAEKLEQAESQLANSRYGGIAPAGHQRR 

KEDEPQLAKITRDSAKITVEQVHGLMSQVIKDILFNSARQSKKSADDSSDPEPMITS*

>G427 (49..1230)

TTTCCCTCTCCGAAACAGAAATTCAAAAACAAATTCAACACGAAAACGATGGCGTTTCAT 

AACAATCACTTTAATCATTTCACCGACCAACAACAACATCAGCCTCCTCCTCCGCCGCAA

CAGCAGCAGCAACAACATTTTCAAGAATCAGCACCCCCTAATTGGCTCCTCCGCTCCGAC

AACAACTTCCTCAATCTCCACACAGCTGCCACAGCCGCCGCTACAAGCTCCGATTCTCCT 

TCTTCCGCCGCCGCTAACCAGTGGCTCTCACGATCCTCATCCTTCCTCCAACGAGGCAAC 

ACCGCAAACAACAACAACAACGAAACATCCGGTGACGTCATCGAAGACGTTCCCGGCGGA 

GAGGAGTCAATGATCGGAGAGAAGAAGGAGGCGGAGAGGTGGCAGAATGCGAGACACAAG 

GCGGAGATACTGTCTCATCCACTATACGAGCAACTTTTGTCGGCACACGTGGCGTGCCTG 

AGGATCGCAACGCCGGTGGATCAGCTTCCGAGGATAGACGCACAGCTTGCTCAGTCTCAA 

AACGTCGTGGCTAAGTACTCAACTTTAGAAGCCGCTCAAGGACTCCTCGCCGGCGATGAC

AAGGAGCTTGACCACTTCATGACGCATTATGTACTATTGCTTTGCTCTTTCAAAGAACAA 

CTGCAACAGCATGTTCGTGTTCATGCAATGGAAGCTGTTATGGCCTGTTGGGAGATTGAA 

CAGTCGCTTCAAAGTTTTACAGGAGTATCTCCTGGTGAAGGCACAGGAGCAACAATGTCT 

GAGGATGAAGATGAGCAAGTAGAGAGTGATGCTCATTTGTTTGATGGAAGCTTAGATGGG 

TTAGGGTTTGGTCCTCTAGTTCCCACTGAGAGCGAGAGATCTTTGATGGAACGAGTCAGA 

CAAGAACTCAAACATGAACTCAAGCAGGGTTACAAGGAGAAAATTGTGGACATAAGAGAG 

GAGATACTGAGGAAGAGAAGAGCTGGAAAATTACCAGGAGACACCACCTCTGTTCTCAAA 

TCATGGTGGCAATCTCATTCTAAGTGGCCTTACCCTACTGAGGAAGATAAGGCGAGGTTG 

GTGCAGGAGACGGGTTTGCAGCTCAAACAGATAAACAATTGGTTCATCAATCAAAGAAAG 

AGGAATTGGCATAGCAATCCATCTTCTTCTACCGTCTCAAAGAATAAACGCCGAAGCAAT

GCAGGTGAAAACAGCGGAAGAGACCGTTGAGATCAAGCTTGCATGTAGAGATCCAAAAGC 

TTTATAGAAAGGTGGAGGCATGAAGACAAAGAATTCTTACACAACAAACGTAGGACGTT^A 

TTTTGTGCCAGTACATGGTATGGCTCTCATATTTGGTAATGATTAGGGCCACACAAAATT 

AAACCCCAAAGCATGATTTGTAATATGAGGTTTTAGATGGACTTTATGATAGGATCGTCA 

GTCTTCACTGCCATCTCCATTCTCCACCATCAATCCATCATTATATCTTGTGAAAAAAAA 

A 

>G427 Amino Acid Sequence (domain in AA coordinates: 307-370)

MAFHNNHFNHFTDQQQHQPPPPPQQQQQQHFQESAPPNWLLRSDNNFLNLHTAATAAATS

SDSPSSAAANQWLSRSSSFLQRGNTANNNNNETSGDVIEDVPGGEESMIGEKKEAERWQN

ARHKAEILSHPLYEQLLSAHVACLRIATPVDQLPRIDAQLAQSQNVVAKYSTLEAAQGLL

AGDDKELDHFMTHYVLLLCSFKEQLQQHVRVHAMEAVMACWEIEQSLQSFTGVSPGEGTG

ATMSEDEDEQVESDAHLFDGSLDGLGFGPLVPTESERSLMERVRQELKHELKQGYKEKIV

DIREEILRKRRAGKLPGDTTSVLKSWWQSHSKWPYPTEEDKARLVQETGLQLKQINNWFI

NQRKRNWHSNPSSSTVSKNKRRSNAGENSGRDR*

>G509 (122..1054)

CTTCCTCCTTTGCTAATAAACTTTTCTTTGAACCTTACACGCCTTGTTGATATTACTCTC 
TTAAATATATATTTTCGTACATTAACACAGACATATATAAAGCTAAAGATTTCTTCACGT 
AATGGGTTTGAAAGATATTGGGTCCAAATTGCCACCGGGGTTTCGATTTCATCCAAGTGA 
TGAAGAGTTGGTTTGTCATTATCTTTGCAACAAGATTAGGGCCAAATCTGATCATGGTGA 
TGTTGATGATGATGATGATGATGTTGATGAAGCTTTGAAGGGTTCTACTGATCTTGTGGA 
GATTGACTTGCATATCTGTGAGCC^TGGGAGCTTCCrrGATGTGGa^GTTAAACGCAAA 
GGAATGGTACTTCTTCAGTTTCCGTGATCGAAAGTATGCTACTGGATATCGCACGAACAG 
AGCGACAGTAAGCGGATACTGGAAAGCAACAGGAAAAGATCGAACGGTGATGGATCCACG 
TACAAGGCAATTGGTAGGGATGAGAAAAACACTAGTGTTCTACAGAAACAGAGCACCAAA 
TGGGATCAAAACTACTTGGATGATGCACGAGTTCCGTCTTGAGTGTCCTAACATCCCACA 
TAAGGAAGACTGGGTCTTGTGCAGAGTGTTCAACAAAGGCAGAGACTCATCGCTACAAGA 
CAATAATTATTATAACAATGATAATCAGACGCAAAGGCTTGAAGTTAATGACGCTCCGGA 
TCTTAATTACAACAATCAGTTGCCACCTTTGCTATCATCCCCTCCTCATAATCATCAACA 
TGAGAAGATGAAAATCCAAGTTTGTGATCAGTGGGAGCAGCTAATGAAGCAGCCTTCAAG

GACCACCGGCCACCCCTATCATCACCATTGTCATCATCAAACCATAGCATGTGGTTGGGA 

GCAGATGATGATCGGTTCGCTGTCATCACCTTCGAGTCATGGCCCTGATCACGAGTCCTT 

TGCTAAATTTGCTTTACCGTCGACAATAACAACAGTGTCAACATCAGTGGTGATCATCAT 

CAGAATTATGAGAAGATTTTGTTGTCATCACTAGACATGACGAGTTTGGATCACGACAAG 

ACATGTATGGGATCATCATCGGATGGTGGTATGGTCTCTGATCTTCACATGGAATGTGGT 

GGATTGAGTTTTGAGACCGAAAATATCCTCGCTTTCCAATGAACATAATTCAAGGGGTTC 

GCCAATTTGTTGATTCGTGAATTATACAAACATTTTATCTATAGATTTATCACATTATCA 

AACATGTAAGTTGTGTGGCATTTGGGTATAGGGTTTGTGTGATTCTAGGTTTTTAGGACG 

ATGTATGTTGTTATATTTAGCGTGTTTTTAGGATTTATTCTCATTTTAAAATTATATGAA 

AACCCATTACT^ATGAATACAATTAGTTTTCTTTGTTGTAAATAATATTTTAGATTATCTVA 
AAAAAAAAAAAAAA 

>G509 Amino Acid Sequence (domain in AA coordinates: 13-169) 
MGLKDIGSKLPPGFRFHPSDEELVCHYLCNKIRAKSDHGDVDDDDDDVDEALKGSTDLVE
IDLHICEPWELPDVAKLNAKEWYFFSFRDRKYATGYRTNRATVSGYWKATGKDRTVMDPR
TRQLVGMRKTLVFYRNRAPNGIKTTWIMHEFRLECPNIPHKEDWVLCRVFNKGRDSSLQD 
NNYYNNDNQTQRLEVNDAPDLNYNNQLPPLLSSPPHNHQHEKMKIQVCDQWEQLMKQPSR

TTGHPYHHHCHHQTIACGWEQMMIGSLSSPSSHGPDHESFAKFALPSTITTVSTSVVIII
RIMRRFCCHH* 

>G519 (85..894)

CACAAAGATCCTCCGATTCGAAGGTTTATAAAAACTC71AAATCGAATCTTATCCACAAGA 
AAACAACAAGGTACTTTTCCAAAAATGAAGGCGGAGTTGAATTTGCCGGCGGGATTCCGA 
TTTCATCCGACGGACGAAGAGCTTGTCAAGTTCTATCTTTGCCGGAGATGTGCGTCAGAA 
CCGATTAACGTTCCGGTTATCGCAGAGATTGACTTGTACAAATTCAATCCATGGGAGCTT 
CCAGAAATGGCGTTGTACGGTGAGAAAGAATGGTACTTCTTCTCGCATAGAGACCGGAAA 
TACCCAAACGGGTCGAGACCAAACCGGGCAGCTGGAACCGGTTATTGGAAAGCGACTGGA 
GCTGATAAACCGATCGGAAAACCGAAGACGTTAGGGATTAAGAAAGCACTCGTCTTCTAC 
GCAGGAAAAGCTCCGAAAGGGATTAAAACGAATTGGATTATGCACGAGTATCGTCTCGCT 
AATGTCGATCGATCTGCTTCTACCAACAAGAAGAACAACTTAAGACTTGATGATTGGGTT 
TTGTGTCGGATATACAATAAGAAAGGAACAATGGAGAAGTATTTACCGGCGGCGGCTGAG 
AAACCGACGGAAAAGATGAGTACGTCGGACTCAAGATGCTCAAGTCACGTGATTTCACCG 
GACGTCACGTGTTCTGATAACTGGGAGGTTGAGAGTGAGCCCAAATGGATTAATCTGGAA 
GACGCGTTAGAGGCATTTAATGATGACACGTCCATGTTTAGTTCCATTGGTTTGTTGCAA 
AATGACGCCTTTGTTCCTCAGTTTCAGTACCAGTCCTCCGATTTCGTCGATTCGTTTCAG 
GACCCGTTCGAGCAGAAACCGTTCTTGAATTGGAATTTTGCTCCTCAAGGGTAAAAATAA 
TCGGCAAAAAGTTGAAGCTTTTCAGAGTCTTCGATCACCGGCATTGTGTCGGATCCTGAC 
CCGGAGACCAAGTCGGGTCATACGATTACATAATCGGGTTATTGAGATTTCCACATTTGG 
ATTTCCGAGACTAACCAACTTAACGGATTCTGGGGTAATTGGGGGGTTTTGCACAGGTGA 
ATCACACTGAGTCAGCAAGTTTCGATTTTTTGGTTTTGTTTTGTAATGATTC 

TCTAAAGATATCACGAAGTAGATTCAGAAGAACTGTAAAAGCAATTGTGACCACCCGTTA 
TGAATC^TA7\ATATATTCAATGAAGCATGAGCTTATTTTTTTTTTAAAAAAA^ 

>G519 Amino Acid Sequence (conserved domain in AA coordinates: 11-104)
MKAELNLPAGFRFHPTDEELVKFYLCRRCASEPINVPVIAEIDLYKFNPWELPEMALYGE

KEWYFFSHRDRKYPNGSRPNRAAGTGYWKATGADKPIGKPKTLGIKKALVFYAGKAPKGI

KTNWIMHEYRLANVDRSASTNKKNNLRLDDWVLCRIYNKKGTMEKYLPAAAEKPTEKMST

SDSRCSSHVISPDVTCSDNWEVESEPKWINLEDALEAFNDDTSMFSSIGLLQNDAFVPQF
QYQSSDFVDSFQDPFEQKPFLNWNFAPQG*
>G561 (86..1168)

AATTTGTTTTTTTTTCTTTTGTGGGTTCAATTCGAATTGTTTO 

CTGTGTCATTACTCTGCATTGAGCAATGGGTAGCAACGAAGAAGGAAACCCCACTAACAA 
CTCTGATAAGCCATCGCAAGCTGCTGCTCCTGAGCAGAGTAATGTTCATGTGTATCATCA 
TGACTGGGCTGCTATGCAGGCATATTATGGGCCTAGAGTTGGTATACCTCAATATTACAA
CTCAAATTTGGCGCCTGGTCATGCTCCACCGCCTTATATGTGGGCGTCTCCATCGCCAAT 
GATGGCTCCTTATGGAGCACCATATCCACCATTTTGCCCTCCTGGTGGAGTTTATGCTCA 
TCCTGGTGTTCAAATGGGCTCACAACCACAAGGTCCTGTTTCTCAATCAGCATCTGGAGT 
TAC^CCCCTTTGACCATTGATGCACCAGCTAATTCAGCTGGAAACTCAGATCATGGGTT 
CATGAAAAAGCTGAAAGAGTTCGATGGACTTGCAATGTCAATAAGCAATAAC^^GTTGG 
GAGTGCTGAACATAGCAGCAGTGAACATAGGAGTTCTCAGAGCTCCGAGAATGATGGCTC
TAGCAATGGTAGTGATGGTAATACAACTGGGGGAGAACAATCTAGGAGGAAAAGAAGGCA 
ACAAAGATCACCAAGCACTGGTGAAAGACCCTCATCTCAAAACAGTCTGCCTCTTAGAGG 
TGAAAATGAGAAACCCGATGTGACTATGGGGACTCCTGTTATGCCCACAGCAATGAGTTT 
CCAAAACTCTGCTGGCATGAACGGTGTGCCACAGCCATGGAATGAAAAAGAGGTTAAACG 
AGAGAAGAGAAAACAGTCAAACCGAGAATCTGCTAGGAGGTCAAGACTGAGGAAGCAGGC 
TGAAACAGAACAACTATCTGTCAAAGTTGACGCATTAGTAGCTGAGAACATGTCTCTGAG
GTCTAAACTAGGCCAGCTAAACAATGAGTCTGAGAAACTACGGCTGGAGAACGAAGCTAT 
ATTGGATCAACTGAAAGCGCAAGCTACAGGGAAAACAGAGAACCTGATCTCTCGAGTTGA
TAAGAACAACTCTGTATCAGGTAGCAAAACTGTGCAGCATCAACTGTTAAATGCAAGTCC 
GATAACCGATCCTGTCGCGGCTAGCTGACCGTGGCCGCAACAATGAGAACCCGATATTTC 
TTCCTTTGGGTTGTGATTGTAACTTAAAAGGAGACTTTTTGTTTTTATTCTTAGATTTGT 
AGCTCTCTGCATAGTGAGCATAAATTGATGTAATATGGTTTAAGAGATTCGGTGTTCTCT 
GGTGTGTGCTGCAACCACATAATTGGTGATAGATAGGTTTAGTTATATAAGCAAATGTAT 
TAGAGATAAGGGGAGACATATTTGATGGTCTTT 

>G561 Amino Acid Sequence (domain in AA coordinates: 248-308) 

MGSNEEGNPTNNSDKPSQAAAPEQSNVHVYHHDWAAMQAYYGPRVGIPQYYNSNLAPGHA

PPPYMWASPSPMMAPYGAPYPPFCPPGGVYAHPGVQMGSQPQGPVSQSASGVTTPLTIDA 

PANSAGNSDHGFMKKLKEFDGLAMSISNNKVGSAEHSSSEHRSSQSSENDGSSNGSDGNT

TGGEQSRRKRRQQRSPSTGERPSSQNSLPLRGENEKPDVTMGTPVMPTAMSFQNSAGMNG

VPQPWNEKEVKREKRKQSNRESARRSRLRKQAETEQLSVKVDALVAENMSLRSKLGQLNN

ESEKLRLENEAILDQLKAQATGKTENLISRVDKNNSVSGSKTVQHQLLNASPITDPVAAS

* 

>G590 (102..1223)

TCGACAGACACTCTCCCTCTCTCCATGCCCATAAAATCTCAAAGACTGTTTAAAAAAAAA 

AAAGAGAAGAGAAGAAGCAGAGAGTGATGGGAGATAAGAAATTGATTTCATCTTCTTCTT 
CTTCCTCGGTTTACGT^ACTCGTATCAATCATCATCTTCATCATCCTCCGTCTTCTTCCG 
ACGAAATCTCTCAGTTTCTCCGGCATATTTTCGACCGTTCTTCTCCTTTACCTTCTTACT 
ACTCCCCGGCGACGACTACAACGACGGCGTCTTTGATTGGTGTGCACGGGAGCGGTGACC 
CACATGCAGATAACTCGAGAAGTCTCGTTTCTCATCATCCACCGTCAGATTCTGTGCTTA 
TGTCGAAACGTGTCGGAGATTTCTCTGAGGTTTTAATCGGCGGAGGATCAGGCTCAGCCG 
CCGCGTGTTTTGGTTTCTCCGGTGGTGGTAATAATAACAACGTTCAAGGAAATAGCTCTG 
GGACTCGAGTATCGTCTTCTTCCGTTGGAGCTAGTGGCAACGAGACAGATGAGTATGACT 
GTGAAAGCGAGGAAGGAGGAGAAGCTGTAGTTGATGAAGCTCCCTCTTCCAAGTCAGGTC 
CTTCTTCTCGTAGTTCATCTAAAAGATGCAGAGCTGCTGAAGTTCATAATCTCTCTGAGA 
AGAGGAGGAGAAGTAGAATTAATGAAAAAATGAAAGCTTTACAAAGTCTCATCCCTAATT 
CAAATAAGACGGATAAGGCTTC7VATGCTTGATGAAGCCATTGAGTATCTGAAACAGCTTC 
AGCTCCAAGTTCAGATGTTGACTATGAGAAATGGAATAAACTTGCATCCTTTGTGTTTAC 
CTGGAACTACATTACACCCATTGCAACTCTCTCAGATTCGACCCCCTGAAGCAACCAATG 
ATCCTCTGCTTAATCATACCAATCAGTTTGCTTCGACTTCTAATGCACCGGAAATGATCA 
ATACTGTGGCTTCTTCATACGCTTTGGAACCTTCTATTCGCAGTCACTTTGGACCTTTCC 
CTCTCCTTACTTCACCCGTGGAGATGAGTCGGGAAGGTGGGTTAACTCATCCAAGGTTGA 
ACATTGGTCATTCCAACGCAAAGATAACCGGGGAACAAGCTCTGTTTGATGGACAACCTC 
ACCTAAAAGATCGAATTACTTGAACAGTGTCCCAACTTCGGGATCTCTATGTGTTCTTGT 
TTCTTAGAACGCAAGCCATAAAGCTGTCTGAC 

>G590 Amino Acid Sequence (domain in AA coordinates: 202-254) 

MISQREEREEKKQRVMGDKKLISSSSSSSVYDTRINHHLHHPPSSSDEISQFLRHIFDRS 

SPLPSYYSPATTTTTASLIGVHGSGDPHADNSRSLVSHHPPSDSVLMSKRVGDFSEVLIG 

GGSGSAAACFGFSGGGNNNNVQGNSSGTRVSSSSVGASGNETDEYDCESEEGGEAVVDEA

PSSKSGPSSRSSSKRCRAAEVHNLSEKRRRSRINEKMKALQSLIPNSNKTDKASMLDEAI 

EYLKQLQLQVQMLTMRNGINLHPLCLPGTTLHPLQLSQIRPPEATNDPLLNHTNQFASTS

NAPEMINTVASSYALEPSIRSHFGPFPLLTSPVEMSREGGLTHPRLNIGHSNAKITGEQA

LFDGQPDLKDRIT* 

>G818 (65..1060)

GTATTTCTTACAATAAACGACCAAAAAGTTAATACAAGAAATAGAAACGGTGTAGGAAGC 
TACTATGACGGCAATTCCAAACGTCGTCGATATTGAATCTTCTTCCTCTTCGCTTTGTCA 
AGAGACGGCAACGGAGACCGTCACCGTTGAAAGAGGCTCGTCTGATTCATCTTCAAAGCC
AGACGACGTCGTTTTACTAATCAAGGAAGAGGAGGATGACGCGGTTAACTTGTCACTTGG 
TTTTTGGAAATTGCACGAGATAGGTTTAATAACACCGTTCTTGAGAAAGACGTTTGAGAT 
CGTCGATGACAAAGTAACAGACCCGGTTGTATCATGGAGCCCGACCCGTAAAAGCTTTAT 
CATTTGGGATTCTTACGAGTTCTCAGAGAATCTACTTCCCAAATACTTCAAGCACAAGAA 
CTTCTCCAGTTTTATTCGTCAGCTTAACTCTTACGGTTTTAAAAAGGTCGATTCAGATAG 
GTGGGAATTTGCTAACGAAGGGTTTCAAGGAGGGAAGAAACATTTGCTTAAGAACATCAA 
GAGGAGAAGCAAAAACACTAAATGTTGTAACAAGGAAGCGAGTACCACCACGACAGAGAC 
TGAGGTTGAGTCATTGAAGGAGGAACAGAGTCCAATGAGATTGGAGATGTTGAAGCTGAA 
ACAACAACAAGAAGAATCTCAACATCAGATGGTCACTGTGCAGGAGAAGATCCACGGAGT 
TGATACCGAACAACAGCATATGCTTAGTTTCTTTGCAAAGTTGGCTAAAGATCAAAGATT 
TGTAGAGAGACTGGTGAAGAAGAGAAAGATGAAAATACAGAGAGAGCTAGAAGCAGCTGA 
ATTCGTGAAGAAGCTCAAGTTGCTTCAGGATCAAGAAACTCAAAAGAACTTGTTAGATGT 
AGAAAGAGAATTTATGGCCATGGCTGCAACAGAACACAATCCCGAGCCTGACATTTTGGT 
GAACAATCAAAGCGGGAATACGAGATGTCAGCTTAACTCAGAGGACCTACTTGTTGACGG 
TGGCTCAATGGATGTAAATGGGAGGATAGAGATAGAGTAGAGCAAAACCGGTAACATAGC 
AATAGAGAAGGTACCAAATCCCAAGGCTTGAGATCCGAAT 

>G818 Amino Acid Sequence (domain in AA coordinates: 70-162) 
MTAIPNVVDIESSSSSLCQETATETVTVERGSSDSSSKPDDVVLLIKEEEDDAVNLSLGF
WKLHEIGLITPFLRKTFEIVDDKVTDPVVSWSPTRKSFIIWDSYEFSENLLPKYFKHKNF
SSFIRQLNSYGFKKVDSDRWEFANEGFQGGKKHLLKNIKRRSKNTKCCNKEASTTTTETE 
VESLKEEQSPMRLEMLKLKQQQEESQHQMVTVQEKIHGVDTEQQHMLSFFAKLAKDQRFV 
ERLVKKRKMKIQRELEAAEFVKKLKLLQDQETQKNLLDVEREFMAMAATEHNPEPDILVN
NQSGNTRCQLNSEDLLVDGGSMDVNGRIEIE*
>G849 (218..2077)

AACTCGAGAATTCTTCATTTCTTTTAAATCTTAGAATCTCGAGTTTTTGTATAAATCGAT 

TCTAATTTTTCCTTTGTACATTGTTTTATATATACATAAAACACACAAATCGGGTATGGG 

GGAATTTGGGTTTTAAGATAGCGTGATCTGTAATAATAAGTGGTTCGCGATCGTGATCAA 

GT^AACTGGTGGCTGATAGTGATATGCATATTTGAGAGATGGTGTTCAAGAGAAAGTTAGA 

TTGCCTTTCCGTGGGATTTGATTTTCCCAACATTCCCAGAGCTCCTCGTTCATGCAGGAG 

GAAGGTTCTAAACAAGAGGATTGATCATGATGATGATAACACTCAGATCTGTGCAATTGA

CTTACTAGCTTTGGCTGGAAAGATTCTACAGGAAAGCGAGAGTTCCTCTGCGTCTTCTAA 

TGCATTTGAAGAAATTAAGCAAGAGAAAGTAGAAAATTGCAAGACTATTAAATCTGAGTC 

TTCTGACCAAGGAAACTCTGTGTCAAAGCCTACTTATGATATCTCTACTGAGAAGTGTGT 

GGTGAACAGTTGTTTTTCATTTCCGGATAGTGACGGCGTTTTGGAGCGGACTCCGATGTC 

TGATTACAAGAAGATTCATGGTTTGATGGATGTAGGGTGTGAAAACAAGAATGTAAATAA 

TGGGTTCGAGCAAGGAGAAGCAACCGATCGCGTGGGTGATGGAGGCTTAGTCACTGATAC 

TTGCAACTTAGAGGATGCAACTGCGTTAGGTCTGCAGTTTCCGAAATCAGTCTGTGTGGG 

TGGTGATTTAAAATCACCATCCACCTTGGATATGACCCCTAATGGTTCCTATGCTAGACA 

TGGGAACCATACTAACCTAGGTAGAAAAGATGATGATGAAAAATTCTATAGTTACCATAA 

ACTTAGCAATAAATTTAAGTCGTATAGGTCTCCAACAATTCGAAGAATAAGAAAGTCCAT 

GTCGTCCAAATACTGGAAACAAGTTCCAAAAGATTTTGGATACAGTAGAGCTGATGTGGG 

TGTGAAGACTCTTTATCGCAAAAGAAAATCATGTTATGGTTACAACGCATGGCAGCGTGA 

GATCATTTATAAGAGAAGAAGATCACCTGACAGAAGCTCGGTCGTAACTTCTGATGGAGG

ACTCAGTAGTGGAAGTGTTTCCAAGTTACCCAAGAAGGGAGATACAGTAAAGCTAAGCAT 

TAAGTCCTTTAGGATTCCAGAGCTTTTTATTGAAGTTCCAGAAACTGCAACAGTAGGATC 

ACTAAAGAGGACTGTGATGGAGGCTGTCAGTGTTTTACTCAGCGGAGGAATACGTGTTGG 

GGTGTTAATGCATGGGAAGAAGGTTAGAGATGAAAGGAAAACTCTGTCCCAGACTGGGAT

CTCATGTGATGAAAATCTAGACAACCTTGGGTTCACCTTGGAGCCTAGTCCCAGCAAAGT 
TCCCCTACCTTTGTGTTCTG^ 

ACGGTCTGCGGCGTCTCCTATGCTAGATTCTGGAATTCCACATGCAGATGACGTGATTGA 
TTCAAGAAATATTGTGGACAGTAACCTCGAATTAGTTCCATATCAGGGTGACATATCTGT 
TGATGAACCTTCATCAGATTCAAAAGAGCTTGTCCCACTTCGAGAGTTGGAAGTCAAGGC 
GCTTGCCATAGTTCCGTTGAACCAGAAACCTAAGCGTACTGAGCTAGCCCAGAGGAGAAC 
TAGGAGACCCTTCTCTGTGACAGAGGTAGAAGCTCTTGTACAAGCAGTTGAGGAACTCGG 
GACTGGAAGATGGCGTGATGTAAAATTGCGTGCTTTCGAGGATGCAGATCATCGGACTTA 
CGTGGACTTGAAGGACAAATGGAAGACGCTAGTTCACACAGCAAGTATATCCCCACAGCA 
ACGAAGAGGAGAGCCGGTGCCACAAGAACTGCTAGACAGAGTCTTGAGGGCATACGGGTA

TTGGTCGCAGCACCAAGGAAAACATCAGGCGAGAGGAGCGTGCAAAGATCCAGACATGAA 

CAGAGGTGGAGCTTTTGAATCAGGTGTTTCAGTGTAAA7^AAGGAGGTACGCATTGGTGGG 

TGGGTGTACAGAAGCAAACAACACAATAAATGGACAACTCAATTTCTGCAAAGTTTAATT

GTCTTTATTTCTCGTTTTTTTTTTTTTTTCTCCTACATACACTTTTTTTTTTCT 

>G849 Amino Acid Sequence (domain in AA coordinates: 324-413, 504-583) 

MVFKRKLDCLSVGFDFPNIPRAPRSCRRKVLNKRIDHDDDNTQICAIDLLALAGKILQES 

ESSSASSNAFEEIKQEKVENCKTIKSESSDQGNSVSKPTYDISTEKCVVNSCFSFPDSDG

VLERTPMSDYKKIHGLMDVGCENKNVNNGFEQGEATDRVGDGGLVTDTCNLEDATALGLQ

FPKSVCVGGDLKSPSTLDMTPNGSYARHGNHTNLGRKDDDEKFYSYHKLSNKFKSYRSPT

IRRIRKSMSSKYWKQVPKDFGYSRADVGVKTLYRKRKSCYGYNAWQREIIYKRRRSPDRS 

SVVTSDGGLSSGSVSKLPKKGDTVKLSIKSFRIPELFIEVPETATVGSLKRTVMEAVSVL

LSGGIRVGVLMHGKKVRDERKTLSQTGISCDENLDNLGFTLEPSPSKVPLPLCSEDPAVP 

TDPTSLSERSAASPMLDSGIPHADDVIDSRNIVDSNLELVPYQGDISVDEPSSDSKELVP

LRELEVKALAIVPLNQKPKRTELAQRRTRRPFSVTEVEALVQAVEELGTGRWRDVKLRAF

EDADHRTYVDLKDKWKTLVHTASISPQQRRGEPVPQELLDRVLRAYGYWSQHQGKHQARG

ASKDPDMNRGGAFESGVSV* 

>G892 (21..1004)

TATAACAATTCCTTCCAACAATGTCATTGAGTCAGCCAATAACACGGACCGATAGTGCAC 

CCAATGGAGCATTTAGGACTTTTGGTCTCTACTGGTGCTACCATTGTGATCGTATGGTCA 

GAATTGCATCCTCTAACCCATCAGAGATCGCCTGTCCTCGATGTTTGAGGCAATTTGTCG 

TTGAGATTGAAACGAGACAACGGCCTCGGTTTACTTTCAACCATGCTACTCCGCCTTTTG 

ATGCTTCTCCTGAGGCTCGTCTTCTCGAAGCTCTCTCGCTCATGTTTGAGCCTGCAACCA 

TAGGTAGGTTTGGTGCAGACCCATTTCTTAGGGCAAGATCCAGAAACATCTTGGAACCTG 

AATCAAGACCCCGACCGCAACATCGAAGACGACACAGCCTTGACAATGTTAACAATGGTG 

GTTTACCTCTACCAAGAAGAACATATGTTATTCTCCGGCCCAATAATCCGACTAGTCCAC 

TCGGAAACATAATTGCGCCACCAAATCAAGCACCACCACGGCATGTGAACTCACATGATT 

ACTTTACTGGAGGATCAAGCTTAGAGCAGCTGATTGAACAACTAACACAAGACGATAGGC 

CTGGACCACCACCTGCGTCAGAACCCACCATTAATTCCCTACCATCTGTGAAAATAACAC 

CACAACATCTAACTAACGACATGTCCCAATGCACAGTGTGCATGGAAGAATTCATTGTTG 

GTGGGGACGCAACGGAATTACCATGTAAACATATTTACCATAAAGATTGTATAGTCCCGT 

GGCTTAGGCTTAACAATTCTTGCCCTATCTGCCGCCGTGACCTGCCACTTGTCAACACCG 

TTGCTGAATCTCGAGAAAGGAGCAATCCTATTAGACAAGACATGCCTGAAAGAAGGCGTC 

CAAGGTGGATGCAACTCGGTAACATTTGGCCATTTAGAGCAAGATACCAAAGGGTTAGTC 

CAGAAGAAACAGCAAACCAGAATCCTCGAGATAACAGGAGCTAACTCTGAATATTCCATG 

GGAAATAAAAATCGTGACTATCTATATGTATAGACTCTATGAGACATTGTCTATTTGAAT 

GTGCATGTATATCTCAGAAATAAACTCAAGCGAAACATATTTAACGACTAAAAAAAA 

>G892 Amino Acid Sequence (domain in AA coordinates: 177-270) 

MSLSQPITRTDSAPNGAFRTFGLYWCYHCDRMVRIASSNPSEIACPRCLRQFVVEIETRQ

RPRFTFNHATPPFDASPEARLLEALSLMFEPATIGRFGADPFLRARSRNILEPESRPRPQ 

HRRRHSLDNVNNGGLPLPRRTYVILRPNNPTSPLGNIIAPPNQAPPRHVNSHDYFTGASS 

LEQLIEQLTQDDRPGPPPASEPTINSLPSVKITPQHLTNDMSQCTVCMEEFIVGGDATEL

PCKHIYHKDCIVPWLRLNNSCPICRRDLPLVNTVAESRERSNPIRQDMPERRRPRWMQLG

NIWPFRARYQRVSPEETANQNPRDNRS*

>G961 (1..1200) 

ATGTCAAAATCTATGAGCATATCAGTGAACGGACAATCTCAAGTGCCTCCTGGGTTTAGG 
TTTCATCCGACCGAGGAAGAGCTGTTGCAGTATTATCTCCGGAAGAAAGTTAATAGCATC 
GAGATCGATCTTGATGTCATTCGCGACGTTGATCTCAACAAGCTCGAGCCTTGGGACATT 
CAAGAGATGTGTAAAATAGGAACAACGCCACAAAACGACTGGTATTTCTTTAGCCACAAG 
GACAAAAAATATCCGACGGGAACGAGAACTAACAGAGCCACTGCGGCTGGATTTTGGAAA 
GCAACTGGCCGCGACAAGATCATATATAGCAATGGCCGTAGAATTGGGATGAGAAAGACT 
CTTGTTTTCTACAAAGGCCGAGCTCCTCACGGCCAAAAATC

TATAGACTCGATGACAACATTATTTCCCCCGAGGATGTCACCGTTCATGAGGTCGTGAGT 
ATTATAGGGGAAGCATCACAAGACGAAGGATGGGTGGTGTGTCGTATTTTCAAGAAGAAG 
AATCTTCACAAAACCCTAAACAGTCCCGTCGGAGGAGCTTCCCTGAGCGGCGGCGGAGAT 
ACGCCGAAGACGACATCATCTCAGATCTTCAACGAGGATACTCTQGACCAATTTCTTGAA
CTTATGGGGAGATCTTGTAAAGAAGAGCTAAATCTTGACCCTTTCATGAAACTCCCAAAC 
CTCGAAAGCCCTAACAGTCAGGCAATCAACAACTGCCACGTAAGCTCTCCCGACACTAAT

CATAATATCCACGTCAGCAACGTGGTCGACACTAGCTTTGTTACTAGCTGGGCGGCTTTA 

GACCGCCTCGTGGCCTCGCAGCTTAACGGACCCACATCATATTCAATTACAGCCGTCAAT 

GAGAGCCACGTGGGCCATGATCATCTCGCTTTGCCTTCCGTCCGATCTCCGTACCCCAGC 

CTAAACCGGTCCGCTTCGTACCACGCCGGTTTAACACAGGAATATACACCGGAGATGGAG 

CTATGGAATACGACGACGTCGTCTCTATCGTCATCGCCTGGCCCATTTTGTCACGTGTCG 

AATGTTTTGCTGCTTGTTTGTCTCCTTCGTCTGCAGCTTCAGTTCTGGCCGTTCCAACCA 

TGGCAGAGGCAGGTTCATTTCGATCTTTCATCGCCTCAGATGCAGATCTCTCTCCATTGA 

>G961 Amino Acid Sequence (conserved domain in AA coordinates: 15-140) 

MSKSMSISVNGQSQVPPGFRFHPTEEELLQYYLRKKVNSIEIDLDVIRDVDLNKLEPWDI 

QEMCKIGTTPQNDWYFFSHKDKKYPTGTRTNRATAAGFWKATGRDKIIYSNGRRIGMRKT 

LVFYKGRAPHGQKSDWIMHEYRLDDNIISPEDVTVHEVVSIIGEASQDEGWVVCRIFKKK

NLHKTLNSPVGGASLSGGGDTPKTTSSQIFNEDTLDQFLELMGRSCKEELNLDPFMKLPN

LESPNSQAINNCHVSSPDTNHNIHVSNVVDTSFVTSWAALDRLVASQLNGPTSYSITAVN

ESHVGHDHLALPSVRSPYPSLNRSASYHAGLTQEYTPEMELWNTTTSSLSSSPGPFCHVS 

NVLLLVCLLRLQLQFWPFQPWQRQVHFDLSSPQMQISLH*

>G1465 (163..1125)

TATCCTTCGCAAGACCCTTCCTCTATATAAGGAAGTTCATTTCATTTGGAGAGGACACGC 
TGACAAGCTGACTCTAGCTTATCTGGTACCGTCGACCTCATTCTTGCGTTTGATCTTTCT 
TTCTCTAGATCCCATATTTTTCTTGATCAATTTAGTTTCATTATGGAGGAAGATGCAGCT 
TTTGATCTACTCAAAGCCGAACTCTTAAACGCAGAAGACGATGCAATAATCTCACGTTAT 
CTGAAGCGTATGGTCGTCAACGGAGACTCATGGCCTGATCACTTCATCGAAGACGCAGAC 
GTGTTCAACAAGAATCCAAATGTGGAGTTCGATGCTGAGAGCCCTAGCTTCGTGATAGTT
AAACCTCGAACAGAGGCTTGTGGTAAAACCGATGGATGTGAAACTGGTTGCTGGAGGATC 
ATGGGTCGTGATAAACCGATAAAATCGACGGAGACTGTGAAGATTCAAGGGTTCAAGAAG 
ATTCTCAAGTTCTGCCTAAAGAGGAAACCTAGAGGATACAAGAGAAGTTGGGTAATGGAA 
GAGTATAGGCTTACCAATAACTTGAACTGGAAGCAAGATCATGTGATTTGCAAGATTCGG
TTTATGTTTGAAGCTGAAATCAGTTTCTTGCTAGCCAAGCATTTCTACACTACATCAGAA
TCACTTCCTCGAAATGAGCTGTTGCCAGCTTACGGATTCCTTTCATCAGATAAGCAATTG 
GAGGATGTATCTTATCCGGTGACGATAATGACTTCTGAAGGAAACGATTGGCCTAGCTAC 
GTTACCAACAATGTGTATTGTCTGCATCCATTGGAGCTCGTTGATCTTCAAGATCGGATG 
TTTAATGATTACGGAACCTGCATCTTCGCTAACAAGACTTGTGGTAAAACCGATAGATGC 
ATTAATGGTGGTTACTGGAAAATTTTGCACCGTGATAGGCTGATCAAGTCAAAGTCCGGG 
ATAGTTATTGGTTTCAAGAAGGTGTTTAAGTTTCATGAAACGGAGAAAGAAAGATACTTC 
TGTGGTGGAGAAGATGTGAAGGTAACTTGGACTCTAGAAGAGTATAGGCTTAGCGTGAAG 
CAGAATAAATTCTTGTGCGTTATCAAGTTTACTTATGATAACTAAGAATCTTTTCTTTGG 
ATTTTATGATCATCTTAGTATCGCGACCGCTCTAGACAGGCCTCGTACCGGATCCTCTAG 
CTAGAGCTTTCGTTCGTATCATCGGTTTCGACAACG 

>G1465 Amino Acid Sequence (conserved domain in AA coordinates: 242-306) 

MEEDAAFDLLKAELLNAEDDAIISRYLKRMVVNGDSWPDHFIEDADVFNKNPNVEFDAES

PSFVIVKPRTEACGKTDGCETGCWRIMGRDKPIKSTETVKIQGFKKILKFCLKRKPRGYK

RSWVMEEYRLTNNLNWKQDHVICKIRFMFEAEISFLLAKHFYTTSESLPRNELLPAYGFL

SSDKQLEDVSYPVTIMTSEGNDWPSYVTNNVYCLHPLELVDLQDRMFNDYGTCIFANKTC

GKTDRCINGGYWKILHRDRLIKSKSGIVIGFKKVFKFHETEKERYFCGGEDVKVTWTLEE

YRLSVKQNKFLCVIKFTYDN*

>G425 (45..1196)

GAAAACAGTCTTCTCTTCTCCGATCCCAAAAACGCAGGAAAAC 

ACCTCCTTCCTCCACAAGAAGACCTTCCTCTCCGACACTTCACCGATCAATCACAGCAACCTC

CGCCGCAGCGTCACTTCTCTGAAACACCTTCGCTTGTCACCGCCAGTTTCCTCAACCTCCCTA 

CCACCCTTACCACTGCGGATTCCGATCTCGCTCCTCCGCACCGCAACGGAGACAATTCCGTT 

GCTGATACAAACCCACGCTGGCTCTCCTTTCATTCGGAGATGCAAAATACTGGAGAAGTACG 

TTCTGAAGTTATCGACGGAGTCAACGCCGATGGTGAAACGATACTCGGCGTTGTAGGAGGT 

GAAGATTGGCGGAGTGCTAGCTATAAGGCGGCGATTTTAAGACATCCGATGTACGAGCAGC 

TTCTTGCGGCTCACGTGGCTTGCCTTAGGGTTGCGACTCCCGTTGACCAGATTCCGAGGATC 

GATGCTCAGCTCAGTCAGTTGCATACCGTCGCCGCGAAATACTCCACTCTTGGTGTGGTTGTT 

GACAACAAGGAACTTGATCATTTCATGTCACATTATGTTGTCTTGTTATGTTCATTTAAAGAACA 

ACTCCAACACCACGTTTGTGTCCATGCAATGGAAGCCATTACGGCTTGTTGGGAGATTGAACA 
ATCACTGCAATCCCTAACTGGAGTTTCTCCAAGTGAAAGTAATGGTAAGACAATGTCGGATGA

TGAAGATGATAATCAAGTAGAGAGCGAGGTGAACATGTTTGATGGAAGTTTGGACGGTTCAG 

ATTGCTTGATGGGGTTTGGTCCTCTTGTTCCAACCGAGAGAGAGAGATCTTTGATGGAACGTG 

TGAAGAAAGAACTGAAGCATGAGCTTAAACAGGGTTTCAAAGAGAAGATTGTGGACATAAG

AGAAGAGATAATGAGGAAGAGAAGGGCGGGAAAGTTGCCAGGAGATACGACTTCTGTACT 

CAAAGAATGGTGGCGAACTCACTCGAAATGGCCATACCCAACTGAGGAAGATAAGGCAAAA 

GAAACTGGAACAGCAACTCTTCCACGTCATCTACTCTCACCAAGAACAAACGTAAACGGACC 

GGGAAGTCGTAGGTGACATAGCGGCTAACTAGAGGATGGTTCTTTGCCATGTGAATTCTTGG 

GAACCGTATATGAAAGAAACGAATCCGGTTCTATGCTCGTACAGAGTGTGTTATTTGTATAGT 

GGATACCGGTTAGCCTATGAAACCGGATTCTGGAGTCCAAATTGTTGTTTGTAACGACTTAGT 

AGTTTTTGGAAGTGATCTGTTTCGTTGGTTTGCGTCTTGTAACGAACGCTTAAGCAAGTGTGGG 

TTTTTTCTTGTAAAGTGTCAATATGTTCGTTTGTAATGAATGTATCAAGCAATATTTATCATAATT 

AAACTAGCTTGAAATGTAAAAAAAAAAAAAAAA 

>G425 Amino Acid Sequence (domain in AA coordinates: TBD) 
MSFNSSHLLPPQEDLPLRHFTDQSQQPPPQRHFSETPSLVTASFLNLPTTLTTADSDLAPPHR 
NGDNSVADTNPRWLSFHSEMQNTGEVRSEVIDGVNADGETILGVVGGEDWRSASYKAAILR
HPMYEQLLAAHVACLRVATPVDQIPRIDAQLSQLHTVAAKYSTLGVVVDNKELDHFMSHYVVL

LCSFKEQLQHHVCVHAMEAITACWEIEQSLQSLTGVSPSESNGKTMSDDEDDNQVESEVNM 
FDGSLDGSDCLMGFGPLVPTERERSLMERVKKELKHELKQGFKEKIVDIREEIMRKRRAGKLP 
GDTTSVLKEWWRTHSKWPYPTEEDKAKLVQETGLQLKQINNWFINQRKRNWNSNSSTSSTLT
KNKRKRTGKS* 
>G347 (1..570) 

atgaaagtagcagatatgcaggaccagctggtgtgtcatggttgtaggaatttattgatg 
tatcctagaggagcatctaatgtgcgttgtgcgttatgtaacactatcaacatggttcct 
cctcctcctccacctcacgacatggcacacattatatgtggtggttgtagaacaatgctt 
atgtatacgcgtggggctagtagcgtaagatgctcttgctgtcaaactacgaaccttgtg 
ccagcgcactccaatcaggttgcccatgctccttccagtcaggttgcgcagatcaattgt 
gggcattgtcggacgaccctcatgtatccttacggtgcatcatccgtcaaatgcgctgtt 
tgtcaattcgtaactaacgttaatatgagcaatggaagggtacctctcccaactaaccgg 
ccaaatggaacagcttgtcccccctctacatcaacttcaacaccaccctctcagacccaa 
accgttgttgtagaaaaccccatgtccgttgatgaaagcggaaagttggtgagcaatgtt 
gttgttggagtgacaactgacaaaaagtaa
>G347 Amino Acid Sequence (domain in AA coordinates: 9-39, 50-70, 80-127) 
MKVADMQDQLVCHGCRNLLMYPRGASNVRCALCNTINMVPPPPPPHDMAHIICGGCRTML
MYTRGASSVRCSCCQTTNLVPAHSNQVAHAPSSQVAQINCGHCRTTLMYPYGASSVKCAV
CQFVTNVNMSNGRVPLPTNRPNGTACPPSTSTSTPPSQTQTVVVENPMSVDESGKLVSNV 

VVGVTTDKK*
>G1512 (1..732) 

ATGGAAGGGAACTTCTTCATCAGGTCTGATGCTCAACGAGCACATGACAATGGCTTCATA 

GCCAAACAAAAACCTAATCTCACCACGGCTCCAACAGCAGGTCAAGCTAATGAAAGTGGC 

TGTTTTGACTGCAACATCTGTTTAGACACAGCCCATGATCCGGTGGTCACTCTCTGCGG^ 

CACCTTTTCTGCTGGCCTTGCATTTACAAGTGGTTACATGTTCAGTTATCTTC 

GTTGATCAGCACCAGAACAATTGCCCTGTTTGTAAATCCAACATTACTATCACCTCTTTG 

GTTCCTCTCTATGGAAGAGGCATGTCTTCGCCTTCTTCCACGTTTGGCTCCAAGAAACAA 

GACGCACTGTCCACTGACATACCCCGCAGACCTGCTCCATCAGCCTTACGCAATCCGAW 

ACGTCAGCATCATCTCTGAACCCAAGCTTGCAACATCAAACTCTGTCTCCTTCATTTCAT 

AATC^TCAGTATTCCCCTCGTGGCTTCACCACAACCGAATCAACCGACCTTGCGAATGC^ 

GTAATGATGAGTTTCCTCTACCCTGTGATTGGAATGTTTGGAGACCTGGTCTACACCAGG 

ATATTCGGGACCTTCACAAACAC^TAGCTCAGCCTTACCAAAGCCAGAGGATGATGCAG 

CGTGAGAAGTCTCTTAATCGGGTATCGATATTCTTCCTTTGTTGCATCATCCTTTGCCTC 

CTTCTCTTCTAG 

>G1512 Amino Acid Sequence (domain in AA coordinates: 39-93) 
MEGNFFIRSDAQRAHDNGFIAKQKPNLTTAPTAGQANESGCFDCNICLDTAHDPVVTLCG
HLFCWPCIYKWLHVQLSSVSVDQHQNNCPVCKSNITITSLVPLYGRGMSSPSSTFGSKKQ 
DALSTDIPRRPAPSALRNPITSASSLNPSLQHQTLSPSFHNHQYSPRGFTTTESTDLANA
VMMSFLYPVIGMFGDLVYTRIFGTFTNTIAQPYQSQRMMQREKSLNRVSIFFLCCIILCL 

LLF* 
>G2069 (1..1026)

ATGGAAGGAGGAGGAAGAGGACCAAATCAAACGATTCTCAGTGAAATAGAACATATGCCT 

GAAGCTCCACGTCAACGTATCTCTCATCACCGTCGAGCTCGCTCTGAAACCTTCTTCTCC 

GGCGAATCAATCGACGATCTCCTCTTATTCGATCCTTCCGATATCGATTTCTCTTCTCTA 

GACTTCCTCAACGCTCCACCACCACCACAACAATCACAACAACAACCGCAAGCTTCTCCC 

ATGTCCGTTGATTCGGAAGAAACCTCATCGAACGGTGTTGTTCCTCCTAATTCTCTTCCT 

CCAAAACCCGAAGCTAGATTCGGTCGCCATGTTCGTAGCTTCTCGGTTGATTCCGATTTC 

TTCGATGATTTGGGTGTTACTGAGGAGAAGTTTATAGCTACAAGTTCAGGAGAGAAGAAG 

AAAGGGAATCATCATCATAGCAGGAGTAATTCTATGGATGGAGAGATGAGTTCGGCGTCG 

TTTAATATCGAATCGATTTTAGCTTCTGTGAGTGGTAAAGATAGTGGGAAGAAGAATATG 

GGTATGGGTGGTGATAGACTTGCTGAGCTTGCTTTGCTTGATCCTAAAAGAGCTAAAAGG 

ATTTTAGCGAATAGACAATCTGCGGCGAGGTCGAAAGAGAGGAAGATTAGGTATACTGGT 

GAGTTAGAGAGGAAGGTTCAGACACTTCAGAATGAAGCTACTACATTGTCTGCTCAAGTC 

ACTATGTTACAGAGAGGAACATCAGAGCTGAACACTGAAAATAAACACCTCAAAATGCGG 

CTTCAAGCTTTAGAGCAACAAGCTGAACTTAGGGATGCTTTGAATGAAGCGCTGCGGGAT 

GAACTGAACCGACTTAAGGTGGTAGCTGGAGAAATTCCTCAGGGGAATGGAAATTCTTAC 

AACCGTGCTCAATTCTCATCTCAGCAATCGGCAATGAATCAGTTTGGGAACAAAACGAAC 

CAACAGATGAGTACAAACGGGCAGCCATCGCTCCCAAGCTACATGGATTTCACCAAGAGA 
GGCTGA 

>G2069 Amino Acid Sequence (domain in AA coordinates: TBD)

MEGGGRGPNQTILSEIEHMPEAPRQRISHHRRARSETFFSGESIDDLLLFDPSDIDFSSL 

DFLNAPPPPQQSQQQPQASPMSVDSEETSSNGVVPPNSLPPKPEARFGRHVRSFSVDSDF

FDDLGVTEEKFIATSSGEKKKGNHHHSRSNSMDGEMSSASFNIESILASVSGKDSGKKNM 

GMGGDRLAELALLDPKRAKRILANRQSAARSKERKIRYTGELERKVQTLQNEATTLSAQV 

TMLQRGTSELNTENKHLKMRLQALEQQAELRDALNEALRDELNRLKVVAGEIPQGNGNSY

NRAQFSSQQSAMNQFGNKTNQQMSTNGQPSLPSYMDFTKRG* 

>G1852 (55..1857)

CATCTGATCTGCTCTCGAAGACGAAAGCTTCGAGTACTGGTTGAAGCTAAAGCTATGGGA 

CACGTGAATCTACCTGCATCAAAGCGTGGTAACCCTCGTCAATGGCGTCTCCTCGACATC 

GTAACCGCTGCTTTCTTCGGTATCGTACTTCTCTTCTTCATCCTTTTATTCACTCCTCTT 

GGTGATTCCATGGCGGCTTCTGGTCGGCAAACGCTGCTTCTCTCTACGGCGTCAGATCCG 

AGGCAACGGCAGCGATTAGTGACTTTGGTTGAAGCTGGTCAGCATTTGCAACCGATCGAG 

TATTGTCCTGCGGAAGCTGTTGCTCATATGCCTTGTGAGGATCCGAGAAGGAATAGTCAG 

CTTAGTAGAGAGATGAATTTCTATAGGGAGAGACATTGTCCTTTGCCTGAGGAGACTCGG 

CTCTGTTTGATTCCTCCGCCTTCTGGTTATAAAATTCCTGTTCCGTGGCCTGAGAGTCTT 

CACAAGATTTGGCATGCAAACATGCCATATAACAAAATTGCTGACCGGAAAGGTCATCAA 

GGATGGATGAAAAGGGAAGGGGAATACTTTACTTTCCCAGGCGGTGGCACGATGTTTCCT 

GGCGGAGCTGGCCAATACATTGAAAAGCTTGCACAGTATATTCCGCTTAATGGTGGAACT 

TTGAGAACTGCTCTTGACATGGGATGCGGGGTAGCTAGTTTTGGAGGTACTCTACTATCT 

CAAGGCATTCTAGCCCTCTCATTTGCTCCAAGAGArrCACATAAATCTCAAATTCAGTTC 

GCTTTGGAAAGAGGAGTGCCTGCATTTGTTGCCATGCTTGGCACTCGTAGACTCCCCTTT 

CCTGCATACTCCTTTGACCTGATGCACTGTTCCCGATGTTTGATTCCTTTTACGGCTTAC 

AATGCAACTTACTTCATCGAAGTAGATAGGTTACTGCGCCCTGGAGGATATCTTGTAATC 

TCTGGCCCACCTGTACAATGGCCTAAACAAGACAAAGAATGGGCTGATCTTCAGGCGGTG 

GCTAGAGCTTTGTGCTATGAGCTAATTGCGGTTGATGGAAACACTGTCATCTGGAAGAAG 

CCTGTTGGAGATTCATGTCTACCTAGCCAGAATGAGTTTGGGCTTGAGTTGTGTGATGAG 

^^^^^^^^^^^^GAGGTGTGTTACCAGGCCATCA 

TCCGTC^GGAGAACACGCTTTGGGAACTATATCCAAGTGGCCGGAGAGGCTTACTAAA 

CTTCCTTCTAGGGCCATTGTCATGAAAAACGGATTGGATGTGTTTCAAGCAGATGCAAGG 

CGGTGGGCAAGACGCGTTGCTTATTACAGGGATTCTCTTAACTTGAAGCTGAAATCTCCA 

ACTCTCCGCAATGTraTGGAC^TGAAaSCATTCTTCGGAGGCTTTGCAGCAACCCTTGCA 

*™™™ TGTCGGraATCA ^^ 

cl^rrn^ ^ GAGGTCTCATCGGTGmA ^TGATTGGTGTGAACCATlTTCAACA T AT 
?S^ GCACG TATGA ^ ATCCATCTATCAGG ^TTGAATCACTGATAAAACGACAAGAC 
^ GCAAATCGAGGTGTAG ^ 

^™ GA ^^^^^ GG ^ G ^ GATGGGAGAG ^^^' rc ^GGTGCTAGATAAAGTCGCACGAATG 
GCTCATGCTGTAAGATGGTCTTCTTCCATACACGAGAAAGAACCTGAATCCCATGGAAGA 
GAGAAGATTCTTATCGCAACCAAATCTCTCTGGAAATTGCCATCAAACTCCCACTGAAGA
CACAAAAGAAGAAGAAAAGAAGAAGCTCTTCTCAATCTTGTAGGTACTGTCACTTGCTCT 
CCAGCCC

>G1852 Amino Acid Sequence (domain in AA coordinates: 1-601) 

MGHVNLPASKRGNPRQWRLLDIVTAAFFGIVLLFFILLFTPLGDSMAASGRQTLLLSTAS 

DPRQRQRLVTLVEAGQHLQPIEYCPAEAVAHMPCEDPRRNSQLSREMNFYRERHCPLPEE 

TPLCLIPPPSGYKIPVPWPESLHKIWHANMPYNKIADRKGHQGWMKREGEYFTFPGGGTM 

FPGGAGQYIEKLAQYIPLNGGTLRTALDMGCGVASFGGTLLSQGILALSFAPRDSHKSQI 

QFALERGVPAFVAMLGTRRLPFPAYSFDLMHCSRCLIPFTAYNATYFIEVDRLLRPGGYL 

VISGPPVQWPKQDKEWADLQAVARALCYELIAVDGNTVIWKKPVGDSCLPSQNEFGLELC

DESVPPSDAWYFKLKRCVTRPSSVKGEHALGTISKWPERLTKVPSRAIVMKNGLDVFEAD 

ARRWARRVAYYRDSLNLKLKSPTVRNVMDMNAFFGGFAATLASDPVWVMNVIPARKPLTL 

DVIYDRGLIGVYHDWCEPFSTYPRTYDFIHVSGIESLIKRQDSSKSRCSLVDLMVEMDRI 

LRPEGKVVIRDSPEVLDKVARMAHAVRWSSSIHEKEPESHGREKILIATKSLWKLPSNSH

* 

>G1793 (59..1783)

AGTGATTTATTGATTAACCCAAACACAAAATAAACAGATTTGACTCAAAAAGAAGAAAAT
GAATTCTAACAACTGGCTTGGCTTTCCTCTTTCACCGAACAACTCTTCTTTGCCTCCTCA 
TGAATACAACCTTGGCTTGGTCAGCGACCATATGGACAACCCTTTTCAAACACAAGAGTG 
GAATATGATCAATCCACACGGTGGAGGAGGAGATGAAGGAGGAGAGGTTCCAAAAGTGGC 
CGATTTTCTCGGTGTGAGCAAACCGGACGAAAACCAATCCAACCACCTAGTAGCTTACAA 
CGACTCAGACTACTACTTCCATACCAATAGCTTGATGCCTAGCGTCCAATCAAACGATGT 
CGTTGTAGCAGCTTGTGACTCCAATACTCCTAACAACAGTAGCTATCATGAGCTTCAAGA 
GAGTGCTCACAATCTACAGTCACTTACTTTGTCCATGGGGACCACCGCTGGTAATAATGT 
TGTAGACAAAGCTTCACCATCCGAGACCACCGGGGATAACGCTAGCGGTGGAGCACTAGC 
CGTTGTTGAGACGGCCACGCCAAGACGTGCATTGGACACTTTCGGACAACGAACCTCGAT
CTATCGTGGTGTCACAAGACATCGATGGACTGGTCGATATGAGGCTCATCTATGGGATAA 
TAGTTGTAGAAGGGAAGGCCAGTCTAGGAAAGGAAGACAAGTTTACTTGGGTGGATATGA 
CAAAGAAGATAAAGCAGCAAGATCATATGATCTAGCTGCACTTAAGTACTGGGGTCCTTC 
AACTACTACTAATTTCCCCATTACAAACTACGAGAAAGAAGTAGAGGAAATGAAGCACAT 
GACGAGACAAGAGTTCGTGGCTGCCATTAGAAGGAAAAGTAGTGGATTTTCGAGAGGCGC 
TTCGATGTATCGAGGAGTTACAAGGCATCACCAACATGGAAGATGGCAAGCAAGGATCGG 
CCGAGTCGCCGGAAACAAAGACCTCTACTTGGGAACTTTTAGCACTGAGGAAGAAGCAGC 
AGAAGCTTACGATATAGCTGCAATAAAGTTTAGAGGACTTAATGCAGTGACCAACTTCGA 
GATCAACCGGTACGACGTGAAAGCCATTCTAGAGAGTAGCACTCTTCCCATCGGAGGAGG 
CGCAGCTAAACGGCTCAAAGAAGCTCAAGCTCTTGAGTCTTCAAGGAAACGCGAGGCGGA 
GATGATAGCCCTTGGTTCAAGTTTCCAGTACGGTGGTGGCTCGAGCACAGGCTCTGGCTC 
CACCTCATCAAGACTTCAGCTTCAACCTTACCCTCTAAGCATTCAACAACCATTAGAGCC 
TTTTCTATCTCTTCAGAACAATGACATCTCTCATTACAACAACAACAATGCTCACGATTC 
CTCCTCTTTTAATCACCATAGCTATATCCAGACACAACTTCATCTCCACCAACAGACCAA 
CAATTACTTGCAGCAACAGTCGAGCCAGAACTCTC^ 

TAGCAATCCGGCTCTGCTTCATGGACTTGTCTCTACCTCTATCGTTGACAACAATAATAA 
CAATGGAGGCTCTAGTGGGAGCTAC^CAOTGC^GCATTTCTTGGGAACCACGGTATTGG 
TATTGGGTCCAGCTCGACTGTTGGATCGACCGAGGAGTTTCCAACCGTTAAAACAGATTA 
CGATATGCCTTCCAGTGATGGAACCGGAGGGTATAGTGGTTGGACCAGTGAGTCTGTTCA 
GGGGTGAAACCCTGGTGGTGTTTTCACTATGTGGAATGAGTAAACAAGGATCTCTTTCTT 
GCGGCACAAGGAATGGGT 

>G1793 Amino Acid Sequence (conserved domain in AA coordinates: 179-255, 281-34

MNSNNWLGFPLSPNNSSLPPHEYNLGLVSDHMDNPFQTQEWNMINPHGGGGDEGGEVPKV

ADFLGVSKPDENQSNHLVAYNDSDYYFHTNSLMPSVQSNDVVVAACDSNTPNNSSYHELQ

ESAHNLQSLTLSMGTTAGNNVVDKASPSETTGDNASGGALAVVETATPR 

IYRGVTRHRWTGRYEAHLWDNSCRREGQSRKGRQVYL^ 
STTTNFPITNYEKETOEMKHMTRQEFVAAI 

GRVAGNKDLYLGTFSTEEEAAEAYDIAAIKFRGLNAVTNFEINRYDVKAILESSTLPIGG

GAAKRLKEAQALESSRKREAEMIALGSSFQYGGGSSTGSGSTSSRLQLQPYPLSIQQPLE

PFLSLQNITOISHYNNIWAHDSSSFNHHSYI^^ 

HSNPALLHGLVSTSIVTJNNITO^^ 
YDMPSSDGTGGYSGWTSESVQGSNPGGVFTMWNE*
>G761 (521..1549)

GGGGCCGACCGGCCGCCCGGGCAGGTCTAGGTTCAAAAGGACTCACAAGAGAGAGATAGT 

ATGATTGATAGGGAAAGAGAGAGAGATGAAAGAAAGTAAAATATATAATAGATTATTAGG 

ACACGAGTGTCATCTTTTGATTTGTGTCTTGTGTGCTCTCTCTTTCTTCTCTTCCTCGAA 

TGATCATCTTTATATAACCCTACTCTCTTTCTCTTTTCCCATTCTTTCATATCATTCTCC 

CTTTCTCTCTCGGGATCTGATCTCTCTTTCCAGTAACCTATTCCCGAGGAGCACTGTCAA 

ATCTTGTCCACTCTTTGATCTTATCTCGATCTCTTTCTCTTTCTAGTCTTGTGTAGTCTT 

CAAACTTGTGATGTTATCTATATAGTAATCACGAGAGAGAATCATACAATAGCTGAAACA 

TAAAGCTTTCTTAGAAGCTTTAAAAAGGTCTCATCTGGATTATCCTGTTTAATTTCTAGA

GTTTCTTCAGGCAGATTATTAACCGATCAAGAAGACAAACATGAATTCATTTTCCCACGT 

CCCTCCGGGTTTTAGATTTCACCCGACAGATGAAGAACTTGTAGACTACTACCTGAGGAA 

AAAAGTCGCATCGAAGAGAATAGAAATTGATTTCATAAAGGACATTGATCTTTACAAGAT 

TGAGCCATGGGACCTTCAAGAGTTGTGCAAAATTGGGCATGAAGAGCAGAGTGATTGGTA 

CTTCTTTAGCCATAAAGACAAGAAGTATCCCACAGGGACTCGAACCAATAGAGCAACAAA 

AGCAGGGTTTTGGAAAGCCACCGGAAGAGATAAGGCTATCTATTTGAGGCATAGTCTAAT 

TGGCATGAGGAAAACACTTGTGTTTTACAAGGGAAGAGCCCCAAATGGACAAAAGTCTGA 

TTGGATCATGCACGAATACCGCTTAGAAACCGATGAAAACGGAACTCCTCAGGAAGAAGG 

ATGGGTTGTGTGTAGGGTTTTCAAGAAGAGATTGGCTGCAGTTAGACGAATGGGAGATTA 

CGACTCATCCCCTTCACATTGGTACGATGATCAACTTTCTTTTATGGCCTCCGAGCTCGA 

GACAAACGGTCAACGACGGATTCTCCCCAATCATCATCAGCAGCAGCAGCACGAGCACCA 

ACAACATATGCCATATGGCCTCAATGCATCTGCTTACGCTCTCAACAACCCTAACTTGCA 

ATGCAAGCAAGAGCTAGAACTACACTACAACCACCTGCAATCAAATATCGCGCATGAGGA 

ACAATTGAATCAAGGAAATCAGAACTTCAGCTCTCTATACATGAACAGCGGCAACGAGCA 

AGTGATGGACCAAGTCACAGACTGGAGAGTTCTCGATAAATTTGTTGCTTCTCAGCTAAG 

CAACGAGGAGGCTGCCACAGCTTCTGCATCTATACAGAATAATGCCAAGGACACAAGCAA 

TGCTGAGTACCAAGTTGATGAAGAAAAAGATCCGAAAAGGGCTTCAGACATGGGAGAAGA 

ATATACTGCTTCTACTTCTTCGAGTTGTCAGATTGATCTATGGAAGTGAGCTGAAAGAGA 

AGACATATAAATGCATATATACATATATATATATACGTACACACGAACACTAATCAAGTG 

TAGATGATGATGATGGTACAGATTTATATTTGCTTTGATTGATTCTTACTACATTATTGA 

ACTTATGTCATATGCATATATACATTGCGTATCTATGCATATTTATACTTGTACTCAATA 

TGATTAACCATATATAAACTCTAATCTAAATGTAACTCCAATATTTTTTAAATAGACAAT 
TGTCTCTTCTTATTAGAAAAAAAA 

>G761 Amino Acid Sequence (domain in AA coordinates: 10-156) 

MNSFSHVPPGFRFHPTDEELVDYYLRKKVASKRIEIDFIKDIDLYKIEPWDLQELCKIGH 

EEQSDWYFFSHKDKKYPTGTRTNRATKAGFWKATGRDKAIYLRHSLIGMRKTLVFYKGRA

PNGQKSDWIMHEYRLETDENGTPQEEGWVVCRVFKKRLAAVRRMGDYDSSPSHWYDDQLS

FMASELETNGQRRILPNHHQQQQHEHQQHMPYGLNASAYALNNPNLQCKQELELHYNHLQ 

SNIAHEEQLNQGNQNFSSLYMNSGNEQVMDQVTDWRVLDKFVASQLSNEEAATASASIQN

NAKDTSNAEYQVDEEKDPKRASDMGEEYTASTSSSCQIDLWK* 

>G1056 (10..798)

GCTACATATATGGGTTCTATTAGAGGAAACmTTGAAGAGCCTATATCTC^GTC^TTAACG 

AGGCAGAACTCTCTCTATAGCTTAAAGCTCCATGAGGTTCAAACCCACTTAGGAAGTTCT 

GGAAAACCACTAGGAAGCATGAACCTTGATGAGCTTCTCAAGACTGTCTTGCCACCAGCT 

GAGGAAGGGCTTGTTCGTCAGGGAAGCTTGACGTTACCTCGAGATCTCAGTAAAAAGACA 

GTTGATGAGGTCTGGAGAGATATCCAACAGGACAAGAATGGAAACGGTACTAGTACTACT 

ACTACTCATAAGCAGCCTACACTCGGTGAAATAACACTTGAGGATTTGTTGTTGAGAGCT 

GGTGTAGTGACTGAGACAGTAGTCCCTCAAGAAAATGTTGTTAACATAGCTTCAAATGGG 

CAATGGGTTGAGTATCATCATCAGCCTCAACAACAACAAGGGTTTATGACATATCCGGTO 

TGCGAGATGCAAGATATGGTGATGATGGGTGGATTATCGGATACACCACAAGCGCCTGGG 

AGGAAAAGAGTAGCTGGAGAGATTGTGGAGAAGACTGTTGAGAGGAGACAGAAGAGGATG 

ATCAAGAACAGAGAATCTGCAGCACGTTCACGAGCTAGGAAACAGGCTTATACACATGAA 

TTAGAGATCAAGGTTTCAAGGTTAGAAGAAGAAAACGAAAAACTTCGGAGGCTAAAGGAG 

GTGGAGAAGATCCTACCAAGTGAACCACCACCAGATCCTAAGTGGAAGCTCCGGCGAACA 

AACTCTGCTTCTCTCTGATCCTAAAGACTCTTCTTTCTTTCTTCTTCTTTGTGTTGGTTT 

ATATCAGACCGCTTTGTTCTTTGTATATTGTGTAGACTTTATTGACTTTGAACAGCATGT 
CTTTATAAACATTTCTTGAGTGT 
>G1056 Amino Acid Sequence (domain in AA coordinates: 183-246)
MGSIRGNIEEPISQSLTRQNSLYSLKLHEVQTHLGSSGKPLGSMNLDELLKTVLPPAEEG 
LVRQGSLTLPRDLSKKTVDEVWRDIQQDKNGNGTSTTTTHKQPTLGEITLEDLLLRAGVV
TETVVPQENVVNIASNGQWVEYHHQPQQQQGFMTYPVCEMQDMVMMGGLSDTPQAPGRKR

VAGEIVEKTVERRQKRMIKNRESAARSRARKQAYTHELEIKVSRLEEENEKLRRLKEVEK 

ILPSEPPPDPKWKLRRTNSASL* 

>G1447 (82..1086)

AAAAACCCTAACCCTAATTCTCTCAAGACAACTCAAAGGTCTCTCCTTTTTTAGGTTTAT 
TATCACTTCCGTATAATCGCCATGTCTTCTCTACCATGGAAAAAACCAAAATCGAGTCGA 
ATCTTAAGATTCATTTCTGAGTTTCAACAATCACCGTTCGTTGAAACTGGCTTTCCAACT 
TCTCTGATCGATCTCTTCTTCAAGAATCGCGATCGTCTAAAAAAATCTCCATCTAAACGC 
TTCCAACGAATCGAACGCCAGATTCGAACCGCTCCAAACGCTTCTTCGTTGAGTAATCAA 
GATACGATTTTTGAAAAGCCCTCGAGGATTAAAACCGTTCGAAGTAAGGTCGAGAAAGTT 
AATTGCGTTAAAGGTAAATCAGCGGCGTTGAAGAAGAACGCGATTAAAAATAGCGTTTTC 
GGCGGTAGCGGTGAGGTCGTTTTGATGGCGTTTAAGGTTTTGATAGTAGCGTTGCTCGCC 
TTGAGCACGAAGAAGAAGCTCACTTTAGGAATCACTCTCTCTGCCTTCGCTCTTCTCTTA 
ACAGAGCTCGTGGCGGCGCGTGTTTTCACGCGCTCTAATAACACCGACAAAGACAAAAAC 
GCGATTGCCCGCGAGAAAATCGAAACTTTTGATGAAACTCGAGTTCCCAAAGCGATTCCA 
TGTCCTGAGGAAACAGAGCATGTAGTATCTGAAACAGAGGTTTCGAAGTTGAAAGGTTTA 
ACGATACGTGATCTGTTGTCAAAGGACGAGAAATCAACAAGTAAAAGTTGGAGACTAAAA 
TCGAAGATTGTGAAGAAGTTGAGGAGTTACAATAAGAAGGATAAGAAGACGATGAAGATC 
AAAGAAGAGTCTTTGATTGAAGTCTCGAGTTTGGTTTTAGAAGATAAACCAAAGAAAATT
GAGTCTGAGAGAGACGAAGAAGAAACGTTGAATCCTCCAGTGGTTGGATCAAACCTGAAT 
GGGATTGTTCTGATCGTGATTGTGCTAACCGGTTTGTTATGTGGGAAGGTCTTAGCTATT 
GTTCTGACACTATCATGTTTGGTTCTTAGATTAGGAGCAGTCAAAAAAGTTAATCTTTGC
ATATAATTTTTTTTGTATTTTTTAACATGCTTGCATGTGAAACTGTAAATTTTTCTCATT 
CATATGAAGGAGATTGGATTGAATGTTGAATACTAAA 

>G1447 Amino Acid Sequence (domain in AA coordinates: 3-54, 124-156) 
MSSLPWKKPKSSRILRFISEFQQSPFVETGFPTSLIDLFFKNRDRLKKSPSKRFQRIERQ 
IRTAPNASSLSNQDTIFEKPSRIKTVRSKVEKVNCVKGKSAALKKNAIKNSVFGGSGEVV
LMAPKVLIVALLALSTKK3<LTLGIT^ 

ETFDETRVPKAIPCPEETEHVVSETEVSKLKGLTIRDLLSKDEKSTSKSWRLKSKIVKKL
RSYNKKDKKTMKIKEESLIEVSSLVLEDKPKKIESERDEEETLNPPVVGSNLNGIVLIVI
VLTGLLCGKVLAIVLTLSCLVLRLGAVKKVNLCI*
>G323 (77..826)

CTGCTCATATCAGCCATTGACACAGTTGCTTTGGGTTTCCCTCAAACGGCGCCGATTGTC 
TGGATTTTGACCACTGATGGCCTTAGATCAATCTTTTGAAGATGCTGCTTTACTTGGAGA 
ACTCTATGGAGAAGGTGCATTTTGTTTCAAGAGCAAGAAACCTGAACCCATTACAGTCTC 
GGTTCCTTCTGATGATACTGATGATTCGAATTTTGACTGCAATATTTGCTTAGACTCGGT 
GCAAGAACCTGTTGTGACTCTCTGTGGTC^CCTCTTT^ 

GCTTGATGTACAGAGCTTCTCAACAAGTGATGAATACCAAAGACATAGACAGTGTCCTGT 
TTGTAAATCTAAAGTTTCTCATTCTACTTTGGTTCCTTTGTATGGTAGAGGCCGTTGTAC 
TACTCAGGAGGAAGGTAAAAACAGTGTGCCTAAAAGACCCGTAGGACCGGTTTATCGGCT 
TGAAATGCCGAATTCACCTTATGCAAGTACTGATCTGCGGTTATCACAACGGGTTCATTT 
CAATAGCCCACAGGAAGGTTACTACCCTGTCTCAGGGGTGATGAGCTCGAACAGTTTATC 
ATACTCTGCTGTTTTGGATCCGGTGATGGTGATGGTTGGAGAAATGGTAGCTACGAGGTT 
GTTTGGAACACGAGTGATGGATAGATTTGCGTATCCGGACACTTACAATCTCGCAGGGAC 
TAGCGGGCCGAGGATGAGAAGGCGGATAATGCAGGCAGATAAATCGCTGGGAAGAATCTT 
CTTCTTCTTTATGTGTTGTGTTGTTCTGTGTCTTCTCTTGTTTTAGGTTTTCATAGCTAG 
CTTGGTTCTGCTACTGTTCAGTTTCTTCAGG 

>G323 Amino Acid Sequence (conserved domain in AA coordinates: 48-96)

MALDQSFEDAALLGELYGEGAFCFKSKKPEPITVSVPSDDTDDSNFDCNICLDSVQEPVV

TLCGHLFCWPCIHKWLDVQSFSTSDEYQRHRQCPVCKSKVSHSTLVPLYGRGRCTTQEEG 

KNSVPKRPVGPVYRLEMPNSPYASTDLRLSQRVHFNSPQEGYYPVSGVMSSNSLSYSAVL 

DPVMVMVGEMVATRLFGTRVMDRFAYPDTYNLAGTSGPRMRRRIMQADKSLGRIFFFFMC

CVVLCLLLF*

>G176 (41..1606)
AGAAGAAGAAGAAGAAGAGTACCTCATACGTAAACCATTGATGGGCTCTTTTGATCGCCA

AAGAGCTGTTCCGAAATTCAAAACAGCAACACCGTCACCGCTCCCTCTTTCTCCTTCGCC 

TTACTTCACTATGCCTCCTGGCCTTACTCCCGCCGACTTTCTCGACTCTCCTCTTCTCTT 

CACTTCCTCCAACATTTTGCCGTCTCCTACGACAGGCACATTTCCAGCGCAATCTCTGAA 

CTATAACAATAACGGTTTGCTCATTGACAAAAATGAAATCAAATATGAAGACACAACTCC 

TCCCTTGTTCCTACCATCTATGGTAACTCAGCCTTTACCTCAACTGGATTTATTCAAATC 

CGAAATCATGTCGAGTAACAAAACCTCTGATGACGGCTACAATTGGCGCAAATACGGGCA 

GAAGCAAGTCAAAGGAAGCGAAAACCCGAGGAGTTACTTCAAATGCACGTATCCAAATTG

TCTCACAAAGAAGAAAGTAGAGACGTCTCTTGTGAAGGGTCAGATGATTGAGATTGTCTA 

TAAAGGAAGCCACAATCATCCCAAGCCCCAATCCACGAAGCGATCATCTTCCACCGCTAT 

AGCAGCACATCAGAACAGCAGTAATGGAGACGGTAAAGACATTGGTGAAGATGAAACAGA

GGCCAAGAGATGGAAAAGAGAAGAGAATGTGAAGGAGCCAAGAGTGGTGGTTCAGACAAC 

AAGTGATATAGACATTCTTGACGATGGCTACAGATGGAGAAAGTATGGTCAGAAAGTCGT 

CAAGGGTAATCCAAATCCAAGGAGCTATTACAAGTGCACATTTACAGGATGTTTTGTAAG 

GAAACACGTTGAAAGAGCATTTCAAGATCCCAAGTCAGTGATCACAACTTACGAAGGAAA 

ACACAAACACCAAATCCCGACCCCAAGAAGAGGTCCAGTTTTAAGATCTGCTGCAATGGC 

TTCTCCTCTTCTCCCAACTTCGACTACTCCTGATCAACTTCCCGGCGGCGATCCACAGTT 

GCTGAGCTCTCTACGCGTCCTCTTGTCCCGCGTTCTAGCCACCGTCCGTCACGCTTCTGC 

AGATGCCAGACCCTGGGCAGAGCTCGTTGACCGGTCAGCGTTTTCCCGGCCACCATCGCT 

CTCGGAGGCAACGTCACGAGTAAGGAAGAACTTTTCCTATTTCCGAGCCAATTACATAAC 

CTTAGTGGCAATCTTACTCGCCGCGTCTCTGCTCACGCACCCTTTCGCTCTCTTCCTCCT 

CGCATCGCTGGCCGCTTCTTGGCTTTTCCTCTACTTTTTCCGTCCGGCGGATCAGCCGTT 

GGTCATTGGAGGACGCACGTTCTCCGATCTTGAGACGCTAGGGATACTCTGCCTGTCCAC 

TGTGGTGGTGATGTTCATGACCAGCGTTGGATCGCTCTTGATGTCCACTCTAGCAGTTGG 

GATCATGGGCGTGGCCATCCACGGAGCGTTTCGTGCTCCCGAAGACCTGTTTCTTGAAGA 

ACAAGAAGCCATTGGATCTGGACTTTTCGCATTCTTCAACAACAATGCCTCTAATGCAGC 

TGCCGCTGCCATAGCCACCTCAGCAATGTCACGCGTTCGAGTCTGAGATTGTTGAAGAGA 

CTACATTCCTACACCGCATTTCCAAAGTGTGATATTTATTCATATTGAATTGTT 

>G176 Amino Acid Sequence (domain in AA coordinates: 117-173,234-290) 

MGSFDRQRAVPKFKTATPSPLPLSPSPYFTMPPGLTPADFLDSPLLFTSSNILPSPTTGT 

FPAQSLNYNNNGLLIDKNEIKyEDTTPPLFLPSMVTQPLPQLDLFKSEIMSSNKTSDDGY 

NWRKYGQKQVKGSENPRSYFKCTYPNCLTKKKVETSLVKGQMIEIVYKGSHNHPKPQSTK 

RSSSTAIAAHQNSSNGDGKDIGEDETEAKRWKREENVKEPRVVVQTTSDIDILDDGYRWR

KYGQKVVKGNPNPRSYYKCTFTGCFVRKHVERAFQDPKSVITTYEGKHKHQIPTPRRGPV

LRSAAMASPLLPTSTTPDQLPGGDPQLLSSLRVLLSRVLATVRHASADARPWAELVDRSA 

FSRPPSLSEATSRVRKNFSYFRANYITLVAILLAASLLTHPFALFLLASLAASWLFLYFF

RPADQPLVIGGRTFSDLETLGILCLSTVVVMFMTSVGSLLMSTLAVGIMGVAIHGAFRAP

EDLFLEEQEAIGSGLFAFFNNNASNAAAAAIATSAMSRVRV* 

>G174 (194..1585)

CCCAATTTGAGATTGTTCGATTTCGATCTACGAGATTCTTACAAGAACATAAGCAGCTTC 
GGTTTTTTGGGATTATCTTATTTGGTCGGATGATGATCTTCTCGATGTCTGTGCTAGGCT 
TTGGGAATTAGATATATTTGGGGTTAAGCTCGAGTCTCTCCGGTTTTGAGTTTACTTGAG 
TTTGTTAGTATTTATGGCTGAGGTGGGAAAAGTTCTGGCTAGTGATATGGAGTTAGACCA 
TTCAAATGAGACTAAAGCAGTGGATGATGTTGTTGCCACTACTGATAAAGCGGAGGTCAT 
ACCAGTGGCTGTAACTAGAACTGAAACCGTTGTTGAAAGTTTGGAATCTACTGACTGTAA 
GGAGCTTGAAAAACTTGTTCCACATACGGTAGCTTCGCAGTCGGAAGTAGATGTTGCTTC 
CCCGGTATCCGAGAAAGCACCGAAGGTTTCTGAAAGTAGCGGTGCATTATCTTTGCAGTC
TGGTTCGGAAGGGAMAGTCCTTTTATTCGTGAGAAGGTTATGGAAGACGGATACAACTG 
GCGGAAATATGGACAGAAACTTGTGAAAGGAAATGAGTTTGTAAGGAGCTATTACAGGTG 
CACTCACCCTAACTGCAAAGCGAAAAAACAGTTGGAACGGTCTGCGGGTGGACAAGTCGT 
GGATACCGTTTACTTTGGGGAACATGATCACCCAAAGCCTCTTGCTGGTGCTGTTCCTAT 
CAATCAGGATAAGCGAAGTGATGTCTTCACAGCrGTTAGTAAAGAGAAAACATCTGGATC 
CAGTGTTCAGACACTTCGTCAAACCGAACCACCAAAGATCCATGGAGGATTACATGTTTC
AGTTATTCCACCAGCTGATGATGTGAAAACTGATATTTCACAATCAAGTAGGATAACGGG 
GGACAACACTCACAAGGATTATAATAGTCCTACCGCAAAGCGAAGGAAGAAAGGAGGGAA 
CATTGAGCTGAGTCCAGTGGAGAGGTCAACCAATGATTCACGCATTGTGGTTCACACTCA 
GACTCTGTTTGATATTGTGAATGATGGGTACCGATGGCGTAAATATGGTCAGAAATCAGT 
AAAAGGCAGCCCATATCCAAGGAGCTACTATAGATGTTCAAGCCCTGGATGCCCCGTCAA
GAAACACGTAGAGAGGTCATCTCATGACACAAAGTTGCTTATAACAACTTACGAGGGAAA 
ACACGACCACGATATGCCTCCAGGAAGAGTTGTTACTCATAATAACATGCTGGACTCGGA 
AGTTGATGATAAAGAAGGAGATGCCAACAAGACTCCACAGAGCTCAACTCTTCAATCCAT 
TACAAAAGACCAGCATGTCGAAGATCACTTAAGAAAGAAAACGAAGACTAATGGCTTTGA 
GAAAAGTCTTGATCAAGGTCCAGTTTTGGATGAGAAGCTGAAGGAGGAAATAAAAGAGAG 
ATCAGATGCAAACAAAGATCACGCAGCCAATCACGCCAAGCCGGAAGCAAAGTCAGATGA 
TAAAACCACTGTTTGTCAAGAGAAGGCAGTAGGAACCCTGGAGAGCGAGGAACAAAAACC 
CAAGACAGAGCCTGCCCAAAGCTAAGCATTCAGTGTTGTACCGAGTGGTAATTTATATGG 
CTGTTTTAACATAGATTAGTACAGGCGATATGGTTATAGACTGTACAGTTGTTGTTCAGG 
CGGGACCAGATTTAGATTAGTGTTTAATGGAATAGTATGCTTTAATACCTTTATGTAACC 
ACTTCCATTTGGTTCAAATAAGAGTTACAGGAAGAGAAGGTAACACAACAAGAGCCCTTC 
TTTGTTGATGGAGCCTGTGTAATAGTTGTAGCATGGGGATGTATATGATTTGATTCAACC 
TTATTAATGGTTATGAGACAAAACTATC 

>G174 Amino Acid Sequence (domain in AA coordinates: TBD) 

MAEVGKVLASDMELDHSNETKAVDDVVATTDKAEVIPVAVTRTETVVESLESTDCKELEK

LVPHTVASQSEVDVASPVSEKAPKVSESSGALSLQSGSEGNSPFIREKVMEDGYNWRKYG

QKLVKGNEFVRSYYRCTHPNCKAKKQLERSAGGQVVDTVYFGEHDHPKPLAGAVPINQDK

RSDVFTAVSKEKTSGSSVQTLRQTEPPKIHGGLHVSVIPPADDVKTDISQSSRITGDNTH 

KDYNSPTAKRRKKGGNIELSPVERSTNDSRIVVHTQTLFDIVNDGYRWRKYGQKSVKGSP

YPRSYYRCSSPGCPVKKHVERSSHDTKLLITTYEGKHDHDMPPGRVVTHNNMLDSEVDDK

EGDANKTPQSSTLQSITKDQHVEDHLRKKTKTNGFEKSLDQGPVLDEKLKEEIKERSDAN 

KDHAANHAKPEAKSDDKTTVCQEKAVGTLESEEQKPKTEPAQS*

>G715 (1..705) 

ATGGATACCAACAACCAGCAACCACCTCCCTCCGCCGCCGGAATCCCTCCTCCACCACCT 

GGAACCACCATCTCCGCCGCAGGAGGAGGAGCTTCTTACCACCACCTTCTCCAACAACAA 

CAACAACAGCTCCAACTATTCTGGACCTACCAACGCCAAGAGATCGAACAAGTTAACGAT 

TTCAAAAACCATCAGCTTCCACTAGCTAGGATAAAAAAGATCATGAAAGCCGATGAAGAT 

GTTCGTATGATCTCCGCAGAAGCACCGATTCTCTTCGCGAAAGCTTGTGAGCTTTTCATT 

CTCGAGCTCACGATCAGATCTTGGCTTCACGCTGAGGAGAATAAACGTCGTACGCTTCAG 

AAAAACGATATCGCTGCTGCGATTACTAGGACTGATATCTTCGATTTCCTTGTTGATATT 

GTTCCTAGAGATGAGATTAAGGACGAAGCCGCAGTCCTCGGTGGTGGAATGGTGGTGGCT 

CCTACCGCGAGCGGCGTGCCTTACTATTATCCGCCGATGGGACAACCAGCTGGTCCTGGA 

GGGATGATGATTGGGAGACCAGCTATGGATCCGAATGGTGTTTATGTCCAGCCTCCGTCT 

CAGGCGTGGCAGAGTGTTTGGCAGACTTCGACGGGGACGGGAGATGATGTCTCTTATGGT 

AGTGGTGGAAGTTCCGGTCAAGGGAATCTCGACGGCCAAGGGTAA 

>G715 Amino Acid Sequence (domain in AA coordinates: 60-132) 

MDTNNQQPPPSAAGIPPPPPGTTISAAGGGASYHHLLQQQQQQLQLFWTYQRQEIEQVND

FKNHQLPLARIKKIMKADEDVRMISAEAPILFAKACELFILELTIRSWLHAEENKRRTLQ

KNDIAAAITRTDIFDFLVDIVPRDEIKDEAAVLGGGMVVAPTASGVPYYYPPMGQPAGPG

GMMIGRPAMDPNGVYVQPPSQAWQSVWQTSTGTGDDVSYGSGGSSGQGNLDGQG* 

>G588 (196..1599)

ATCTGAAGTGAACCAAGCTCAGGTTTTGTCTTCTCTTTGATCATTCCTTTCTCAGCAATA 

TAAATTAGAGTTATATCCTTTATAAAGGATTTTGCTTTTTCACCAACAAACCCTAAATTC 

GGTGTCTCAGCAAGAATCACGTGATTCTCGTTCCTCTTCCTCACGAAACCCATCATCTTC 

TATCTCATTTGAGAAATGGGTCAAAAGTTTTGGGAGAATCAAGAAGATCGAGCGATGGTT 

GAATCCACCATAGGCTCTGAAGCTTGCGACTTTTTCATCTCAACAGCTTCAGCTTCCAAC 

ACTGCCTTGTCCAAGCTTGTCTCACCACCAAGTGATTCCAATCTCCAACAAGGGTTACGT 

CACGTTGTTGAAGGATCTGATTGGGATTATGCTCTTTTCTGGCTAGCGTCCM 

AGCTCTGATGGTTGTGTCTTGATCTGGGGAGATGGTCATTGCCGTGTCAAAAAGGGTGCT 

TCAGGTGAGGATTACTCTCAGCAAGATGAGATCAAAAGACGTGTGCTTCGCAAGCTTCAC 

TTGTCGTTCGTTGGTTCAGATGAAGATCATCGTTTGGTGAAATCAGGAGCTCTTACTGAT 

CTCGACATGTTTTATCTGGCTTCTTTGTACTTTTCCTTTAGGTGTGATACCAATAAGTAC 

GGTCCTGCTGGAACCTATGTGTCTGGGAAGCCTCTTTGGGCTGCAGATTTGCCTAGCTGC 

TTGAGTTATTATAGGGTTAGGTCTTTCTTAGCTAGGTCAGCTGGTTTTCAGACTGTGTTG 

TCTGTACCAGTGAATTCTGGAGTTGTGGAGCTTGGTTCTTTAAGACATATTCCAGAAGAT 

AAGAGTGTGATTGAGATGGTGAAATCAGTGTTTGGTGGGTCTGACTTTGTTCAGGCTAAA 
GAAGCTCCTAAAATCTTTGGTCGACAGCTGAGTCTTGGTGGAGCAAAACCTCGGTCTATG
AGTATTAATTTCTCCCCGAAGACCGAGGATGACACGGGTTTCTCATTGGAATCGTATGAG 
GTGCAAGCGATCGGAGGCTCTAATCAAGTGTATGGTTATGAGCAAGGGAAAGATGAGACA 
TTGTATCTAACTGACGAGCAAAAGCCGAGGAAGAGAGGGAGAAAACCAGCAAATGGAAGA 
GAAGAGGCTCTAAACCATGTGGAAGCGGAACGGCAGAGGAGGGAGAAGCTGAACCAGAGA 
TTCTACGCTTTGAGAGCGGTGGTGCCTAACATCTCCAAGATGGACAAGGCTTCGCTCCTT
GCAGACGCAATCACTTACATCACGGATATGCAGAAGAAAATCAGGGTGTATGAAACAGAG 
AAGCAGATAATGAAGAGGAGGGAGAGTAATCAGATAACTCCAGCAGAGGTTGATTATCAA 
CAGAGGCATGATGATGCAGTTGTAAGGCTAAGCTGTCCGTTGGAAACTCATCCAGTTTCA 
AAGGTGATACAAACGTTGAGGGAGAATGAAGTTATGCCTCATGATTCCAACGTGGCCATC 
ACAGAGGAGGGTGTGGTTCACACATTCACTCTCCGGCCTCAGGGTGGCTGCACCGCTGAG 
CAGTTGAAGGACAAGCTCCTTGCCTCTCTATCACAGTAACTATCACAGCAGTAACTGCTA
TGTAATAAGTGTAACCGTGTTGGAGGTTGTATCAATGTACTATTGCTVAGCCAACCAT^AAA 
AAACTCCAGCTTAGTAGGATCGTGTAATTTTCCTTATATGTAATGTTGAGATTTGTCTTT 
TACATATAAAGATTTGA 

>G588 Amino Acid Sequence (domain in AA coordinates: 309-376) 

MGQKFWENQEDRAMVESTIGSEACDFFISTASASNTALSKLVSPPSDSNLQQGLRHVVEG

SDWDYALFWLASNVNSSDGCVLIWGDGHCRVKKGASGEDYSQQDEIKRRVLRKLHLSFVG 

SDEDHRLVKSGALTDLDMFYLASLYFSFRCDTNKYGPAGTYVSGKPLWAADLPSCLSYYR

VRSFLARSAGFQTVLSVPVNSGVVELGSLRHIPEDKSVIEMVKSVFGGSDFVQAKEAPKI

FGRQLSLGGAKPRSMSINFSPKTEDDTGFSLESYEVQAIGGSNQVYGYEQGKDETLYLTD 

EQKPRKRGRKPANGREEALNHVEAERQRREKLNQRFYALRAVVPNISKMDKASLLADAIT

YITDMQKKIRVYETEKQIMKRRESNQITPAEVDYQQRHDDAVVRLSCPLETHPVSKVIQT 

LRENEVMPHDSNVAITEEGVVHTFTLRPQGGCTAEQLKDKLLASLSQ*

>G1758 (69..677)

GTCCCTCCTCTTAGCTTCAACCGCCGGAAAAACTAAACAACCTTCTTGGAAAAT^AAGAGA 

AACTAAAAATGAACTATCCTTCAAACCCTAACCCTAGCTCCACAGATTTCACTGAATTTT 

TCAAGTTCGATGATTTTGACGATACTTTTGAGAAGATCATGGAAGAAATCGGCCGTGAGG 

ACCACTCGTCGTCACCGACTTTGAGTTGGAGTTCATCGGAAAAGTTAGTGGCTGCAGAAA 

TCACAAGCCCGCTTCAAACAAGCCTAGCTACCTCACCTATGAGCTTTGAAATAGGTGACA 

AAGATGAAATCAAAAAGAGGAAGAGACACAAAGAAGATCCGATTATTCACGTCTTCAAAA 

CGAAATCATCAATTGATGAAAAGGTTGCTTTAGATGATGGGTATAAATGGAGGAAATACG 

GAAAGAAGCCGATAACGGGTAGTCCATTTCCAAGGCATTATCACAAGTGTTCGAGCCCAG

ATTGCAACGTGAAGAAGAAGATCGAAAGAGATACGAACAATCCAGATTACATATTGACAA 

CATACGAAGGTAGACATAACCACCCAAGCCCTTCTGTAGTTTATTGTGATTCAGACGACT 

TTGATCTTAACTCTCTCAACAATTGGTCCTTTCAGACGGCAAATACGTATAGTTTCTCTC 

ATTCTGCTCCATATTGATCGATCGTAGTTACAAGTTTGTGTATATAGATGTATATATATA 

TATCACCAATTCACCATCGTAATCACGTCTCACATGTAACTACGTACATATATCTTGTTC 

GGGGTTCGTTTTGTAATGTATTGAATTGGTGGAGGTAGAATGGAAGTCATCTTGTATAGT 

TGTACTTGTATGTAAGGTTTGATAGTCATTTTTTATAAAGTAACTAATTTGTACAA 

>G1758 Amino Acid Sequence (domain in AA coordinates: TBD) 

MNYPSNPNPSSTDFTEFFKFDDFDDTFEKIMEEIGREDHSSSPTLSWSSSEKLVAAEITS 

PLQTSLATSPMSFEIGDKDEIKKRKRHKEDPIIHVFKTKSSIDEKVALDDGYKWRKYGKK

PITGSPFPRHYHKCSSPDCNVKKKIERDTNNPDYILTTYEGRHNHPSPSVVYCDSDDFDL

NSLNNWSFQTANTYSFSHSAPY*

>G2148 (66..737)

GTCTCTAATATAAGCTTGAACGTTGCTATATATAAATGTAAAGGCGAACGCATAAGAAAA 

GAAAAATGGAGAATGAAGCTTTTGTAGATGGTGAATTGGAGTCTCTTTTGGGGATGTTCA 

ACTTTGATCAATGTTCATCTAACGAATCGAGCTTTTGCAATGCTCCAAATGAGACTGATG 

TTTTCTCTTCTGATGATTTCTTCCCATTTGGTACAATTCTGCAAAGTAACTA 

TTCTTGATGGTTCCAACCACCAAACGAACCGAAATGTCGACTCAAGACAAGATCTGTTGA 

AACCAAGGAAGAAGCAAAAGTTAAGOTCGGAAAGCAATTTGGTTACCGAGCCTAAGACTG 

CTTGGAGAGATGGTCAAAGCCTAAGCAGTTATAATAGTTCAGATGATGAAAAGGCTTTAG 

GTTTAGTGTCTAATACATCAAAAAGCCTAAAACGCAAAGCGAAAGCCAACAGAGGGATAG 

CTTCCGATCCTCAGAGCCTATACGCTAGGAAACGAAGAGAAAGGATAAACGATAGGCTAA 

AGACATTGCAGAGCCTAGTTCCTAATGGGACAAAGGTCGATATAAGCACAATGCTGGAAG 

ATGCTGTCCATTACGTGAAGTTCCTGCAGCTTCAAATCAAGCTCTTGAGTTCAGAAGATC 
TATGGATGTATGCACCTCTTGCTCACAATGGTCTGAATATGGGACTACATCACAATCTTT

TGTCTCGGCTTATTTAAGACAAAATCATTGGAATAACATAACTTACAGTACTTGTTTTTT 

TTCTCGTTCTATATTCATGATTATGGTTATTTTTTGTTTGAGTTGTTCAATTTTTCTGTC 

TATTGCGTTCTATGAACTTGACACTCTTTTTGTAATTATTATATGCTAAAGACAATTTGG
ACTAACAGCATTTTAATAAAAAAAAAAAA

>G2148 Amino Acid Sequence (conserved domain in AA coordinates: 130-268)

MENEAFVDGELESLLGMFNFDQCSSNESSFCNAPNETDVFSSDDFFPFGTILQSNYAAVL

DGSNHQTNRNVDSRQDLLKPRKKQKLSSESNLVTEPKTAWRDGQSLSSYNSSDDEKALGL 

VSNTSKSLKRKAKANRGIASDPQSLYARKRRERINDRLKTLQSLVPNGTKVDISTMLEDA

VHYVKFLQLQIKLLSSEDLWMYAPLAHNGLNMGLHHNLLSRLI* 

>G2379 (52..798)

CGCCGTCACTCTCCTCCCGGTGCCGCACATTAGCAACACTACTCCCGACGAATGGAGACG 

ACGACGCCGCAGTCAAAATCAAGTGTGTCCCACCGACCGCCGTTGGGAAGAGAAGACTGG 

TGGAGTGAGGAAGCGACGGCGACGCTGGTAGAAGCCTGGGGCAATCGTTACGTCAAGCTG 

AACCACGGAAATCTCCGGCAGAATGACTGGAAAGACGTCGCCGACGCCGTTAACTCTAGA 

CACGGTGATAACAGCCGTAAGAAGACCGACTTACAGTGTAAGAACCGGGTCGATACTTTG 

AAGAAGAAGTACAAAACAGAGAAAGCTAAACTCTCGCCGTCGACTTGGCGTTTCTATAAC 

CGCCTCGATGTTCTAATCGGTCCCGTTGTGAAGAAATCGGCTGGCGGAGTTGTCAAATCA 

GCGCCTTTTAAGAATCATCTGAATCCAACTGGATCGAACTCTACTGGAAGCTCTCTTGAA 

GATGATGATGAGGATGATGATGAGGTTGGTGATTGGGAATTCGTTGCTAGGAAGCATCCT 

CGTGTGGAAGAGGTAGATCTGAGTGAAGGATCAACGTGTAGGGAACTAGCTACGGCGATT 

CTCAAGTTTGGAGAAGTTTACGAGAGAATTGAAGGGAAGAAGCAACAGATGATGATTGAG 

TTGGAGAAGCAGAGAATGGAAGTGACAAAGGAGGTAGAGTTAAAACGTATGAACATGTTG

ATGGAGATGCAGTTAGAGATTGAGAAATCAAAGCACCGGAAACGCGCAAGTGCTTCAGGT 
AAGAAGAACTCACATTAGG 

>G2379 Amino Acid Sequence (domain in AA coordinates: 19-110, 173-232)

METTTPQSKSSVSHRPPLGREDWWSEEATATLVEAWGNRYVKLNHGNLRQNDWKDVADAV 
NSRHGDNSRKKTDLQCKNRVDTLKKKYKTEKAKLSPSTWRFYNRLDVLIGPVVKKSAGGV

VKSAPFKNHLNPTGSNSTGSSLEDDDEDDDEVGDWEFVARKHPRVEEVDLSEGSTCRELA

TAILKFGEVYERIEGKKQQMMIELEKQRMEVTKEVELKRMNMLMEMQLEIEKSKHRKRAS

ASGKKNSH* 

>G1462 (63..1031)

CGTCGACCATTCTTGCGATTGATCTTTCTCTAGATAATTTTTTTGATCGATTTAGTTTCA 

TTATGGAGGACGACGACGCAGCTTATGATCTAATCAAACACGAACTGTTATACTCAGAAG 

ACGAAGTAATAATCTCACGTTATCTGAAGGGTATGGTCGTTAACGGAGATTCTTGGCCAG 

ATCACTTCATCGAAGACGCAAACGTGTTCACCAAGAATCCAGATAAGGTGTTCAATTCTG 

AGAGACCTAGATTCGTGATCGTTAAACCACGAACAGAGGCTTGTGGTAAAACCGATGGAT 

GTGATTCGGGTTGCTGGAGGATCATTGGTCGTGATAAACTGATAAAGTCGGAGGAGACTG 

GGAAGATTCTAGGGTTCAAGAAGATACTCAAGTTTTGCCTAAAGAGGAAACCTATAGACT 

ACAAGAGAAGTTGGGTAATGGAAGAGTATAGGCTTACCAATAACTTGAACTGGAAGCAAG 

ATCATGTGATTTGCAAAATTCGGTTTATGTTTGAAGCTGAAATTAGTTTCTTGCTAAGCA

AGCATTTCTACACTACATCAGAATCGGTTCTTGAAAATGAGCTGTTGCCATCTTATGGAT 

ATTATTTATCCAATACACAAGAGGAGGATGAATTTTATCTGGACGCGATAATGACTTCGG 

AAGGAAACGAGTGGCCTAGCTACGTTACCAACAACGTGTACTGTCTGCATCCATTGGAGC 

TTGTGGATCTTCAAGATCGGATGTTTAATGATTACGGAACCTGCATCTTCGCTAACAAGA 

CTTGTGGTGAAACTGATAAATGCGATGGTGGTTACTGGAAGATCCTGCACGGTGATAAGC 

TGATCAAGTCAAATTTCGGAAAGGTCATTGGTTTCAAGAAGGTATTTGAGTTCTATGAAA 

CGGTGAGACAAATATATCTTTGTGATGGAGAAGAAGTGACGGTAACTTGGACTATACAAG 

AGTATAGGCTTAGCAAAAACGTGAAGCAGAATAAAGTGTTGTGCGTTATCAAGTTGACTT 

ATGATAGATAGGATACTTTACTTTGGTTTTTGTGATCATCTTAGTATC^ 

TAGATACACACATCTATAGGCGACCGCTCTAGACAGGCCTCGTACCG 

>G1462 Amino Acid Sequence (domain in AA coordinates: TBD) 

MEDDDAAYDLIKHELLYSEDEVIISRYLKGMVVNGDSWPDHFIEDANVFTKNPDKVFNSE 

RPRFVIVKPRTEACGKTDGCDSGCWRIIGRDKLIKSEETGKILGFKKILKFCLKRKPIDY 

KRSWVMEEYRLTNNLNWKQDHVICKIRFMFEAEISFLLSKHFYTTSESVLENELLPSYGY 

YLSNTQEEDEFYLDAIMTSEGNEWPSYVTNNVYCLHPLELVDLQDRMFNDYGTCIFANKT

CGETDKCDGGYWKILHGDKLIKSNFGKVIGFKKVFEFYETVRQIYLCDGEEVTVTWTIQE 
YRLSKNVKQNKVLCVIKLTYDR*
>G1211 (44..1120)

TGAAACCTAGATTTCTGCAACTGAATTCCTAATTCGAAAAAGAATGGAGGGTTCGTCGTC

GACGATAGCAAGGAAGACATGGGAACTAGAGAACAGCATTCTAACAGTAGACTCACCTGA 

TTCAACCTCCGACAACATCTTCTACTACGACGATACTTCACAGACTAGGTTCCAGCAAGA 

GAAACCGTGGGAGAATGATCCTCACTACTTTAAACGAGTCAAGATCTCAGCGCTCGCTCT 

TCTTAAGATGGTGGTTCACGCTCGCTCTGGTGGTACAATTGAAATAATGGGTCTTATGCA

AGGTAAGACCGATGGTGATACTATCATTGTTATGGATGCTTTTGCTTTACCAGTGGAAGG 

TACTGAGACAAGGGTTAATGCTCAGGATGATGCTTATGAGTACATGGTTGAGTATTCACA 

GACCAACAAGCTCGCGGGGCGGCTGGAGAATGTTGTTGGATGGTATCACTCTCACCCTGG 

ATATGGATGCTGGCTCTCCGGTATTGATGTTTCTACGCAGACGCTTAACCAACAGCATCA 

GGAGCCATTTTTAGCTGTTGTTATTGATCCCACAAGGACTGTTTCAGCTGGTAAGGTTGA 

GATTGGTGCTTTCAGAACATACTCTAAAGGATATAAGCCTCCAGATGAACCTGTTTCTGA

GTATCAAACTATTCCTTTAAATAAGATTGAGGACTTTGGTGTTCACTGCAAACAGTACTA 

TTCATTAGATGTCACTTATTTCAAGTCATCTCTTGATTCTCACCTTCTGGATCTACTATG 

GAACAAGTACTGGGTGAACACTCTTTCTTCTTCTCCACTGCTGGGTAATGGAGACTATGT 

TGCTGGACAAATATCAGACTTAGCTGAGAAGCTTGAGCAAGCCGAGAGTCATCTGGTTCA 

GTCTCGCTTTGGAGGAGTTGTGCCATCATCCCTTCATAAGAAAAAAGAAGATGAGTCTCA 

ACTAACTAAGATAACTCGGGATAGCGCAAAGATAACTGTGGMCAGGTCCATGGACTAAT 

GTCGCAGGTCATAAAAGATGAATTATTCAACTCAATGCGTCAGTCCAACAACAAATCTCC 

CACTGACTCGTCGGATCCAGACCCTATGATTACATATTGAAGTTGCTCTTCTTTTGGTTT 

CTANTTTTGGATTGACCCATCATTTGTTGTCCTTTCATTTATTTTCTGTTGTGTAAAGAA 

TTATAATGNCGNCGCGAATTCGCGGCCGCTAAAAAAANACAGGAT^ATTGAAAANAATTCN 

NCCATTCCAACATCTTTATTTAATATTATCTCCTCNATTATATAATATTCAAACATCCCT 

ANTANCTTCATTTGACCGTCCCCCTCCCTCCCGTGTTGCNTTGGTGCTGGCCCC 

>G1211 Amino Acid Sequence (domain in AA coordinates: 123-179) 

MEGSSSTIARKTWELENSILTVDSPDSTSDNIFYYDDTSQTRFQQEKPWENDPHYFKRVK

ISALALLKMVVHARSGGTIEIMGLMQGKTDGDTIIVMDAFALPVEGTETRVNAQDDAYEY

MVEYSQTNKLAGRLENVVGWYHSHPGYGCWLSGIDVSTQTLNQQHQEPFLAVVIDPTRTV

SAGKVEIGAFRTYSKGYKPPDEPVSEYQTIPLNKIEDFGVHCKQYYSLDVTYFKSSLDSH 

LLDLLWNKYWVNTLSSSPLLGNGDYVAGQISDLAEKLEQAESHLVQSRFGGVVPSSLHKK

KEDESQLTKITRDSAKITVEQVHGLMSQVIKDELFNSMRQSNNKSPTDSSDPDPMITY* 

>G1048 (5..892)

GACCATGGCGGAGGAATTTGGAAGCATAGATTTACTCGGAGATGAAGATTTCTTCTTCGA 
TTTCGATCCTTCAATCGTAATTGATTCTCTTCCGGCGGAGGATTTTCTTCAGTCTTCACC 
GGATTCATGGATCGGAGAAATCGAGAATCAATTGATGAACGATGAGAATCATCAAGAGGA 
GAGTTTTGTGGAATTGGATCAGCAATCGGTTTCAGATTTCATAGCGGATCTACTCGTTGA 
TTATCCAACTAGCGATTCTGGCTCCGTTGATTTGGCGGCTGATAAAGTTCTAACCGTCGA 
TTCTCCCGCCGCCGCTGATGATTCCGGGAAGGAGAATTCGGATTTGGTTGTTGAGAAGAA 
GTCTAATGATTCTGGTAGCGAGATTCATGATGATGATGACGAAGAAGGAGACGATGATGC 
TGTGGCTAAAAAACGAAGAAGGAGAGTAAGAAATAGAGATGCGGCGGTTAGATCGAGAGA 
GAGGAAGAAGGAATATGTACAAGATTTAGAGAAGAAGAGTAAGTATCTCGAAAGAGAATG
CTTGAGACTAGGACGTATGCTTGAGTGCTTCGTTGCTGAAAACCAGTCTCTACGTTACTG 
TTTGCAAAAGGGTAATGGCAATAATACTACCATGATGTCGAAGCAGGAGTCTGCTGTGCT 
CTTGTTGGAATCCCTGCTGTTGGGTTCCCTGCTTTGGCTTCTGGGAGTAAACTTCATTTG 
CCTATTCCCTTATATGTCCCACACAAAGTGTTGCCTCCTACGTCCAGAACCAGAAAAGCT 
GGTTCTAAACGGGCTCGGGAGTAGTAGCAAACCGTCTTATACCGGCGTTAGTCGGAGATG 
TAAGGGTTCGAGGCCTAGGATGAAATACCAAATCTTAACCCTTGCGGCGTGACAACGCCT
TTTTTAACTGCITCTTTTGCGC^TT 

TCTTGTTTTGTATTTCGCTGTTGAAAGTTTTCTGTCTAATATCGATAAGTTAACAGTGAA 
AAAAAAAAAAAAAAA 

>G1048 Amino Acid Sequence (domain in AA coordinates 138-190) 

MAEEFGSIDLLGDEDFFFDFDPSIVIDSLPAEDFLQSSPDSWIGEIENQLMNDENHQEES

FVELDQQSVSDFIADLLVDYPTSDSGSVDLAADKVLTVDSPAAADDSGKENSDLVVEKKS

NDSGSEIHDDDDEEGDDDAVAKKRRRRVRNRDAAVRSRERKKEYVQDLEKKSKYLERECL

RLGRMLECFVAENQSLRYCLQKGNGNNTTMMSKQESAVLLLESLLLGSLLWLLGVNFICL

FPYMSHTKCCLLRPEPEKLVLNGLGSSSKPSYTGVSRRCKGSRPRMKYQILTLAA*
>G986 (31.. 846)

CATTAAATTGGCTCCTGTGAACCTAAATTTATGGACTATGATCCCAACACCAATCCGTTC 

GACCTTCATTTCTCCGGTAAACTTCCGAAAAGAGAAGTCTCGGCTTCAGCTTCTAAAGTT 

GTAGAGAAGAAATGGTTAGTGAAAGATGAGAAGAGAAATATGCTACAAGATGAAATAAAC

CGGGTTAATTCGGAGAACAAGAAGCTAACCGAAATGTTAGCAAGAGTCTGTGAGAAGTAC 

TATGCTCTTAATAATCTTATGGAGGAGTTGCAGAGTCGAAAGAGTCCTGAAAGTGTTAAC 

TTTCAGAACAAACAGCTAACGGGGAAACGAAAACAAGAACTTGATGAGTTTGTTAGCTCC 

CCAATTGGACTCAGTCTCGGACCAATCGAGAACATCACCAACGATAAAGCGACGGTTTCA 

ACCGCTTACTTTGCTGCTGAGAAGTCTGACACAAGCTTGACTGTGAAAGATGGATATCAA 

TGGAGGAAATACGGGCAAAAGATTACGAGAGATAATCCATCTCCTAGAGCTTACTTCAGA 

TGCTCGTTTTCACCGTCTTGTCTAGTCAAGAAGAAGGTGCAACGAAGTGCAGAAGATCCA

TCTTTCTTGGTAGCCACTTACGAAGGGACACATAACCACACCGGACCACATGCAAGTGTG 

TCCAGGACAGTGAAACTTGATCTAGTTCAAGGTGGGCTTGAACCAGTTGAGGAAAAGAAA 

GAGAGAGGGACGATTCAAGAGGTTTTGGTGCAACAAATGGCTTCTTCGTTGACCAAAGAT 

CCTAAGTTCACTGCAGCTCTTGCGACTGCTATTTCCGGGAGATTGATAGAGCATTCAAGA 

ACATGAAAGTTCTCTAGAACATGTATATTTCTGTTTTGTTCTATTTTGTTGCTCATTCCT 

AGTAAAAAGGTAAAGATTTGTTTGATCTTGATTAGGAGGCATAGATGTCAATTTTAATGT 

GTGTGTATATAATTACATCAAATCTAAGTATCCAAAAAGGGTCACCCCCATTTTATCTTA 
TG 

>G986 Amino Acid Sequence (domain in AA coordinates: 146-203)

MDYDPNTNPFDLHFSGKLPKREVSASASKVVEKKWLVKDEKKNMLQDEINRVNSENKKLT

EMLARVCEKYYALNNLMEELQSRKSPESVNFQNKQLTGKRKQELDEFVSSPIGLSLGPIE 

NITNDKATVSTAYFAAEKSDTSLTVKDGYQWRKYGQKITRDNPSPRAYFRCSFSPSCLVK 

KKVQRSAEDPSFLVATYEGTHNHTGPHASVSRTVKLDLVQGGLEPVEEKKERGTIQEVLV

QQMASSLTKDPKFTAALATAISGRLIEHSRT* 

>G789 (259..1593)

GGCAAGAAGAACCTTAGCCTCTCTTTCTTCTTTCTCTCTCTCTCTCTCTGTGGTACTGTT 
CTGTTTCAACTTTACTCCCTCAGTTTCAGAACAATTCCCTATCTAGAAGAGAGATAAAAC 
CGAGAAGGTTTTGGAGATAGAATCTTTTGTTCTTCTTTTGTCCCTCCTTGCTCGATTTTT
GTTACGTGTGAAGCAATAAAAAAAAACTGATATAGCTAAATCTTCCATCCATTCAGAGGC 
TTCTAAATCTGATCTGACATGGAACAAGTGTTTGCTGATTGGAATTTTGAAGATAATTTT 
CACATGTCCACTAATAAAAGATCAATCAGACCAGAAGATGAATTAGTGGAGCTATTGTGG 
AGAGATGGTCAAGTGGTTTTACAAAGCCAAGCTCGTAGAGAACCGTCAGTCCAAGTCCAA 
ACCCACAAACAAGAAACCCTAAGAAAACCCAACAATATTTTTCTTGACAACCAAGAAACA 
GTACAAAAGCCTAACTACGCTGCTCTAGATGATCAAGAAACCGTCTCCTGGATACAATAC 
CCTCCGGATGACGTCATCGACCCTTTCGAATCCGAGTTCTCCTCTCATTTCTTCTCTTCG 
ATCGATCACCTCGGAGGTCCTGAGAAGCCACGAACGATCGAAGAGACAGTTAAGCATGAG 
GCTCAAGCCATGGCTCCTCCTAAGTTTAGATCCTCGGTTATAACAGTCGGACCGAGTCAT 
TGCGGCAGCAACCAGTCAACAAATATTCATCAGGCCACTACACTTCCGGTTTCTATGAGT 
GATAGAAGCAAGAACGTCGAAGAAAGACTTGACACTTCGTCAGGTGGCTCCTCCGGTTGC 
AGCTATGGAAGGAACAACAAAGAAACCGTTAGTGGAACAAGTGTAACCATTGACCGTAAA 
AGAAAACATGTTATGGATGCTGATCAAGAATCTGTGTCTCAATCAGATATAGGTTTGACC
TCAACCGATGATCAAACCATGGGTAACAAATCGAGCCAACGGTCAGGATCTACTCGAAGA 
AGCCGTGCAGCTGAAGTTCATAATCTCTCAGAAAGGAGGAGGAGAGATCGGATCAATGAA 
AGAATGAAAGCTCTTCAAGAACTCATACCTCACTGCAGCAGAACAGATAAAGCTTCGATA 
TTGGATGAAGGAATTGATTACTTAAAATCACTTCAAATGCAACTCCAAGTGATGTGGATG 
GGAAGTGGAATGGCGGCGGCGGCAGCAGCAGCAGCAAGTCCGATGATGTTTCCCGGGGTA 
CAATCATCTCCATACATTAATCAGATGGCTATGCAAAGTCAGATGCAATTGTCTCAATTC 
CCGGTTATGAACCGGTCCGCTCCGCAGAACCATCCCGGTTTAGTATGTCAAAACCCGGTA 
CAGTTGCAGCTCCAAGCACAGAACCAAATCTTATCGGAGCAGCTCGCTAGGTACATGGGC 
GGGATTCCCCAGATGCCGCCGGCGGGAAATCAGATGCAGACCGTGCAACAACAACCAGCG
GACATGTTGGGATTTGGATCTCCGGCGGGACCGCAAAGTCAACTGTCGGCACCGGCGACC 
ACCGACAGTCTTCATATGGGTAAAATAGGCTGACTTGGGATATAGTTTTCCTCCGAAATT 
ATTCTTCTTACAGTTGGTGATTGTTATTTATTTTTGGTCGCCTAAGCAAGCATAAAAGCT 
AAGTCAAATGTATTATAGAGATCTAATAAGTTAGTCTCATACTTATAACTTATTTTTAAA 
CAGTTGAATTATAGTATCAATCAAGTGTTGGGAACCTAAAGATCATACATGTGTCAATAC 
TTTTATATTTGTTCTCAAGGTTCATCAGAAAAACAAAATAAAAAGGATAGACTAGGCCTG 






CATTTGACATTATCATGGGCTTTTTTGGGTCTATGAATATGAACATTAACCCC 

>G789 Amino Acid Sequence (domain in AA coordinates: 253-313) 

MEQVFADWNFEDNFHMSTNKRSIRPEDELVELLWRDGQVVLQSQARREPSVQVQTHKQET

LRKPNNIFLDNQETVQKPNYAALDDQETVSWIQYPPDDVIDPFESEFSSHFFSSIDHLGG 

PEKPRTIEETVKHEAQAMAPPKFRSSVITVGPSHCGSNQSTNIHQATTLPVSMSDRSKNV 

EERLDTSSGGSSGCSYGRNNKETVSGTSVTIDRKRKHVMDADQESVSQSDIGLTSTDDQT 

MGNKSSQRSGSTRRSRAAEVHNLSERRRRDRINERMKALQELIPHCSRTDKASILDEAID 

YLKSLQMQLQVMWMGSGMAAAAAAAASPMMFPGVQSSPYINQMAMQSQMQLSQFPVMNRS 

APQNHPGLVCQNPVQLQLQAQNQILSEQLARYMGGIPQMPPAGNQMQTVQQQPADMLGFG

SPAGPQSQLSAPATTDSLHMGKIG* 

>G2085 (1..930) 

ATGTTTGGTCGCCATTCGATTATCCCAAATAACCAGATTGGTACCGCCTCTGCTTCCGCT 
GGTGAAGACCATGTCTCTGCCTCCGCTACGTCTGGTCACATTCCTTACGACGATATGGAA 
GAAATCCCTCATCCTGACTCTATCTATGGTGCTGCCTCCGATTTGATTCCCGATGGCTCT 
CAATTGGTTGCTCACCGATCCGATGGCTCTGAATTACTTGTTTCTCGGCCACCGGAAGGG 
GCGAATCAGCTTACGATCTCGTTCCGTGGACAAGTTTACGTTTTTGATGCCGTTGGTGCT 
GACAAGGTGGATGCTGTGTTGTCGCTGTTGGGTGGTTCTACTGAGCTTGCTCCTGGTCCG 
CAGGTGATGGAACTAGCTCAACAGCAGAATCATATGCCTGTTGTAGAATATCAGAGCCGC 
TGTAGCCTTCCGCAACGGGCACAATCCTTGGATAGGTTTCGGAAGAAGAGGAATGCTAGA
TGTTTCGAGAAGAAAGTAAGATACGGTGTTCGCCAAGAAGTTGCCTTAAGAATGGCACGT
AATAAAGGTCAATTCACCTCTTCAAAGATGACAGATGGGGCTTATAACTCTGGCACAGAT 
CAAGATTCTGCCCAAGATGATGCCCATCCAGAAATATCGTGTACTCATTGCGGCATTAGT 
TCCAAATGTACACCAATGATGCGACGTGGCCCTTCCGGCCCCAGGACTCTCTGCAATGCC 
TGTGGACTTTTTTGGGCTAACAGGGGTACATTGAGGGATCTCTCAAAGAAAACAGAAGAG
AATCAGTTGGCTTTAATGAAACCGGATGATGGTGGGAGTGTTGCTGATGCTGCTAACAAC 
TTAAACACTGAAGCTGCAAGTGTTGAAGAACACACTTCCATGGTTTCTCTTGCCAATGGG 
GATAATTCTAATCTGTTAGGTGATCACTAA 

>G2085 Amino Acid Sequence (domain in AA coordinates: TBD) 

MFGRHSIIPNNQIGTASASAGEDHVSASATSGHIPYDDMEEIPHPDSIYGAASDLIPDGS 

QLVAHRSDGSELLVSRPPEGANQLTISFRGQVYVFDAVGADKVDAVLSLLGGSTELAPGP

QVMELAQQQNHMPVVEYQSRCSLPQRAQSLDRFRKKRNARCFEKKVRYGVRQEVALRMAR

NKGQFTSSKMTDGAYNSGTDQDSAQDDAHPEISCTHCGISSKCTPMMRRGPSGPRTLCNA 

CGLFWANRGTLRDLSKKTEENQLALMKPDDGGSVADAANNLNTEAASVEEHTSMVSLANG

DNSNLLGDH* 

>G1783 (1..603) 

ATGGCCGCGTTTCCGCAGTGGACAAGGGTCGATGACAAACGTTTTGAGTTAGCTCTGCTT 
CAAATCCCGGAGGGTTCGCCGAATTTTATAGAGAATATCGCCTATTATCTCCAGAAACCG 
GTGAAGGAGGTGGAGTACTACTACTGCGCGTTGGTCCATGATATTGAGCGGATCGAATCG 
GGTAAGTATGTTTTGCCCAAATACCCGGAAGACGATTACGTGAAACTGACGGAAGCAGGT 
GAGTCTAAGGGCAATGGGAAAAAGACGGGAATTCCTTGGTCAGAAGAGGAACAGAGGTTG 
TTTCTGGAAGGACTAAATAAGTTTGGGAAAGGAGACTGGAAGAACATATCGAGGTATTGT
GTGAAGTCAAGGACCTCGACGCAAGTGGCAAGCCATGCTCAGAAGTATTTTGCAAGGCAA 
AAGCAGGAGAGTACGAATACTAAACGCCCGAGTATTCATGACATGACTCTGGGAGTTGCG 
GTCAATGTCCCTGGATCCAACTTGGAGTCTACTC 

ATTCCTTCGAATCAATATTATCCCTCCCAGGAAAACTTTCGGGGTTTTGATCAGCGATGG 
TGA 

>G1783 Amino Acid Sequence (domain in AA coordinates: 81.. 129) 
MAAFPQWTRVDDKRFELALLQIPEGSPNFIENIAYYLQKPVKEVEYYYCALVHDIERIES
GKYVLPKYPEDDYVKLTEAGESKGNGKKTGIPWSEEEQRLFLEGLNKFGKGDWKNISRYC
VKSRTSTQVASHAQKYFARQKQESTNTKRPSIHDMTLGVAVNVPGSNLESTGQQPHFGDQ
IPSNQYYPSQENFRGPDQRW*

>G2072 (155..793)

TCGACCCACGCGTCCGCCC^CGCGTCCGGATCTTTTC^CAGAAGACCAACC^GCTTGGCT
CGATGAGCTCCTAAGTGAGCCAGCATCACCTAAGATTAACAAAGGTCATAGACGTTCAGC
TAGTGACACAGCTGCTTACTTGAACTCAGCTTTAATGCCTTCGAAGGAAAATCATGTTGC
TGGTTCGTCTTGGCAGTTCCAGAACTATGATTTGTGGCAGTCCAACTCTTATGAACAACA
CAATAAATTAGGATGGGATTTCTCTACAGCAAATGGAACTAATATCCAAAGAAATATGTC
ATGCGGAGCTTTAAATATGTCGTCGAAACCCATTGAGAAACATGTAAGCAAAATGAAAGA
AGGAACTTCTACAAAACCAGATGGTCCTAGATCAAAGACTGACTCAAAACGTATCAAACA
TCAAAATGCTCATCGAGCGCGTTTGAGAAGGCTTGAGTACATATCAGACCTTGAAAGGAC 
CATCCAAGTGCTACAAGTTGAAGGATGTGAAATGTCATCTGCCATTCACTACTTGGATCA 
GCAGTTACTCATGCTTAGCATGGAAAATAGAGCTTTAAAACAACGTATGGATAGTTTAGC 
AGAAATCCAAAAGCTTAAACATGTGGAGCAGCAATTGCTTGAGAGAGAGATAGGAAACCT 
ACAGTTTCGACGACACCAACAACAACCACAGCAAAACCAAAAACAAGTCCAAGCAATACA 
AAATCGATACACCAAATATCAACCACCTGTTACACAAGAACCCGATGCCCAATTTGCAGC 
CTTGGCAATATGATTTAGGAAATATGGATACATTGTTCAGATTAAGCTGAGCTCCTCTTG 
CTCTACCTTAATGTCCATACAACATAGGTGAACTTGATGTTTGTAGCCTTGAATGAAAAC 
CTAAAAAAGCATCGTTATGTAAATCAAAATGTGGTTGCCCATATCCTCCTCTATTGCATT 
TCTCTCTATTTATGGCATGGTAGAGAATCTCTTGTCAAGAAACTTCATGTTATGTAATAA 

AAAAAAAAAAAAAAAAAAA 

>G2072 Amino Acid Sequence (conserved domain in AA coordinates: 90-149)

MPSKENHVAGSSWQFQNYDLWQSNSYEQHNKLGWDFSTANGTNIQRNMSCGALNMSSKPI

EKHVSKMKEGTSTKPDGPRSKTDSKRIKHQNAHRARLRRLEYISDLERTIQVLQVEGCEM

SSAIHYLDQQLLMLSMENRALKQRMDSLAEIQKLKHVEQQLLEREIGNLQFRRHQQQPQQ

NQKQVQAIQNRYTKYQPPVTQEPDAQFAALAI*

>G931 (85.. 1071) 

GGAGGTTCTTTGACAGACACATGTATCATCAATCTTCTCTGTTGAAGCAGAGAGAGAGAG 
AGCTAATTGTTGCCTCTGAGTCACATGGATAAGAAAGTTTCATTTACTAGCTCTGTGGCA 
CATTCAACTCCACCATACCTTAGTACTTCCATCTCATGGGGACTTCCAACCAAATCCAAT 
GGTGTGACTGAATCACTGAGTTTGAAGGTGGTAGATGCAAGACCAGAACGTCTTATAAAC 
ACAAAGAATATCAGTTTCCAGGACCAGGATTCATCTTCAACTCTGTCCTCTGCTCAATCT 
TCTAACGATGTTACAAGTAGTGGAGATGATAACCCCTCAAGACAAATCTCATTTTTAGCA
CATTCAGATGTTTGTAAAGGATTTGAAGAAACTCAAAGGAAGCGATTTGCAATTAAATCA 
GGCTCCTCCACGGCAGGAATCGCTGATATTCACTCTTCTCCTTCCAAGGCTAACTTCTCA 
TTTCACTATGCCGATCCACATTTTGGTGGTTTAATGCCTGCGGCTTACCTACCACAGGCA 
ACAATATGGAATCCCCAAATGACTCGAGTTCCGCTACCATTCGATCTCATAGAGAATGAG 
CCTGTCTTTGTCAATGCAAAGCAATTCCATGCAATTATGAGGAGGAGGCAACAGCGTGCT 
AAGCTAGAGGCGCAAAACAAACTAATCAAAGCCCGTAAGCCGTATCTTCATGAATCTCGA 
CATGTTCACGCTCTTAAACGACCTAGAGGATCTGGTGGAAGATTCCTAAACACCAAAAAG 
CTTCAAGAATCTACAGATCCAAAACAAGACATGCCAATCCAACAGCAACACGCAACGGGA 
AACATGTCAAGATTTGTGCTTTATCAGTTGCAGAACAGCAATGACTGTGATTGTTCAACC 
ACTTCTCGCTCTGACATCACATCTGCTTCTGACAGCGTXAATCTCTTTGGACACTCTGAA 
TTTCTGATATCAGATTGCCCATCTCAGACAAACCCAACAATGTATGTTCATGGTCAATCA 
AATGACATGCATGGAGGTAGGAACACACACCATTTCTCTGTCCATATCTGAGCCGGTGGA 
ATCTGGTAATGTGTACGTTCCTACAAAAAAAGGGAAGTCATCCTTGGCTGCTACTTCGCT 
TATTAGCTAGTTCTTATTTCACACGCTTTGTCCAGATATC 

>G931 Amino Acid Sequence (domain in AA coordinates: TBD) 

MDKKVSFTSSVAHSTPPYLSTSISWGLPTKSNGVTESLSLKVVDARPERLINTKMISFQD

QDSSSTLSSAQSSNDVTSSGDDNPSRQISFLAHSDVCKGFEETQRKRFAIKSGSSTAGIA 

DIHSSPSKANFSFHYADPHFGGLMPAAYLPQATIWNPQMTRVPLPFDLIENEPVFVNAKQ 

FHAIMRRRQQRAKLEAQNKLIKARKPYLHESRHVHALKRPRGSGGRFLNTKKLQESTDPK

QDMPIQQQHATGNMSRFVLYQLQNSNDCDCSTTSRSDITSASDSVNLFGHSEFLISDCPS 
QTNPTMYVHGQSNDMHGGRNTHHFSVHI*
>G278 (93..1874)

TCGATCTTTAACCAAATCCAGTTGATAAGGTCTCTTCGTTGATTAGCAGAGATCTCTTTA 
ATTTGTGAATTTCAATTCATCGGAACCTGTTGATGGACACCACCATTGATGGATTCGCCG 
ATTCTTATGAAATCAGCAGCACTAGTTTCGTCGCTACCGATAACACCGACTCCTCTATTG 
TTTATCTGGCCGCCGAACAAGTACTCACCGGACCTGATGTATCTGCTCTGCAATTGCTCT 
CCAACAGCTTCGAATCCGTCTTTGACTCGCCGGATGATTTCTACAGCGACGCTAAGCTTG 
TTCTCTCCGACGGCCGGGAAGTTTCTTTCCACCGGTGCGTTTTGTCAGCGAGAAGCTCTT 
TCTTCAAGAGCGCTTTAGCCGCCGCTAAGAAGGAGAAAGACTCCAACAACACCGCCGCCG 
TGAAGCTCGAGCTTAAGGAGATTGCCAAGGATTACGAAGTCGGTTTCGATTCGGTTGTGA 
CTGTTTTGGCTTATGTTTACAGCAGCAGAGTGAGACCGCCGCCTAAAGGAGTTTCTGAAT
GCGCAGACGAGAATTGCTGCCACGTGGCTTGCCGGCCGGCGGTGGATTTCATGTTGGAGG
TTCTCTATTTGGCTTTCATCTTCAAGATCCCTGAATTAATTACTCTCTATCAGAGGCACT 
TATTGGACGTTGTAGACAAAGTTGTTATAGAGGACACATTGGTTATACTCAAGCTTGCTA 
ATATATGTGGTAAAGCTTGTATGAAGCTATTGGATAGATGTAAAGAGATTATTGTCAAGT 
CTAATGTAGATATGGTTAGTCTTGAAAAGTCATTGCCGGAAGAGCTTGTTAAAGAGATAA 
TTGATAGACGTAAAGAGCTTGGTTTGGAGGTACCTAAAGTAAAGAAACATGTCTCGAATG 
TACATAAGGCACTTGACTCGGATGATATTGAGTTAGTCAAGTTGCTTTTGAAAGAGGATC 
ACACCAATCTAGATGATGCGTGTGCTCTTCATTTCGCTGTTGCATATTGCAATGTGAAGA
CCGCAACAGATCTTTTAAAACTTGATCTTGCCGATGTCAACCATAGGAATCCGAGGGGAT
ATACGGTGCTTCATGTTGCTGCGATGCGGAAGGAGCCACAATTGATACTATCTCTATTGG 
AAAAAGGTGCAAGTGCATCAGAAGCAACTTTGGAAGGTAGAACCGCACTCATGATCGCAA 
AACAAGCCACTATGGCGGTTGAATGTAATAATATCCCGGAGCAATGCAAGCATTCTCTCA
AAGGCCGACTATGTGTAGAAATACTAGAGCAAGAAGACAAACGAGAACAAATTCCTAGAG 
ATGTTCCTCCCTCTTTTGCAGTGGCGGCCGATGAATTGAAGATGACGCTGCTCGATCTTG 
AAAATAGAGTTGCACTTGCTCAACGTCTTTTTCCAACGGAAGCACAAGCTGCAATGGAGA 
TCGCCGAAATGAAGGGAACATGTGAGTTCATAGTGACTAGCCTCGAGCCTGACCGTCTCA 
CTGGTACGAAGAGAACATCACCGGGTGTAAAGATAGCACCTTTCAGAATCCTAGAAGAGC 
ATCAAAGTAGACTAAAAGCGCTTTCTAAAACCGTGGAACTCGGGAAACGATTCTTCCCGC 
GCTGTTCGGCAGTGCTCGACCAGATTATGAACTGTGAGGACTTGACTCAACTGGCTTGCG 
GAGAAGACGACACTGCTGAGAAACGACTACAAAAGAAGCAAAGGTACATGGAAATACAAG 
AGACACTAAAGAAGGCCTTTAGTGAGGACAATTTGGAATTAGGAAATTCGTCCCTGACAG 
ATTCGACTTCTTCCACATCGAAATCAACCGGTGGAAAGAGGTCTAACCGTAAACTCTCTC 
ATCGTCGTCGGTGAGACTCTTGCCTCTTAGTGTAATTTTTGCTGTACCATATAATTCTGT 
TTTCATGATGACTGTAACTGTTTATGTCTATCGTTGGCGTCATATAGTTTCGCTCTTCGT 
TTTGCATCCTGTGTATTATTGCTGCAGGTGTGCTTCAAACAAATGTTGTAACAATTTGAA 
CCAATGGTATACAGATTTGTAATATATATTTATGTACATCAACAATAAAAAAAAAAAAAA 
AAAA 

>G278 Amino Acid Sequence (domain in AA coordinates: 2-593)
MDTTIDGFADSYEISSTSFVATDNTDSSIVYLAAEQVLTGPDVSALQLLSNSFESVFDSP
DDFYSDAKLVLSDGREVSFHRCVLSARSSFFKSALAAAKKEKDSNNTAAVKLELKEIAKD
YEVGFDSVVTVLAYVYSSRVRPPPKGVSECADENCCHVACRPAVDFMLEVLYLAFIFKIP
ELITLYQRHLLDVVDKVVIEDTLVILKLANICGKACMKLLDRCKEIIVKSNVDMVSLEKS
LPEELVKEIIDRRKELGLEVPKVKKHVSNVHKALDSDDIELVKLLLKEDHTNLDDACALH
FAVAYCNVKTATDLLKLDLADVNHRNPRGYTVLHVAAMRKEPQLILSLLEKGASASEATL
EGRTALMIAKQATMAVECNNIPEQCKHSLKGRLCVEILEQEDKREQIPRDVPPSFAVAAD
ELKMTLLDLENRVALAQRLFPTEAQAAMEIAEMKGTCEFIVTSLEPDRLTGTKRTSPGVK
IAPFRILEEHQSRLKALSKTVELGKRFFPRCSAVLDQIMNCEDLTQLACGEDDTAEKRLQ
KKQRYMEIQETLKKAFSEDNLELGNSSLTDSTSSTSKSTGGKRSNRKLSHRRR*

>G2421 (1..630) 

ATGGAGGGTTCGTCCAAAGGGTTGAGGAAAGGTGCATGGACTGCTGAAGAAGATAGTCTC 
TTGAGGCAGTGTATTGGTAAGTATGGAGAAGGCAAATGGCATCAAGTTCCTTTAAGAGCT 
GGGCTAAATCGGTGCAGGAAAAGTTGTAGACTAAGATGGTTAAACTATTTGAAGCCAAGT 
ATCAAGAGAGGAAAATTTAGTTCTGATGAAGTTGATCTTC^ 

CTAGGAAATAGGTGGTCCTTGATTGCTGGTCGATTACCTGGTCGGACCGCTAATGATGTC 
AAGAACTACTGGAACACCCATCTGAGTAAGAAGCATGAACCGTGTTGTAAAACTAAGATA 
AAAAGGATAAATATTATAACCCCTCCTAATACACCGGCCCAAAAAGTTTGTGAAAATAGT 
ATCACATGTAACAAAGATGATGAGAAAGATGATTTTGTGGATAATTTTATGGTTGGAGAT 
AATATATGGTTGGASCGTTTGCTAGACGAGGGCCAAGAGGTAGATGTGCTGGTTACAGAA 
GCGGCGGCAACAGAAAAGGAGGGC^CTTTGGCGTTTGACGTTGAG 

TTCGATGGAGAGACTGTGATCTTTGATTAGTGTTTATAAACGTTTGTGTTCTCTTGTTT^ 

TGAGGTTTCTCTATTTAATTTAGTATCTATTTO 

TTAGGCAAACCTTATGTTTCCGTTTCTGTGCGGCCGCTCTAG 

>G2421 Amino Acid Sequence (domain in AA coordinates: 9-110) 
MEGSSKGLRKGAWTAEEDSLLRQCIGKYGEGKWHQVPLRAGLNRCRKSCRLRWLNYLKPS
IKRGKFSSDEVDLLLRLHKLLGNRWSLIAGRLPGRTANDVKNYWNTHLSKKHEPCCKTKI
KKINIITPPNTPAQKVCENSITCNKDDEKDDFVDNFMVGDNIWLERLLDEGQEVDVLVTE
AAATEKEGTLAFDVEQLWNLFDGETVIFD*
>G2032 (53.. 1789)

TCCCTCCCAGAGTAAGAACTTCCATACTTTGCTCTAGATTTCTTGAGAAAAGATGCAGCC 
GATCTTCCATGCGATCCTTAAAAATGACCTTCCAGCTTTTTTAGAGTTGGTAGAAGATAG 
TGAATCGTCTCTGGAGGAGAGAAACGAGGAAGAACACTTGAACAACACGGTTTTGCACAT 
GGCTGCAAAGTTTGGTCACCGAGAACTCGTCTCCAAGATTATTGAGCTCCGACCTTCCCT 
CGTGTCTTCCCGCAACGCATACAGAAACACACCTTTGCATCTTGCTGCTATCCTTGGAGA 
TGTAAACATAGTTATGCAGATGTTAGAGACTGGATTGGAAGTGTGTTCTGCACGCAATAT 
CAACAACCACACACCACTCCACTTGGCTTGCCGTAGCAATTCCATAGAGGCTGCCAGACT 
CATCGCGGAAAAGACACAATCAATTGGCCTCGGTGAACTCATTCTCGCCATATCAAGTGG 
ATCCACTAGTATCGTAGGGACTATACTGGAGAGATTCCCAGACCTAGCTAGGGAAGAAGC 
TTGGGTGGTTGAAGACGGCTCACAATCAACGCTACTGCATCATGCGTGTGATAAGGGAGA 
CTTTGAACTGACAACTATATTGTTAGGGCTCGATCAAGGATTAGAAGAAGCACTTAACCC
CAATGGTTTATCACCTCTGCATCTTGCGGTCCTCAGAGGCTCGGTTGTGATCCTGGAGGA
GTTCTTGGACAAGGTTCCATTGTCTTTCAGCTCAATCACGCCGTCGAAAGAGACAGTCTT 
TCATCTCGCTGCTCGAAACAAAAATATGGATGCCTTTGTTTTTATGGCAGAGAGTTTGGG 
AATTAACAGCCAAATTCTTCTACAGCAAACCGATGAAAGTGGCAACACTGTCTTACATAT
TGCTGCATCCGTCTCTTTTGATGCTCCTCTTATACGTTACATTGTTGGTAAGAATATAGT 
AGATATCACGTCCAAGAACAAGATGGGTTTTGAAGCTTTTCAACTTCTCCCTCGAGAAGC 
CCAAGACTTTGAGTTGTTATCAAGGTGGCTGAGATTTGGTACCGAGACTTCACAAGAGCT 
GGATTCTGAGAACAATGTAGAACAACACGAAGGCTCTCAAGAGGTCGAGGTAATACGGTT 
GCTAAGGATTATAGGAATAAACACATCAGAGATAGCAGAGAGAAAGAGAAGCAAGGAACA
GGAAGTGGAAAGAGGTCGTCAGAACTTGGAATATCAGATGCATATAGAAGCATTACAGAA 
TGCAAGAAATACGATTGCTATAGTGGCAGTCTTGATTGCTTCAGTTGCTTATGCCGGTGG 
GATAAACCCTCCGGGGGGCGTCTACCAAGACGGGCCATGGAGAGGGAAATCCTTAGTGGG 
GAAAACAACGGCGTTTAAGGTCTTTGCGATATGCAACAACATCGCACTGTTCACGTCCTT 
GGGCATCGTTATTCTTCTTGTTAGCATCATACCTTACAAGAGGAAACCCTTAAAGAGATT 
ATTGGTGGCCACGCATAGGATGATGTGGGTTTGTGTAGGTTTCATGGCGACGGCTTATAT 
AGCGGCGTCTTGGGTGACCATACCGCATTATCATGGAACACAATGGTTATTTCCAGCAAT 
TGTAGCCGTTGCTGGTGGAGCGTTGACCGTACTCTTTTTCTATCTCGGAGTTGAGACCAT 
CGGTCATTGGTTTAAGAAGATGAATCGTGTAGGGGATAATATACCTTCCTTTGCAAGAAC 
CAGTTCAGATTTAGCCGTCTCCGGAAT^ATCAGGCTATTTCACCTATTAAGAAAAACTGGT 
TTTCTAATTTCCCTGTAACCTGTGTAATTGTGTATGTG 

>G2032 Amino Acid Sequence (domain in AA coordinates: entire protein) 
MQPIFHAILKNDLPAFLELVEDSESSLEERNEEEHLNNTVLHMAAKFGHRELVSKIIELR
PSLVSSRNAYRNTPLHLAAILGDVNIVMQMLETGLEVCSARNINNHTPLHLACRSNSIEA
ARLIAEKTQSIGLGELILAISSGSTSIVGTILERFPDLAREEAWVVEDGSQSTLLHHACD
KGDFELTTILLGLDQGLEEALNPNGLSPLHLAVLRGSVVILEEFLDKVPLSFSSITPSKE
TVFHLAARNKNMDAFVFMAESLGINSQILLQQTDESGNTVLHIAASVSFDAPLIRYIVGK
NIVDITSKNKMGFEAFQLLPREAQDFELLSRWLRFGTETSQELDSENNVEQHEGSQEVEV
IRLLRIIGINTSEIAERKRSKEQEVERGRQNLEYQMHIEALQNARNTIAIVAVLIASVAY
AGGINPPGGVYQDGPWRGKSLVGKTTAFKVFAICNNIALFTSLGIVILLVSIIPYKRKPL
KRLLVATHRMMWVSVGFMATAYIAASWVTIPHYHGTQWLFPAIVAVAGGALTVLFFYLGV
ETIGHWFKKMNRVGDNIPSFARTSSDLAVSGKSGYFTY*
>G1396 (83.. 313) 

TCGACCTCGTTTCCTTTCCTCCTCTCTTCCTACCATTAGTACGTTACTGGAGCTGATCTC 
ACGTATATTTTGGATCGTAATCATGGACGGCGAAGATTTTGCCGGAAAGGCGGCTGCTGA 
AGCCAAGGGATTGAACCCGGGATTAATCGTGCTGCTTGTTGTTGGAGGTCCGCTTCTTGT 
GTTCCTAATCGCCAACTACGTGCTTTACGTTTATGCTCAGAAGAACCTACCTCCAAGGAA 
GAAGAAGCCCGTTTCCAAAAAGAAGCTCAAGCGGGAGAAGCTAAAGCAAGGAGTCCCTGT 
CCCTGGAGAATAAAAGCCAGCTTAAGCTTCCTTCACTTGTGCCTCCTTCAAAGCGGTTTT 
TGTTCGGTTACCAAATTTCACCCTTGCGGGTTTTTTTCTTCCTTTACTTCTGTCATGAGG 
ATTATCTTTGAGGCCT 

>G1396 Amino Acid Sequence (domain in AA coordinates: TBD) 
MDGEDFAGKAAAEAKGLNPGLIVLLVVGGPLLVFLIANYVLYVYAQKNLPPRKKKPVSKK
KLKREKLKQGVPVPGE* 
>G619 (382.. 2748) 

ATTTTTTTCCAATCTGCAAATTTTAGTCTATGTCTGTTCCTTGTGCTCCCTCTTCTCAGT
ACCTGCAAATGGAGGAAGAAGAATCCTTCTCTGAAACCCTTGTTCTCATTTGATTCCTCC

TTCCTCTCTCTTCTTCTCTCTCTGTCTCTGATTCGTTATTCCACACTTATGACTCATCTT 

TCCCGTCAATAGCTAAGTTTGCCTCTTCTTTGTGAAATTTAGCTGAAAAAGGAGAGGAAT 

TCCGAATTCTGTCACTTCAAAGCTCGAATTTTGCAAACTTTCCTTTGATGGGTTTTACTT 

GTTTTGTTGTAATCTGATTAAAAATAGAAACTTTTTGTTTTCTTCTTGTCTCCTTTTGCT

CTTAAAAGAGAAGCTTTTTCAATGGAATTTGACTTGAATACTGAGATTGCGGAGGTGGAA 

GAGGAGGAGAATGATGATGTAGGAGTAGGAGTAGGAGGAGGAACAAGAATTGACAAGGGT 

AGGCTTGGAATTTCACCATCTTCTTCTTCTTCATGCTCTTCCGGATCATCATCGTCATCA 

TCTTCTACAGGCTCTGCATCTTCCATTTACTCTGAGCTTTGGCATGCTTGTGCTGGTCCT 

CTCACTTGTCTTCCCAAGAAAGGCAATGTAGTTGTCTATTTCCCTCAAGGTCATTTGGAG 

CAAGATGCTATGGTTTCATATTCGTCTCCTCTTGAAATCCCCAAATTTGACCTTAATCCC 

CAAATCGTCTGCAGGGTGGTTAATGTCCAGTTGCTTGCTAATAAGGACACCGATGAGGTC 

TACACTCAAGTCACTCTGCTTCCACTTCAAGAGTTTTCGATGCTAAATGGGGAGGGGAAA 

GAGGTCAAGGAGTTAGGAGGGGAGGAAGAGAGGAACGGAAGCTCATCCGTCAAGCGGACA 

CCTCATATGTTCTGTAAAACCTTAACAGCGTCTGACACAAGCACACATGGAGGCTTCTCT 

GTACCTAGAAGAGCCGCTGAAGATTGTTTTGCTCCTCTTGACTACAAACAACAGAGGCCA 

TCTCAAGAGCTCATTGCAAAGGACCTCCATGGAGTAGAGTGGAAGTTTCGCCATATCTAT 

AGAGGTCAACCAAGGAGGCATCTACTCACCACTGGTTGGAGTATCTTTGTCAGTCA7U^AG 

AATCTCGTCTCTGGTGATGCGGTTCTCTTTCTGAGAGACGAAGGAGGAGAGCTGAGATTA 

GGAATCAGAAGAGCAGCACGGCCAAGAAATGGACTTCCTGACTCAATCATTGAGAAGAAT 

TCATGTTCAAACATTCTGTCTCTTGTGGCTAATGCTGTATCTACAAAAAGCATGTTTCAT 

GTGTTCTACAGTCCACGAGCGACGCATGCAGAGTTTGTGATTCCTTATGAGAAGTATATC

ACAAGCATCAGGAGTCCTGTTTGCATAGGCACAAGATTTAGAATGCGATTTGAAATGGAC 

GATTCTCCTGAGAGAAGATGCGCTGGTGTAGTGACTGGAGTCTGTGACTTGGACCCGTAT 

AGGTGGCCAAACTCTAAATGGAGGTGCTTGTTGGTGCGATGGGATGAGTCTTTTGTGAGT 

GATCATCAAGAAAGAGTTTCACCTTGGGAGATTGATCCCTCGGTTTCTCTCCCACACTTG 

AGCATTCAGTCATCTCCAAGGCCTAAAAGGCCATGGGCAGGTTTACTGGATACTACCCCA 

CCCGGAAACCCCATAACAAAAAGGGGTGGTTTTTTGGACTTTGAGGAGTCGGTTAGACCC 

TCTAAGGTCTTGCAAGGTGAAGAAAATATAGGTTCTGCATGACCCTCACAGGGGTTTGAT 

GTTATGAACCGCCGGATACTGGATTTTGCGATGCAGTCTCATGCAAATCCAGTCCTTGTG 

TCGAGTAGAGTCAAGGATCGATTTGGTGAGTTTGTAGATGCTACTGGCGTGAACCCAGCT 

TGTTCAGGTGTTATGGACCTGGATAGGTTTCCAAGGGTCTTGCAAGGTCAAGAAATTTGC 

TCGCTTAAATCATTCCCGCAATTTGCTGGTTTCAGTCCAGCTGCTGCTCCTAATCCCTTT 

GCTTACCAAGCCAACAAGTCAAGTTACTATCCGCTAGCTTTGCATGGGATTAGGAGCACT 

CATGTTCCGTATCAGAATCCATACAATGCGGGAAACCAATCCTCGGGTCCCCCTTCACGT 

GCAATAAACTTTGGTGAAGAGACTAGAAAGTTTGATGCACAAAATGAAGGTGGCCTACCA 

AATAATGTTACAGCTGATTTGCCATTCAAGATTGATATGATGGGAAAACAGAAAGGCAGT 

GAGTTGAATATGAATGCTTCATCAGGATGTAAACTTTTCGGATTCTCCTTACCAGTGGAG 

ACACCTGCATCTAAGCCGCAAAGCTCGAGCAAAAGAATCTGTACAAAGGTTCACAAGCAA 

GGAAGCCAAGTGGGGAGAGCTATTGATTTGTCGCGACTTAACGGGTATGATGATCTCCTT 

ATGGAGCTTGAACGGCTGTTCAACATGGAAGGGCTTCTCAGGGATCCTGAAAAAGGATGG 

AGGATCTTATATACTGATAGTGAGAACGATATGATGGTCGTTGGCGATGATCCATGGCAT 

GATTTCTGCAATGTGGTGTGGAAGATACACTTATACACGAAAGAGGAAGTGGAGAATGCG 

AATGACGATAACAAGAGTTGTTTAGAGCAAGCTGCTCTCATGATGGAAGCATCAAAGTCA 

TCTTCTGTGAGCCAGCCTGATTCTTCTCCTACAAT^ 

AGCTTATTTCCTATGTTTTAAAGTGTGTTTTGCT^ 

GTCTTTGAATCC^TTTATGTGTTTCT^ 

TGTACCGTTTTACTCGAGAGATATGTGAGTTTATGGGATGTGTAAAGCATGCCATTGGAT

TTTAAGGTTTTCAAAATTACAATATATATATTAGTTTTGAAGOT 

A 

>G619 Amino Acid Sequence (domain in AA coordinates: 64-406) 
MEFDLNTEIAEVEEEENDDVGVGVGGGTRIDKGRLGISPSSSSSCSSGSSSSSSSTGSAS 
SIYSELWHACAGPLTCLPKKGNVVVYFPQGHLEQDAMVSYSSPLEIPKFDLNPQIVCRVV
NVQLLANKDTDEVYTQVTLLPLQEFSMLNGEGKEVKELGGEEERNGSSSVKRTPHMFCKT
LTASDTSTHGGFSVPRRAAEDCFAPLDYKQQRPSQELIAKDLHGVEWKFRHIYRGQPRRH
LLTTGWSIFVSQKNLVSGDAVLFLRDEGGELRLGIRRAARPRNGLPDSIIEKNSCSNILS
LVANAVSTKSMFHVFYSPRATHAEFVIPYEKYITSIRSPVCIGTRFRMRFEMDDSPERRC
AGVVTGVCDLDPYRWPNSKWRCLLVRWDESFVSDHQERVSPWEIDPSVSLPHLSIQSSPR

PKRPWAGLLDTTPPGNPITKRGGFLDFEESVRPSKVLQGQENIGSASPSQGFDVMNRRIL

DFAMQSHANPVLVSSRVKDRFGEFVDATGVNPACSGVMDLDRFPRVLQGQEICSLKSFPQ 

FAGFSPAAAPNPFAYQANKSSYYPLALHGIRSTHVPYQNPYNAGNQSSGPPSRAINFGEE 

TRKFDAQNEGGLPNNVTADLPFKIDMMGKQKGSELNMNASSGCKLFGFSLPVETPASKRQ 

SSSKRICTKVHKQGSQVGRAIDLSRLNGYDDLLMELERLFNMEGLLRDPEKGWRILYTDS 

ENDMMVVGDDPWHDFCNVVWKIHLYTKEEVENANDDNKSCLEQAALMMEASKSSSVSQPD

SSPTITRV* 

>G2295 (33.. 917) 

GTAATATATAACAATAACTCAGGTTACAAAGGATGGTTCCGAAAGTGGTCGACCTACAAA 
GGATAGCGAACGATAAGACAAGGATAACAACTTACAAGAAGAGGAAAGCTAGTCTTTACA 
AGAAGGCACAAGAGTTCTCAACTCTCTGCGGCGTCGAGACATGTCTCATCGTCTACGGTC 
CCACGAAGGCTACCGATGTGGTGATTTCCGAGCCAGAGATATGGCCGAAGGACGAGACCA 
AAGTCAGGGCCATCATACGCAAGTACAAAGACACAGTGTCGACCAGCTGCAGGAAAGAAA 
CCAACGTGGAGACTTTCGTCAACGATGTAGGGAAAGGAAACGAGGTGGTGACTAAAAAGA 
GAGTGAAGCGTGAGAATAAGTATTCTAGTTGGGAGGAGAAGCTAGACAAGTGTTCACGAG 
AGCAACTACATGGGATTTTCTGTGCCGTGGATAGCAAGTTAAATGAAGCTGTAACGAGAC
AGGAGCGTAGTATGTTTAGGGTTAATCATCAAGCCATGGACACACCATTCCCGCAGAATT 
TAATGGACCAACAATTCATGCCACAGTATTTTCATGAGCAGCCACAGTTTCAAGGCTTCC 
CTAATAATTTCAATAATATGGGTTTCTCGTTGATTTCACCTCATGATGGTCAGATTCAAA 
TGGACCCAAATCTCATGGAGAAGTGGACCGACTTGGCTTTGACTCAAAGCTTGATGATGT 
CAAAGGGAAACGATGGTACTCAATTCATGCAGAGGCAAGAACAACCATACTATAATCGTG 
AACAGGTTGTATCGAGGTCTGCAGGTTTCAATGTTAACCCGTTTATGGGATATCAAGTCC 
CGTTTAATATTCCTAATTGGAGATTATCGGGAAATCAAGTTGAAAATTGGGAGCTTTCAG 
GGAAGAAAACGATATGATTTGAATTACGGAGCTTTATTAGTTTTTAGGGTTTTATAGTTT 
TG 

>G2295 Amino Acid Sequence (domain in AA coordinates: TBD) 
MVPKVVDLQRIANDKTRITTYKKRKASLYKKAQEFSTLCGVETCLIVYGPTKATDVVISE
PEIWPKDETKVRAIIRKYKDTVSTSCRKETNVETFVNDVGKGNEVVTKKRVKRENKYSSW
EEKLDKCSREQLHGIFCAVDSKLNEAVTRQERSMFRVNHQAMDTPFPQNLMDQQFMPQYF
HEQPQFQGFPNNFNNMGFSLISPHDGQIQMDPNLMEKWTDLALTQSLMMSKGNDGTQFMQ
RQEQPYYNREQVVSRSAGFNVNPFMGYQVPFNIPNWRLSGNQVENWELSGKKTI*
>G312 (1..1755) 

ATGGCTTACATGTGCACTGATAGTGGCAATCTAATGGCTATTGCTCAACAAGTCATCAAA 
CAGAAGCAGCAACAAGAACAACAACAGCAGCAACATCATCAAGACCATCAGATTTTTGGT 
ATTAATCCTTTGTCTCTTAACCCATGGCCCAATACTTCCCTCGGGTTTGGGCTTTCAGGT 
TCGGCTTTTCCCGACCCGTTTCAAGTTACCGGCGGCGGAGATTCCAACGATCCTGGCTTT 
CCTTTTCCTAACTTAGACCACCACCACGCCACAACCACCGGCGGTGGGTTCAGGTTATCT 
GATTTCGGCGGTGGAACCGGCGGCGGCGAGTTTGAGTCCGACGAGTGGATGGAGACTCTT 
ATCAGCGGTGGAGACTCCGTTGCAGACGGTCCTGATTGTGACACCTGGCATGATAATCCC 
GATTACGTAATCTACGGTCCTGATCCATTCGATACTTACCCGAGTCGACTCAGTGTCCAA 
CCGTCAGATCTAAACCGAGTCATTGACACGTCGAGTCCGCTTCCTCCGCCGACCTTGTGG 
CCTCCTTCTTCGCC^TTATCGATTCCTCCGCTTACTCATGAGTCACCAACC7y\AGAAGAT 
CCAGAGACTAACGACTCCGAAGACGATGACTTCGACCTAGAACCACCTCTCCTCAAAGCT 
ATATACGACTGTGCACGGATCTCAGACTCTGACCCTAACGAAGCTTCCAAGACGCTTCTT 
CAGATCCGAGAATCTGTATCGGAGCTAGGTGATCCGACGGAGCGAGTTGCATTTTACTTC 
ACGGAAGCTCTCTCCAACAGACTGTCTCCTAATTCGCCGGCGACGTCGTCTTCTTCTTCA 
TCTACGGAGGATTTAATCTTATCTTAT^ 

TTCGCACATTTGACGGCGAATCAAGCGATTCTAGAAGCGACGGAGAAGTCGAACAAGATT 
CACATCGTCGATTTTGGAATCGTTCAAGGTATACAATGGCCTGCTCTTCTTCAAGCTCTA 
GCTACTCGTACTTCTGGTAAACCCACTCAAATCCGGGTCTCGGGTATACCCGCTCCATCT 
CTCGGTGAATCTCCGGAACCGTCGTTAATCGCCACCGGAAACCGCCTCCGTGATTTCGCC 
AAGGTTCTGGATCTGAATTTCGATTTCATCCCAATTCTCACTCCCATACATTTACTTAAC 
GGGTCAAGTTTCCGGGTCGACCCGGATGAAGTACTGGCCGTGAATTTCATGCTCCAGCTC 
TACAAATTACTCGACGAGACGCCGACGATAGTTGACACCGCACTACGGCTCGCCAAATCG 
TTGAACCCGAGGGTCGTCACTCTCGGAGAATACGAAGTGAGCTTAAACCGGGTCGGTTTC 
GCTAACCGGGTAAAGAACGCGCTTCAATTCTATTCCGCGGTTTTCGAATCCCTTGAACCG
AACTTGGGGCGTGATTCGGAGGAGAGAGTGAGAGTTGAGCGAGAGTTGTTCGGCCGGAGA
ATCTCGGGTTTGATTGGACCGGAGAAAACCGGAATTCATAGAGAAAGAATGGAAGAGAAA 
GAGCAATGGCGGGTATTAATGGAGAATGCCGGTTTTGAATCGGTTAAGCTGAGTAATTAC 
GCAGTGAGCCAAGCGAAGATTCTATTGTGGAATTACAATTACAGCAATTTGTATTCAATT 
GTTGAATCTAAGCCTGGCTTCATCTCTTTGGCCTGGAACGATTTACCTCTCCTCACTCTT 
TCTTCCTGGCGATAA 

>G312 Amino Acid Sequence (domain in AA coordinates: 320-336) 

MAYMCTDSGNLMAIAQQVIKQKQQQEQQQQQHHQDHQIFGINPLSLNPWPNTSLGFGLSG 

SAFPDPFQVTGGGDSNDPGFPFPNLDHHHATTTGGGFRLSDFGGGTGGGEFESDEWMETL 

ISGGDSVADGPDCDTWHDNPDYVIYGPDPFDTYPSRLSVQPSDLNRVIDTSSPLPPPTLW

PPSSPLSIPPLTHESPTKEDPETNDSEDDDFDLEPPLLKAIYDCARISDSDPNEASKTLL

QIRESVSELGDPTERVAFYFTEALSNRLSPNSPATSSSSSSTEDLILSYKTLNDACPYSK 

FAHLTANQAILEATEKSNKIHIVDFGIVQGIQWPALLQALATRTSGKPTQIRVSGIPAPS 

LGESPEPSLIATGNRLRDFAKVLDLNFDFIPILTPIHLLNGSSFRVDPDEVLAVNFMLQL 

YKLLDETPTIVDTALRLAKSLNPRVVTLGEYEVSLNRVGFANRVKNALQFYSAVFESLEP

NLGRDSEERVRVERELFGRRISGLIGPEKTGIHRERMEEKEQWRVLMENAGFESVKLSNY

AVSQAKILLWNYNYSNLYSIVESKPGFISLAWNDLPLLTLSSWR*

>G1444 (192..1001)

AATCCCCTATCCTTCGCAAGACCCTTCCTCTATATAAGGAAGTTCATTTCATTTTTTTTT 

GACACGCTGACAAGCTGACTCTAGCATATCTGGCACCGGCGACCAGTCCTTCTTTGGTGC 

AAAGATCCCAT^AATCAAAATCGAAAGAGAGAATAAATCAAAAGGAAGAATCTTTATCT 

GCTTTCTCTCGATGAGGATCCGGAAACGACAAGTGCCTCTTCCTTTATCGTCTCTATTAC

CAGTTCCTCTATCAGATCTCTACTTTAACCGCTCACCGACGGCCACCGCGAGATACTTTC 

GCGGTGGTTATAAAGACGGCGGTGATGATTTTGGTTCTCTTCAGCTTTCGCTTCCGCCGC 

CGTCGCAGATTTCTGATCGGCTTATTCAAAGAGATTTGATAAAGAAGAAGGAGGAGGTCA 

AGGCTTTGGATGATGATAATGGTGATGTAGACGTCAAGAGTCGTACTGATGCATCGGGCA 

GCAAGAATGTTAATCCCCGAGGAGAATCCGTCTCTTCAATACAAGTTGTCGAGAAGAATG 

AAAAGGTTGTGTCTTTGAGGAAGAGAAGAGGCTTTATCAACTTTGAGGATTACGAAGATG 

AGGAAGATGAAGAAGCTAGTGGCGGTGGAGGCCGTATTAATAAAGGGAAAAAGAAAGCGA 

AAAAGAGCGGTGGTGGGTTAGAGGAAGGATCACGGTGCAGCCGTGTTAACGGTAGAGGAT 

GGAGATGTTGTCAGCAAACGCTTGTTGGTTATTCTCTTTGTGAGCATCATCTCGGTAAAG 

GAAGGGTAAGGAGCATGAACAAGAGTGGTGGTGGTCGTGGCGGCGAGAAAAAGGCGGTGG 

TGGTGGAAGTGAAGAAGAAGAGAGTAAAGCTTGGCATGGTAAAGGCACGTTCAATAAGTA 

GTTTGCTTGGACAAACCAGCACTAGTGGTGGTACTAGTGGTGATGTTGATCAGGGTGAGA 

TAAGTGCACCTGCTGATCAGTTCGCTGCATGTGATAAGTAGGTCTGTTGATCAGCATTTG 

CATGTATATGGATATGTGTATGTTTATGTACATGATGATAATGGGCATAGCGCGGCCGCT 

CTAGACAGGCCTGGAACCGGATCCTCTAGCTAGAGCTTTCGTTAGTATCATCGGGTTTAG 

ACAACGTT 

>G1444 Amino Acid Sequence (domain in AA coordinates: 168-193) 

MRIRKRQVPLPLSSLLPVPLSDLYFNRSPTATARYFRGGYKDGGDDFGSLQLSLPPPSQI 

SDRLIQRDLIKKKEEVKALDDDNGDVDVKSRTDASGSKNVNPRGESVSSIQVVEKNEKVV

SLRKRRGFINFEDYEDEEDEEASGGGGRINKGKKKAKKSGGGLEEGSRCSRVNGRGWRCC

QQTLVGYSLCEHHLGKGRVRSMNKSGGGRGGEKKAVVVEVKKKRVKLGMVKARSISSLLG

QTSTSGGTSGDVDQGEISAPADQFAACDK* 

>G801 (27.. 746) 

GATAGTGATAACGAAATCCTAATTCCATGGCCGACAACGACGGAGCAGTGAGTAACGGCA 

TCATAGTCGAGCAGACGTCAAACAAAGGACCTCTTAACGCCGTTAAGAAACCACCGTCTA 

AAGATCGACAC^GCAAAGTTGACGGAAGAGGAAGAAGGATTCGTATGCC^TCATTTGCG 

CAGCTCGAGTTTTTCAATTGACCAGAGAGTTAGGTC^ 

AGTGGCTTCTCCGTCAAGCTGAGCCITCTATCATAGCCGC 

CGGCGAGTTTCTCCACTGCTTCTCTCTCCACTTCTTCTCCGTTTACTCTCGGGAAACGTG 

TCGTCAGAGCGGAGGAAGGAGAATCCGGCGGCGGAGGAGGAGGAGGGTTAACAGTGGGAC 

ACAO^TGGGGACTTCGTTAATGGGTGGTGGTGGTTCTGGTGGGTTTTGGGCT 

CGAGGCCGGATTTCGGACAAGTCTGGAGCTTTGCAACCGGAGCTCCACCGGAAATGGTTT 

TTGCGCAGCAG(^GCZVACC^GCTACACTCTTCGTCCGCCACCAGCAGCAACAGCAAGCTT 

CCGCCGCCGCAGCAGCTGCAATGGGTGAGGCTTCAGCAGCTAGAGTTGGGAATTATCTTC 

CGGGTCATCATCTCAATTTGCI^GCTTCTTTGTCTGGTGGAGCTAACGGGTCGGGTCGGA
GGGAAGACGACCACGAACCACGTTGAGT^ATGGTATTGTCTTTTTGGTAATGTATAGAAA

AATTCCTATGTTTTATGTCATCGAAAGTGTTTAGAAAGTACCTCTAATTTGCGGTTTCTT 

TTGCTCCTTTTTTACTTAATTT/UVGCTTATTGCTTGTTTGATTAGGGTTTTAGGGTTTAA 

GAATATTTGGTCTCGTTAATTTGTTTCGGAGAGTGATAGAAAGAGAGAGAGATTGATTGA 

TTGTTGTACCTAAAACGCTATAAAAGCTCTGTTTTTACTAGCGAAAAAA 

>G801 Amino Acid Sequence (domain in AA coordinates: 32-93) 

MADNDGAVSNGIIVEQTSNKGPLNAVKKPPSKDRHSKVDGRGRRIRMPIICAARVFQLTR 

ELGHKSDGQTIEWLLRQAEPSIIAATGTGTTPASFSTASLSTSSPFTLGKRVVRAEEGES

GGGGGGGLTVGHTMGTSLMGGGGSGGFWAVPARPDFGQVWSFATGAPPEMVFAQQQQPAT 

LFVRHQQQQQASAAAAAAMGEASAARVGNYLPGHHLNLLASLSGGANGSGRREDDHEPR*

>G1950 (42..764)

CTGAATTCGAACTTTGGAAGAAGAAGAAGCTTTGATCAATCATGGAAATTGCAACCGATA
CAGCAAAGCAGATGAGAGACGAAGAGTTGTTCAAAGCAGCGGAATGGGGAGATTCATCGT 
TGTTCATGTCATTATCTGAAGAACAGCTCTCTAAATCTCTCAATTTCAGAAACGAAGATG 
GTCGCTCTCTCCTCCATGTCGCTGCTTCCTTCGGCCATTCTCAAATAGTGAAGTTGTTAT 
CAAGTTCAGATGAAGCAAAGACTGTAATCAATAGCAAGGATGATGAAGGATGGGCTCCTT 
TGCATTCCGCTGCTAGCATCGGTAATGCTGAGCTCGTTGAGGTGCTTTTGACCAGAGGTG 
CTGATGTCAATGCCAAAAATAACGGTGGTCGCACTGCTCTTCACTATGCTGCTAGCAAAG 
GCCGGTTGGAGATTGCTCAGCTTTTATTAACACACGGTGCAAAGATTAACATCACAGACA 
AGGTTGGTTGCACTCCGCTTCACAGGGCAGCAAGCGTGGGAAAGTTAGAAGTTTGTGAAT 
TTCTTATTGAAGAAGGAGCAGAGATCGATGCTACGGATAAAATGGGTCAAACTGCACTCA 
TGCATTCAGTTATCTGCGATGACAAACAGGTTGCGTTCCTGCTTATAAGACATGGTGCAG 
ATGTGGATGTAGAAGACAAGGAAGGCTACACTGTTCTAGGCCGAGCTACCAATGAATTCC 
GACCTGCACTTATCGATGCTGCTAAGGCCATGCTTGAAGGATAAAATGACTCTGGATTAC 
TTTAAAACTTACTAACTCTGAGAGTTGTTTAGTTACTTAAAAGGATTTTTCTTTACTGTA 
TCATGTTTGCAAAATGTTTCTGCCTTATCAATTCATGTTCTGT 

>G1950 Amino Acid Sequence (domain in AA coordinates: 65-228) 

MEIATDTAKQMRDEELFKAAEWGDSSLFMSLSEEQLSKSLNFRNEDGRSLLHVAASFGHS 

QIVKLLSSSDEAKTVINSKDDEGWAPLHSAASIGNAELVEVLLTRGADVNAKNNGGRTAL

HYAASKGRLEIAQLLLTHGAKINITDKVGCTPLHRAASVGKLEVCEFLIEEGAEIDATDK

MGQTALMHSVICDDKQVAFLLIRHGADVDVEDKEGYTVLGRATNEFRPALIDAAKAMLEG*

>G958 (55.. 1950) 

CGTCGACATGTTCATATTTGTTTCTAGCTAAGT^AGTTTGTATAAGGCAGTGGACATGGCT 
CCTGTTTCAATGCCTCCAGGTTTCCGGTTTCATCCAACAGACGAAGAGCTTGTCATATAC 
TACCTCAAGCGAAAGATTAATGGTCGGACTATTGAGTTAGAGATAATACCCGAGATTGAT 
CTTTACAAATGCGAACCTTGGGATTTACCTGGGAAGTCCTTGCTGCCAAGTAAAGACCTA 
GAATGGTTCTTTTTCAGTCCTCGAGACCGGAAATATCCAAACGGATCAAGAACAAACCGG 
GCGACCAAAGCAGGTTACTGGAAAGCCACCGGGAAAGATCGTAAAGTGACTTCACATTCA 
CGGATGGTTGGAACAAAGAAAACATTAGTTTATTACCGAGGAAGAGCGCCTCATGGCTCT 
CGTACCGATTGGGTCATGCACGAGTACCGTCTTGAAGAACAAGAATGTGACTCTAAATCC
GGTATACAGGATGCCTATGCACTTTGTCGAGTATTTAAGAAG^ 

ATTGAAGAACAACACCATGGTACGAAGAAGAACAAAGGAACGACTAATAGTGAACAATCT 

CCAGTCTCACCTGAGACAGGAGGCTTAACTCAACTCGGTAATAATTCGTCGTCGGATATG 
GAAACGATAGAGAATAAATGGAGTCAGTTTATGTCGCATGACACGTCCTTCAACTTCCCA 
CCTCAGTCTCAATATGGAACAATCTCATATCCTCCCTCGAAGGTTGATATAGCGTTAGAG 
TGTGCAAGACTACA5SAATCGTATGTTGCCACCAGTACCACCACTTTACGTAGAAGGTCTC 
AC^CACAATGAATATTTTGGAAACAATGTAGCTAACGATACAGATGAAATGTTGAGCMG 
ATTATAGCATTGGCTCAAGCCTCACATGAGCCACGAAACAGTCTAGACTCATGGGACGGT 
GGTTCTGCTTCCGGGAACTTCCATGGAGACTTTAACTATTCCGGAGAAAAAGTCTCATGC 
CTAGAGGCGAACGTGGAGGCTGTAGATATGCAAGAACACC^TGTGAATTTTAAGGAAGAA 
AGACTTGTTGAAAACTTGAGATGGGTAGGAGTATCAAGCAAGGAACTTGAAAAGAGCTTC 
GTTGAAGAACACTCAACGGTAATTCCTATAGAAGATATTTGGAGATATCATAATGATAAT
CAAGAACAAGAACATCATGATCAAGATGGTATGGACGTTAACAACAACAATGGAGATGTG 
GATGATGCTTTCACACTCGAGTTTTCGGAAMCGAACATAACGAGAATCTTTTGGACAAG 
AACGATCATGAGACAACGAGTTCCTCATGTTTTGAGGTGGTAAAAAAAGTTGAGGTTAGC
CATGGATTGTTTGTCACAACTCGTCAGGTAACCAACACATTCTTCCAACAGATAGTACCA
TCGCAAACCGTTATAGTTTATATAAATCCGACGGATGGCAATGAGTGTTGTCATAGTATG 
ACATCAAAAGAGGAGGTTCATGTCCGTAAAAAGATAAATCCGCGAATCAACGGAGTAAGC 
TCAACAGTTCTTGGACAATGGAGAAAATTCGCGCATGTTATTGGCTTCATTCCTATGCTT 
CTATTGATGCGTTGTGTTCATCGAGGTAACTCTAACAAAAACAGAGGCAGTGAAGGTTAC 
TCGAGGCAGCCTACGAGAGGAGATTGTAACAATCGGGGAACAATACTCATGATGGAAAAT 
GCTGTCGTGAGAAGAAAAATTTGGAAGAAGAAGAAAGAGAAAAATATGGTTGACGT^ACAA 
GGTTTTCGGTTTCAAGATAGTTTCGTATTGAAGAAGTTGGGGCTTTCTCTTGCTATCATC 
TTAGCTGTTTCTACCATAAGTCTTATTTGAATACTGAGGTTCAATATATCATATATGGCT 
TTTCACTTTTCTATTGTACTCCCATTTGCCTAGGTCGTATGC 

>G958 Amino Acid Sequence (conserved domain in AA coordinates: 7-156)

MAPVSMPPGFRFHPTDEELVIYYLKRKINGRTIELEIIPEIDLYKCEPWDLPGKSLLPSK 

DLEWFFFSPRDRKYPNGSRTNRATKAGYWKATGKDRKVTSHSRMVGTKKTLVYYRGRAPH

GSRTDWVMHEYRLEEQECDSKSGIQDAYALCRVFKKSALANKIEEQHHGTKKNKGTTNSE 

QSTSSTCLYSDGMYENLENSGYPVSPETGGLTQLGNNSSSDMETIENKWSQFMSHDTSFN 

FPPQSQYGTISYPPSKVCIALECARLQNRMLPPVPPLYVEGLTHNEYFGNNVANDTDEML 

SKIIALAQASHEPRNSLDSWDGGSASGNFHGDFNYSGEKVSCLEANVEAVDMQEHHVNFK 

EERLVENLRWVGVSSKELEKSFVEEHSTVIPIEDIWRYHJSnDNQEQEHHDQDGMDWNNN

DVBDAFTLEFSENEHNENLLDKNDHETTSSSCFEVVKKVEVSHGLFVTTRQVTNTFFQQI 

VPSQTVIVYINPTDGNECCHSMTSKEEVHVRKKINPRINGVSSTVLGQWRKFAHVIGFIP 

MLLLMRCVHRGNSNKNRGSEGYSRQPTRGDCIWRGTILMMENAv^ 

EQGFRFQDSFVLKKLGLSLAIILAVSTISLI* 

>G1037 (1..1722) 

ATGACTGTTGAACAAAATTTAGAAGCTTTGGATCAGTTTCCTGTAGGAATGAGAGTTCTT 

GCTGTTGATGATGACCAAACTTGTCTCAAAATCCTTGAATCTCTCCTTCGTCACTGCCAA 

TACCATGTAACAACGACGAACCAAGCACAAAAGGCTTTAGAGTTATTGAGAGAGAACAAG 

AACAAGTTTGATCTGGTTATTAGTGATGTTGACATGCCTGACATGGATGGTTTCAAACTC

CTTGAGCTTGTTGGTCTTGAAATGGACCTACCTGTCATAATGTTGTCTGCGCATAGTGAT 

CCAT^AGTATGTGATGAAGGGAGTTACTCATGGTGCTTGTGATTATCTACTGAAGCCGGTT 

CGTATTGAGGAGTTGAAGAACATATGGCAACATGTCGTGAGAAGTAGATTTGATAAGAAC 

CGTGGGAGTAATAATAATGGTGATAAGAGAGATGGATCAGGTAATGAAGGTGTTGGGAAT 

TCTGATCCGAACAATGGGAAAGGTAATAGAAAACGTAAAGATCAGTATAATGAAGATGAG 

GATGAGGATAGAGATGATAATGATGATTCGTGTGCTCAAAAGAAGCAACGTGTTGTTTGG 

ACTGTTGAGCTGCATAAGAAATTTGTTGCAGCTGTTAACCAATTGGGATATGAGAAGGCT 

ATGCCTAAAAAGATTTTGGATCTGATGAATGTTGAGAAGCTCACTAGAGAAAATGTGGCC 

AGTCATCTTCAGAAATTCCGCCTTTACTTGAAGAGGATCAGTGGTGTGGCTAATCAGCAA 

GCTATTATGGCAAACTCTGAGTTACATTTTATGCAAATGAATGGACTTGATGGTTTCCAT 

CACCGCCCAATCCCTGTTGGATCTGGTCAGTACCATGGTGGGGCTCCTGCAATGAGATCT 

TTCCCTCCAAACGGGATTCTTGGCAGACTCAATAGCTCTTCGGGGATCGGTGTCCGCAGC 

CTTTCTTCTCCrrCCTGCAGGAATGTTCTTGCAAAACCAGACCGATATCGGAAAGTTTCAC 

CATGTCTCATCACTTCCTCTTAACCACAGTGATGGAGGAAACATACTTCAAGGGTTGCCA 

ATGCCTTTAGAGTTCGACCAGCTTCAGACAAACAACAACAAAAGTAGAAACATGAACAGT 

AACAAGAGCATTGCTGGGACCTCCATGGCTTTTCCTAGCTTCTCT 

CTCATCAGTGCTCCTAATAACAATGTCGTGGTTCTAGAAGGTCACCCACAAGCAACTCCT 
CCAGGCTTCCCAGGACACCAGATCAATAAACGTTTGGAGCATTGGTCAAATGCTGTATCC 
TCTTCGACTCACCCTCCTCCCCCGGCACATAACAGTAATAGTATCAATCATCAGTTCGAT 
GTCTCTCCATTACCGCATTCTAGACCCGACCCCTTGGAATGGAACAATGTGTCATCAAGC 
TACTCTATACCATTCTGTGACTCTGCCAATACATTGAGTTCTCC^ 

AATCCCCGAGCTTTCTGTAGAAACACGGACTTCGATTCAAACACAAATGTGCAACCTGGA 
GTCTTTTATGGTCCATCCACGGATGCTATGGCTCTGTTGAGTAGTAGTAACCCGAAAGAA 
GGGTTCGTCGTAGGCCAACAGAAGTTACAGAGTGGTGGATTCATGGTTGCAGATGCTGGT 
TCCTTAGATGATATAGTCAACTCCACGATGAAGCAGGTGTGA 

>G1037 Amino Acid Sequence (domain in AA coordinates: 11-134, 200-248) 
MTVEQNLEALDQFPVGMRVLAVDDDQTCLKILESLLimCQYHTCTTNQAQKALELLRENK 
NKFDIiVISDVDMPDMDGFKLLELVGLEMDLPVIMLSAHSDPKYVMKGVTHGAOT 
RIEELKNIWQHWRSRFDKNRGSNNNGDKRDGSGNEGVGNSDPNNGKGNRKRKDQYNEDE 
DEDRDDNDDSCAQKKQRVWTVELHKKFVAAVNQLGYE LDLMNVEKLTRENVA 






SHLQKFRLYLKRISGVANQQAIMANSELHFMQMNGLDGFHHRPIPVGSGQYHGGAPAMRS 

FPPNGILGRLNSSSGIGVRSLSSPPAGMFLQNQTDIGKFHHVSSLPLNHSDGGNILQGLP 

MPLEFDQLQTNNNKSRNMNSNKSIAGTSMAFPSFSTQQNSLISAPNNNVVVLEGHPQATP 

PGFPGHQINKRLEHWSNAVSSSTHPPPP7U1NSNSINHQFDVSPLPHSRPDPLEWNNVSSS 

YSIPFCDSANTLSSPALDTTNPRAFCRNTDFDSNTMVQPGVFYGPSTDT^MALLSSSNPKE 

GFWGQQKLQSGGFMVADAGSLDDIVNSTMKQV* 

>G2065 (33..1124)

AACCACACAAAACAAAACAAAAAAACATATTGATGGGGATGAAGAAGGTAAAGCTATCTT 
TGATAGCTAATGAAAGATCAAGG7VAAACATCCTTCATGAAGAGGAAAAACGGGATATTCA 
AGAAACTCCACGAGTTGTC7UVCTCTATGTGGTGTCCAAGCTTGTGCTCTCATCTATAGTC 
CATTCATACCGGTTCCAGAGTCATGGCCGTCAAGGGAAGGTGCTAAAAAGGTAGCTTCAA 
AGTTTCTGGAGATGCCGCGGACAGCCCGAACCAGGAAGATGATGGATCAAGAAACCCATC 
TTATGGAGAGGATTACCAAAGCAA7VAGAGCAACTAAAGAATTTGGCTGCTGAGAACCGAG 
AATTACAGGTTAGACGATTTATGTTTGATTGTGTTGAAGGCAAAATGTCCCAGTATCGTT 
ATGATGCAAAAGACCTTCAAGATTTGCTATCTTGTATGAATCTATATCTCGATCAGCTTA 
ACGGAAGGATCGAGTCCATTAAAGAAAACGGTGAGTCGTTGTTGTCTTCCGTCTCTCCTT 
TTCCTACTAGAATTGGTGTTGACGAAATTGGTGATGAGTCGTTTTCCGACTCTCCTATTC 
ATTCTACAACTAGGGTTGTAGATACTCCTAATGCTACCAATCCTCATGTTCTTGCGGGCG 
ATATGACTCCTTTTCTTGATGCGGACGCAAATGCGGTAACTGCTCCCAGTCGATTTTCTG 
ATCATATTCMTATGAAAATATGAATATGAGTC7UUUVTCTGCATGAACCGTTTCAACACC 
TTGTTCCTACTAACGTTTGTGATTTTTATCAAAATCAGAATATGAATCAGGTTCAATACC 
AGGCTCCTAATAATCTGTTTAATCAGATTCAACGAGAATTCTACAACATAAATTTGAATC 
TGAATTTGAATCTGAATTCAAATCAGTATCTGAATCMCAACAATCATTCATGAATCCGA 
TGGTGGAACAACATATGAATCATGTTGGAGGGCGTGAAAGCATTCCTTTCGTGGACAGAA 
ACTACTACAACTACAATCAACTACCAGCCGTTGATCTTGCTTCCACCAGTTACATGCCTT 
CAACCACCGATGTTTATGATCCTTACATCAACAACAATCTCTAATCACAAT^AGACGGAGA 
TTTTCTAGTTTAA 

>G2065 Amino Acid Sequence (domain in AA coordinates: TBD)
MGMKKVKLSLIANERSRKTSFMKRKWGIFKKLHELSTLCGVQACALIYSPFIPVPESWPS 
REGAJKKVASKFLEMPRTARTRKMMDQEraLMERITKAKEQ 

VEGKMSQYRYDAKDLQDLLSCMNLYLDQLNGRIESIKENGESLLSSVSPFPTRIGVDEIG 
DESFSDSPIHSTTRVVDTPNATNPHVLAGDMTPFLDADANAVTAPSRFSDHIQYENMNMS 
QNLHEPFQHLVPTNVCDFYQNQNMNQVQYQAPNNLFNQIQREFYNINLNLNLNLNSNQYL 

NQQQSFMNPMVEQHMNHVGGRESIPFVDRNYYNYNQLPAVDLASTSYMPSTTDVYDPYIN 
NNL* 

>G2137 (77..1123)

GGGATTTGACTTTAGCACTTCA/U^ATCCAAAGCTAAAAGACAAAAAAGAATAGAGGTTCG 
ATTTGCATCTCCATTAATGGGCATCGATCTTTCTCTTAAGCTCGAGGCCGAGGAGAA7VAA 
GAAAGAGATAGAAGGATCGAAACATAGCCGTGAGAACAAAGAAGACGAAGAACATGATGC 
TAGTGGTGATGAAGATGAACAAATGGTGAAAGAAGACGAAGATGATTCTTCTTCTTTAGG 
TTTAAGAACCCGAGAAGAAGAAAACGAACGTGAAGAGCTCTTGCAGCTACAGATCCAGAT 
GGAAAGTGTGAAAGAAGAGAATACTAGGTTGAGGAAGCTTGTCGAGCAGACTCTTGAAGA 
TTATCGTCATCTTGAGATGAAATTCCCGGTTATCGATAAAACCAAGAAGATGGATCTTGA 
AATGTTCCTTGGAGTACAAGGCAAACGATGTGTGGATATAACAAGTAAGGCTCGGAAAAG 
AGGAGCTGAGAGATCTCCGTCAATGGAAAGAGAAATAGGGCTTTCACTTTCTCTAGAGAA 
AAAACAGAAACAAGAAGAGAGCAAAGAAGCTGTTCAGTCTCATCACCAAAGATACAATAG 
TAGCAGCTTAGATATGAATATGCCACGTATCATTTCATCTTCTCAAGGTAATAGAAAGGC 
CAQGGTGTCCGTGAGGGCGAGATGTGAGACCGCAACAATGAATGATGGATGCCAATGGAG 
GAAGTACGGTCAGAT^AACCGCGAAAGGGAATCCATGTCCTCGAGCTTATTACCGATGCAC 
CGTGGCTCCAGGATGTCCCGTTAGAAAACAGGTGCAAAGGTGTTTAGAAGACATGTCAAT 
ACTGATAACAACCTACGAAGGAACACATAACCATCCACTTCCGGTCGGAGCAACAGCCAT 
GGCTTCCACTGCCTCTACTTCTCCATTCTTGTTACTCGATTCCAGTGACAACCTCTCTCA 
TCCTTCCTATTACC^VAACTCCTCAAGCCATAGACTCTTCTTTGATTACATACCCACAAAA 
TAGCAGCTACAACAATCGAACCATAAGAAGCTTGAACTTTGATGGTCCATCTAGAGGAGA 
TCACGTTTCATCTTCTCAAAACCGATT7VAATTGGATGATGTAGAGTTTCCTATATCTCTA 

GTTTCTAACATTTATGTTTCGTATA 
>G2137 Amino Acid Sequence (conserved domain in AA coordinates: 109-168)

MGIDLSLKLEAEEKKKEIEGSICHSRENKEDEEHDASGDEDEQMVKEDEDDSSSLGLRTRE 

EENEREELLQLQIQMESVKEENTRLRKLVEQTLEDYRHLEMKFPVIDKTK1CMDLEMFLGV 

QGKRCVDITSKARKRGAERSPSMEREIGLSLSLEKKQKQEESKEAVQSHHQRYNSSSLDM 

NMPRIISSSQGNRKARVSVRARCETATMNDGCQWRKYGQKTAKGNPCPRAYYRCTVAPGC 

PVRKQVQRCLEDMSILITTYEGTHNHPLPVGATAMASTASTSPFLLLDSSDNLSHPSYYQ 

TPQAIDSSLITYPQNSSYNNRTIRSLNFDGPSRGDHVSSSQNRLNWMM* 

>G746 (1..1311) 

ATGGGTGAGGAGTTAGCTGACACT^ATGAACCTGGATTTGAATCTTGGGCCTGGTCCTGAG 

TCTGATCTCCAACCTGCACCAAACGAGACTGTGAATTTGGCTGATTGGACTAATGACCCG 

CCTGAGAGATCTTCTGAAGCTGTGACTVAGGATCAGGACTCGGCATAGGACACGGTTCAGA 

CAGCTTAATCTCCCGATCCCGGTTCTATCTGAAACCCATACCATGGCTATAGAGCTCAAC 

CAGTTGATGGGAAATTCTGT7\AATAGAGCTGCTATGCAGACTGGTGAGGGTAGTGAAAGA 

GGCAATGAGGATTTGAA^TGTGTGAGAATGGCGATGGAGCCCTTGGGGACGGTGTATTG 

GATAAGAAAGCGGATGTCGAGAAAAGCAGTGGCAGCGACGGTAACTTTTTCGATTGTAAT 

ATATGTTTGGATTTGTCGAAGGAGCCGGTTCTCACCTGTTGTGGTCATCTTTACTGTTGG 

CCTTGTCTGTACCAATGGTTACAAATTTCGGATGCAAAGGAATGTCCTGTTTGTAAAGGA 

GAGGTGACCTCCAAAACCGTGACACCGATCTATGGACGTGGAAACCACAAGAGAGAAATT 

GAAGAGAGTTTAGATACTAAGGTCCCCATGAGACCACACGCGAGACGCATTGAGAGCTTG 

AGGAATACAATTCAAAGGTCGCCTTTTACAATACCAATGG7VAGAAATGATTAGACGTATA 

CAGAATAGGTTTGACAGGGATTCAACCCCAGTCCCTGATTTTAGTAACCGAGAGGCATCA 

GAAAGAGTCAACGATCGAGCCAATTCGATCCTTAACCGGTTGATGACATCTAGGGGAGTT 

AGATCAGAGCAGAACCAGGCTAGTGCTGCAGCAGCAGCCATTGTCGCAGCATCAGAGGAT 

ATTGATCTAAATCCAAACATTGCTCCTGATCTTGAAGGAGAAAGCAACACGAGATTCCAT 

CCTCTCTTGATCAGGAGACAGTTACAGTCGCACCGAGTTGCAAGGATCTCGACTTTCACT 

TCTGCGTTGAGTTCAGCTGAGAGGCTTGTGGATGCGTATTTTAGGACTCATCCGTTGGGG 

AGGAACCACCAAGAGCAAAACCATCATGCTCCTGTTGTGGTTGATGATAGAGACTCATTC 

TCAAGCATTGCAGCTGTTATAAACTCTGAGAGTCAAGTGGATACTGCAGTTGAGATCGAT 

TCTATGGCTCTTTCGACATCGTCCTCGAGGAGAAGGAATGAGAATGGTTCGAGGGTTTCT 

GATGTAGACAGTGCAGATTCTCGTCCGCCTAGGAGAAGGAGATTTACTTGA 

>G746 Amino Acid Sequence (domain in AA coordinates: 139-178) 

MGEELADTMl^DLNLGPGPES 

QLNLPIPVLSETHTMAIELNQLMGNSWRAAMQTGEGSERGNEDLKMCENGDGALGDGVL 

DKKADVEKSSGSDGNFFDCNICLDLSKEPVLTCCGHLYCWPCLYQWLQISDAKECPVCKG 

EVTSKTVTPIYGRGNHKREIEESLDTKVPMRPHARRIESLRNTIQRSPFTIPMEEMIRRI 

QNRFDRDSTPVPDFSl^ASERVlJDRANSIIiNRLMTSRGVRSEQNQASAAAAAIVAASED 

IDLNPNIAPDLEGESNTRFHPLLIRRQLQSHRVARISTFTSALSSAERLVDAYFRTHPLG

RNHQEQJSTHHAPVVTODRDSFSSIAAVINSESQTO^ 

DVDSADSRPPRRRRFT* 

>G2701 (46..837)

GTGTTTGTAGTTGAAACTTATTCTTCCCTTTTTTTGTTTTTAGGTATGGAGACTCTGCAT 
CCATTCTCTCACCTACCTATCTCTGACCACCGGTTCGTTGTTCAAGAGATGGTGAGCTTA 
CACAGCTCGAGTAGCGGTAGCTGGACTAAAGAAGAGAACAAGATGTTCGAACGAGCTCTT 
GCGATATACGCTCAAGACTCGCCTGATCGCTGGTTTAAAGTTGCTTCCATGATCCCTGGA 
AAGACTGTTTTTGATGTTATGAAGCAATATAGTAAGCTTGAAGAAGACGTTTO 
GAAGCAGGACGTGTTCCCATTCCTGGTTATCCTGCAGCTTCTTCTCCCTTGGGGTTTGAC 
ACGGACATGTGTCGTAAACGGCCTAGTGGAGCTAGAGGATCTGATCAAGATCGAAAGAAA 
GGAGTCCCTTGGACAGAGGAAGAACAGAGGAGATTCTTGTTAGGCCTTCTCi^ 
AAAGGAGATTGGAGAAACATATCGAGAAACTTCGTGGTGTCAAAGACGCCAACGCAAGTG 
GCGAGCCACGCCCAAAAGTATTACCAGAGACAGCTCTCCGGAGCCAAGGACAAACGCAGG 
CCAAGTATCCATGACATGACAACCGGCAATC 
TCCGATCATAGAGATATTCTCCCTGATTTAGGGTTTATCGATAAGGATGATACGGAGGAG
GGAGTAATATTTATGGGTCAGAATCTCTCTTCAGAAAATCTGTTTTCTCCATCACCAACT 
TCATTCGAAGCTGCCATTAACTTCGCCGGAGAAAATGTCTTCAGTGCCGGAGCTTAAGGC 
AACATAGAATCCCCAAACTCAGCGGC 

>G2701 Amino Acid Sequence (domain in AA coordinates: 33-81, 129-183) 
METLHPFSHLPISDHRFVVQEMVSLHSSSSGSWTKEENKMFERALAIYAEDSPDRWFKVA 
SMIPGKTVFDVMKQYSKLEEDVFDIEAGRVPIPGYPAASSPLGFDTDMCRKRPSGARGSD

QDRKKGVPWTEEEHRRFLLGLLKYGKGDWRNISRNFWSKTPTQVASHAQKYYQRQLSGA 

KDKRRPSIHDITTGNLLNANLNRSFSDHRDILPDLGFIDKDDTEEGVIFMGQNLSSENLF 

SPSPTSFEAAINFAGENVFSAGA* 

>G1819 (1..639) 

ATGGAAGAGAACAACGGCAACAACAACCACTACCTGCCGCAACCATCGTCTTCCCAACTG 
CCGCCGCCACCATTGTATTATCAATCAATGCCGTTGCCGTCATATTCACTGCCGCTGCCG 
TACTCACCGCAGATGCGGAATTATTGGATTGCGCAGATGGGAAACGCAACTGATGTTAAG 
CATCATGCGTTTCCACTAACCAGGATAAAGAAAATCATGAAGTCCAACCCGGAAGTGAAC 
ATGGTCACTGCAGAGGCTCCGGTCCTTATATCGAAGGCCTGTGAGATGCTCATTCTTGAT 
CTCACAATGCGATCGTGGCTTCATACCGTGGAGGGCGGTCGCCAAACTCTCAAGAGATCC 
GATACGCTCACGAGATCCGATATCTCCGCCGCAACGACTCGTAGTTTCAAATTTACCTTC 
CTTGGCGACGTTGTCCCAAGAGACCCTTCCGTCGTTACCGATGATCCCGTGCTACATCCG 
GACGGTGAAGTACTTCCTCCGGGAACGGTGATAGGATATCCGGTGTTTGATTGTAATGGT 
GTGTACGCGTCACCGCCACAGATGCAGGAGTGGCCGGCGGTGCCTGGTGACGGAGAGGAG 
GCAGCTGGGGAAATTGGAGGAAGCAGCGGCGGTAATTGA 

>G1819 Amino Acid Sequence (domain in AA coordinates: 46-188)
MEENNGNNtmYLPQPSSSQLPPPPLYYQ 

HHAFPLTRIKKIMKSNPEVNWTAEAPVLISKACEMLILDLTMRSWLHTVEGGRQTLKRS 
DTLTRSDISAATTRSFKFTFLGDWPRDPSWTDDPVLHPDGEVLPPGTVIGYPVFDCNG 
VYASPPQMQEWPAVPGDGEEAAGEIGGSSGGN* 
>G1227 (372..1451)

TCTTCCGTGTGTTAACAGAAGTCCCCACAATTGTCTGTCTTCGCTGCGAGACAAAACTGC 
CACAGCCAATAATGTTTCTCTGAGGGACCTTGCTTCTGTCAGAGACTCGCTCTCTCTCTC 
CTCTTCTTGCTCTGCTCAGCTCTCTCACCAACTCATCTTCAGTCCTCAAACAAACATCTG 
TTCTCATCTTTGTTTTCTTTCCTTTCTTTCTCATATCTCATTTTCAATTTTCCCAATTTC 
TCTTCAACATCTTCATAGCAATTTAAGACCACTATTCCATTATAAAGCTAACTGCTTTAG 
AAACTCCTCACATTTATTTCTTCCCCATCATTGTTTTAGAGAGGGAGAAAGAAAAAGAGC 
TCAGCTTTCTGATGGAGAGGAGTATTCAAGGACAAAACAAGCTCTGTTGTTTGGACCAAA 
AAGTGAATGTGAGAAGAAGCCTACAAGTTCAAGAAACTGTAGAGGATCATCAAAGCTTTG 
CCCTTGAAGAGGAAGAACAACAACTCTCAACTCCGAGCTTGCTGCAAGACACAACAATAC 
CATTTCTACAAATGCTGCAACAAAGTGAAGACCCTTCACCGTTTTTGTCATTCAAAGACC 
CAAGCTTTCTAGCACTACTATCTCTCCAGACACTTGAAAAGCCTTGGGAACTCGAAAACT 
ACCTCCCACATGAAGTTCCAGAGTTTCATTCACCGATCCATTCTGAMCCAACCACTACT 
ATCATAATCCATCTTTGGAAGGAGTCAATGAAGCCATCTCAAACCAAGAACTTCCATTCA 
ACCCACTAGAGAATGCGCGTTCAAGACGCAAGCGGAAAAACAACAACTTGGCATCATTGA 
TGACAAGAGAAAAGCGAAAGAGAAGAAGAACTAAACCAACAAAGAACATAGAAGAGATAG 
AGAGTCAAAGAATGACACACATTGCGGTTGAACGAAACCGCAGACGCCAAATGAACGTTC 
ATCTGAACTCACTCCGCTCCATCATTCCATCTTCATACATCCAGAGGGGAGACCAAGCGT 
CAATAGTAGGAGGAGCAATAGACTTCGTAAAGATCCTAGAGCAACAGTTGCAATCCCTTG 
AAGCACAAAAGAGAAGT(^(^GAGTGATGATAA(^^GAGCAAATTCCAGAAGATAAC^ 
GTCTCAGGAACATTTCGTCGAACAAGTTGCGTGCGAGTAATAAAGAAGAACAAAGTAGCA 
AACTCAAAATCGAAGCCACAGTGATAGAGAGTCACGTCAACCTAAAAATTCAATGTACGA 
GGAAACAAGGACAACTTCTCAGATCAATCATATTGCTGGAGAAACTTCGATTCACTGTTC 
TTCATCTCAACATCACATCTCCGACCAATACATCTGTCTCTTATTCCTTCAACCTCAAGA 
TGGAAGATGAATGTAATTTGGGATCAGCGGATGAGATAACGGCGGCGATTCGTCAGATTT 
TCGACAGCTGATTGACTAATCCAAGTAAAAAGTAAAATAAAAAAAGAAACGTTTACTTTG 
GTAACTTCGTTTTCATGATTAAATTCTTTATTTGGTCGTATGTGATTGGAGTCTTCTCGG 
CATGGAACTTGACTTTGGTTTTAGGGTACTAGTCTCTACAGAAGCTGTGGTCCTTCTTTG 
GATGC 

>G1227 Amino Acid Sequence (domain in AA coordinates: 183-244) 

MERSIQGQNKLCCLDQKVNVRRSLQVQETVEDHQSFALEEEEQQLSTPSLLQDTTIPFLQ 

MLQQSEDPSPFLSFKDPSFLALLSLQTLEKPWELENYLPHEVPEFHSPIHSETNHYYHNP 

SLEGWEAISNQELPFNPLENARSRRKRKI^IASLMTREKRKRRRTKPTKNIEEIESQR 

MTHIAVERNRRRQMNVHLNSLRSIIPSSYIQRGDQASIVGGAIDFVKILEQQLQSLEAQK 

RSQQSDDNKEQIPEDNSLRNISSNKLRASNKEEQSSKLKIEATVTESHVNLKIQCTRKQG

QLLRSIILLEKLRFTVLHLNITSPTNTSVSYSFNLKMEDECNLGSADEITAAIRQIFDS* 
>G2417 (118..1311)

CATACCGGTGGAAGATTCTGCTTTACTACGCTCTCCGCTTCTTCTTCTCCTCGATTCGAT 
TCTCCTCATGGGTTTATCATGAATTTTTAGGTTTTGAGTAATTCAGAAACTCGAGTGATG 
ATCCCGAATGATGATGATGATGCAAATTCTATGAAGAATTATCCGTTAAATGATGATGAT 
GCAAATTCTATGAAGAATTATCCGTTAAATGATGATGATGCAAATTCTATGGAGAATTAT 
CCGTTAAGGTCAATTCCGACGGAGCTTTCACACACTTGTTCATTGATACCACCTTCTTTA 
CCAAACCCTTCAGAAGCAGCAGCAGACATGTCCTTCAATTCAGAACTCAATCAAATCATG 
GCAAGGCCTTGTGATATGCTCCCTGCCAATGGTGGAGCTGTTGGTCATAACCCTTTTTTG 
GAACCAGGATTCAACTGCCCCGAGACAACAGATTGGATTCCCTCTCCACTCCCCCATATT 
TATTTTCCTTCGGGTTCTCCCT^ATCTAATAATGGAGGATGGTGTCATTGATGAGATTCAC 
AAACAAAGTGACTTGCCACTTTGGTATGACGACTTGATTACCACTGATGAAGATCCACTC 
ATGTCTAGTATCTTGGGCGATCTTCTCCTTGACACTAATTTCAACTCAGCTTCAAAGGTG 
CAGCAACCAAGTATGCAATCGCAGATTCAACAACCCCAAGCTGTTCTGCAGCAGCCTTCT
TCTTGTGTGGAATTGCGCCCACTTGATAGGACAGTATCCTCAAACAGCT^ACAACAATAGC 
AACAGTAATAATGCAGCAGCAGCAGCTAAGGGACGTATGCGTTGGACGCCTGAACTTCAT 
GAGGTTTTTGTTGACGCTGTTAACCAGCTCGGTGGCAGTAATGAAGCAACTCCTAAAGGT 
GTCCTGAAGCATATGAAAGTCGAAGGTTTGACTATTTTTCATGTCAAAAGTCATTTGCAG 
AAATATAGAACAGCTAAATATATACCAGTACCATCAGAAGGTTCGCCGGAGGCAAGGTTG 
ACACCGCTTGAGCA7VATTACATCTGATGATACGAAACGTGGGATAGATATCACTGAGACT 
CTGCGAATTCAGATGGAACATCAGAAGAAACTGCATGAGCAGCTTGAGAGTCTAAGAACA 
ATGCAACTTCGGATAGAAGAGCAAGGAAAGGCGCTGTTGATGATGATTGAGAAGCAAAAT 
ATGGGTTTCGGCGGACCAGAACAAGGAGAGAAAACAAGTGCGAAAACGCCTGAAAATGGT 
TCAGAGGAGTCGGAATCCCCGCGGCCAAAGCGTCCGAGAAATGAAGAATGAAGGAAACCT 
TTCTTCGGATGGTAGATCATAAAACTGTGGTTTTGGTGGAGTTGTAGAGTATGACTTATT 
AGGAGTAGAGCTTTCAGTCTTCTTCAGGC 

>G2417 Amino Acid Sequence (domain in AA coordinates: 235-285) 

MIPNDDDDANSMKNYPLNDDDANSMK I PTELSHTCSLI PPS 

LPNPSEAAADMSFNSELNQIMARPCDMLPANGGAVGHNPFLEPGFNCPETTDWIPSPLPH 

IYFPSGSPNLIMEDGVIDEIHKQSDLPLWYDDLITTDEDPLMSSILGDLLLDTNFNSASK 

VQQPSMQSQIQQPQAVLQQPSSCVELRPLDRTVSSNSNNNSNSNNAAAAAKGRMRWTPEL 

HEVFVDAVNQLGGSNEATPKGVLKHMICVEGLTIFHVKSHLQKyRTAKYIPVTSEGSPE^ 

LTPLEQITSDDTKRGIDITETLRIQMEHQKKLHEQLESLRTMQLRIEEQGKALLMMIEKQ 

NMGFGGPEQGEKTSAKTPENGSEESESPRPKRPRNEE* 

>G2116 (104..1117)

TTCATCTCCATCATTATCTCCATTGACATTGTTCTCAATTGCGAATAATAATCATAATTA 

TTCACACAACCAAAGCATTCATCTCTCAGATTCTCTTAAAAAAATGGAGAAATCAGATCC 

TCCACCAGTCCCAAAGCCCGGCGCCACTATTATCCCCTCCTCCGATCCAATTCCTAATGC 

CGATCCGATTCCATCTTCTTCCTTCCACCGCCGATCTCGCTCCGACGATATGTCCATGTT 

CATGTTCATGGATCCCCTCTCCTCCGCCGCACCACCTTCCTCCGACGACCTTCCCTCCGA 

CGACGATCTCTTCTCTTCTTTCATCGATGTCGATAGCCTCACCTCTAATCCCAATCCCTT 

TCAAAATCGTTCCCTCTCCTCCAACTCCGTTTCCGGCGCTGCTAATCCTCCTCCTCCTCC 

TTCCTCTCGTCCTCGCCACCGTCACAGCAATTCCGTTGACGCTGGATGCGCCATGTATGC 

CGGTGATATCATGGACGCTAAGAAAGCTATGCCTCCTGAAAAACTCTCTGAGCTTTGGAA 

CATCGATCCCAAACGCGCCTUUWVGGATTCTAGCGAATCGACAATCTGCAGCTCGATCC^A 

AGAGAGAAAAGCTCGATACATTCAAGAACTTGAGCGCAAAGTTCAATCTCTTCAAACCGA 

AGCTACCACTCTCTCTGCTC^GCTTACTCTCTACCAGAGAGACACA^TGGACTAGCAAA 

CGAAAACACAGAGCTGAAACTTAGGTTGC^GCTUVTGGAAC^a^GCTCAGCTTCGTAA 

TGCTTTAAACGAAG€GTTGAGGAAAGAAGTTGAAAGGATGAAGATGGAGACAGGAGAAAT 

CTCTGGTAATT(^GATTCGTTTGATATGGGAATGCAGCAGATTCAGTATTCTTCCTCAAC 

TTTCATGGCTATTCCACCATATCATGGCTCAATGAACCTCCATGATATGCAGATGCATTC 

TAGTTTCAATCCTATGGAGATGTCCAATTCTCAAAGCGTGTCGGACTTTCTACAGAACGG 

CCGAATGC^GGGOTGGAGATTAGTAGCAATAGCTCAAGCTTAGTCAAATCTGAAGGACC 

TTCTCTCTCTGCTAGTGAGAGTAGCTCTGCCTATTGACGACAAGATTATGATGAGGCTCA 
TTTTTCTG 

>G2116 Amino Acid Sequence (conserved domain in AA coordinates: 150-210)

MEKSDPPPVPKPGATIIPSSDPIPNADPIPSSSFHRRSRSDDMSMFMFMDPLSSAAPPSS 

DDLPSDDDLFSSFIDVDSLTSNPNPFQNPSLSSNSVSGAANPPPPPSSRPRHRHSNSVDA 
GCAMYAGDIMDAKKAMPPEKLSELWNIDPKRAKRILANRQSAARSKERKARYIQELERKV

QSLQTEATTLSAQLTLYQRDTNGLANENTELKLRLQAMEQQAQLRNALNEALRKEVERMK 

METGEISGNSDSFDMGMQQIQYSSSTFMAIPPYHGSMNLHDMQMHSSFNPMEMSNSQSVS 

DFLQNGRMQGLEISSNSSSLVKSEGPSLSASESSSAY* 

>G647 (1..948)

ATGATGATCGGCGAAAATAAAAACCGGCCACATCCMCGATCCATATCCCTCAATGGGAT 

CAAATCAACGATCCAACGGCCACAATCTCTTCACCATTCTCTTCCGTCAACCTTAACAGC 

GTTAACGACTACCCACACTCTCCGTCACCGTATCTCGACTCCTTCGCTTCTCTCTTCCGT 

TACCTCCCGTCAAACGAGTTAACAAACGATTCAGACTCATCAAGTGGCGACGAGTCATCA 

CCACTCACCGACTCATTCTCCTCCGACGAGTTTCGCATCTACGAGTTCAAAATCCGGCGA 

TGCGCTCGAGGTCGATCTCATGATTGGACGGAGTGTCCGTTCGCACATCCCGGAGAAAAA 

GCTCGACGACGTGATCCGAGAAAGTTTCATTACTCCGGCACCGCTTGTCCTGAGTTTCGT 

7^AAGGAAGTTGTAGAAGAGGTGATTCGTGTGAGTTCTCTCATGGAGTTTTCGAGTGTTGG 

CTCCATCCTTCTCGTTACCGTACTCAGCCGTGTAAAGACGG7^ACTAGCTGCCGGAGAAGA 

ATCTGTTTCTTCGCTCATACGACGGAGCAGTTACGTGTATTACCTTGTTCGTTAGATCCA 

GATCTTGGATTCTTCTCAGGATTAGCTACTTCTCCGACTTCGATTCTTGTTTCTCCTTCG 

TTTTCACCACCGTCGGAATCTCCGCCGCTTTCTCCGAGTACCGGTGAACTTATTGCGTCG 

ATGAGGAAAATGCAATTGAACGGAGGTGGTTGTTCGTGGAGTTCTCCGATGAGATCTGCA 

GTTAGGTTACCTTTTTCGTCGTCTCTGCGTCCGATTCAGGCGGCAACGTGGCCGAGGATA 

AGAGAGTTTGAGATCGAAGAAGCTCCGGCGATGGAATTTGTGGAATCTGGGAAAGAGCTG 

AGAGCGGAGATGTATGCAAGACTCAGTAGAGAGAACTCACTCGGTTGA 

>G647 Amino Acid Sequence (domain in aa coordinates: 77-192) 

MMIGENKNRPHPTIHIPQWDQIl^PTATISSPFSSWLNSVNDYPHSPSPYLDSFASLFR 

YLPSNELTNDSDSSSGDESSPLTDSFSSDEFRIYEFKIRRCARGRSHDWTECPFAHPGEK 

ARRRDPRKFHYSGTACPEFRKGSCRRGDSCEFSHGVFECWLHPSRYRTQPCKDGTSCRRR 

ICFFAHTTEQLRVLPCSLDPDLGFFSGLATSPTSILVSPSFSPPSESPPLSPSTGELIAS 

MRKMQLNGGGCSWSSPMRSAVRLPFSSSLRPIQAATWPRIREFEIEEAPAMEFVESGKEL 

RAEMYARLSRENSLG*

>G974 (377..1162)

AAAAAAAAAGTTGATATACTTTCTGGTTTTCTCCTTAACTTTTATTCTTTACAAATCCAT 
CCCCCTTAGATCTGTTTATTTCCCGCTACTTTGATTCATTTCTGTTAGTAATCTGTCTTT 
CGTATAGAAGAAAACTGATTTCTTGGTTTGTATTTTCTTAAAGAGATCAATCTTTTTTTA 
TTTTTGATCTTCTTGTGTTTTTTTTTCTTTGTAGAATTAATCGTTTGTGAGGGTATTTTT 
TTAATTCCCTCCTCTCAGAAATCTACACAGAGGTTTTTTATTTTATAAACCTCTTTTTCG 
ATTTTCTTGAAAACAAAAAATCCTGTTCTTTACTTTTTTTACAAGAACAAGGGAAAAAAA 
TTTCTTTTTATTAGAAATGACMCTTCTATGGATTTTTACAGTAACyVAAACGTTTCTVACA 
ATCTGATCCATTCGGTGGTGAATTAATGGAAGCGCTTTTACCTTTTATCAAAAGCCCTTC 
CAACGATTCATCCGCGTTTGCGTTCTCTCTACCCGCTCCAATTTCATACGGGTCGGATCT 
CCACTCATTTTCTCACCATCTTAGTCCTAAACCGGTCTCAATGAAACAAACCGGTACTTC 
CGCGGCTAAACCGACGAAGCTATACAGAGGAGTGAGACAACGTCACTGGGGAAAATGGGT 
GGCTGAGATTCGTTTACCGAGGAATCGAACTCGACTTTGGCTCGGAACATTCGACACGGC 
GGAGGAAGCTGCTTTAGCTTATGACAAGGCGGCGTATAAGCTCCGAGGAGATTTTGCGCG 
GCTTAATTTCCCTGATCTCCGTCATAACGACGAGTATCAACCTCTTCAATCATCAGTCGA 
CGCTAAGCTTGAAGCTATTTGTCAAAACTTAGCTGAGACGACGCAGAAACAGGTGAGATC 
AACGAAGAAGTCTTCTTCTCGGAAACGTTCATCAACCGTCGCAGTGAAACTACCGGAGGA 
GGACTACTCTAGCGCCGGATCTTCGCCGCTGTTAACGGAGAGTTATGGATCTGGTGGATC 
TTCTTCGCCGTTGTCGGAGCTGACGTTTGGTGATACGGAGGAGGAGATTCAGCCGCCGTG 
GAACGAGAACGCGCTGGAGAAGTATCCGTCGTACGAGATCGATTGGGATTCGATTCTTCA 
GTGTTCGAGTCTTGTAAATTAGATGTTGCCATAGGGGTATTTTAGGGACTTTAGAGCTCT 
CTGCGATGGAGTTTTTGGTCATTGCAGAGATTTT^ 

TAATATCAAATAAGTTTATCTACTTTGATGTTAATTAGTGTTAATCTCTGCGTCGGTCCA 

AGCTGTTTTTTTTTGGCATGCTTCGACCGTGTGAGATTTC 

CTTGATTTTCTTAGTTCAAGTTAAATTGGCACAAAAAAAAAAAAAAAAAA 

>G974 Amino Acid Sequence (domain in AA coordinates: 81-140) 

MTTSMDFYSNKTFQQSDPFGGELMEALLPFIKSPSNDSSAFAFSLPAPISYGSDLHSFSH

HLSPKPVSMKQTGTSAAKPTKLYRGVRQRHWGKWVAEIRLPRNRTRLWLGTFDTAEEAAL

AYDKAAYKLRGDFARLNFPDLRHNDEYQPLQSSVDAKLEAICQNLAETTQKQVRSTKKSS
SRKRSSTVAVKLPEEDYSSAGSSPLLTESYGSGGSSSPLSELTFGDTEEEIQPPWNENAL

EKYPSYEIDWDSILQCSSLVN* 

>G1419 (27..692)

GAAGACTCCAACATAATTCATCATCTATGGCTTCTTCACATCAACAACAGCAAGAACAAG 
ACCAGTCAGCTTTAGATCTCATAACCCAACACCTTCTTACTGATTTCCCTTCCTTAGACA 
CCTTTGCCTCCACCATCCACCACTGCACCACCTCAACTCTAAGCCAACGCAAACCACCTC 
TTGCCACTATAGCAGTTCCTACTACTGCACCGGTGGTTCAAGAGAATGATCAAAGGCATT 
ACAGAGGCGTCAGGAGAAGACCATGGGGTAAGTATGCGGCTGAGATCAGAGACCCAAACA 
AGAAAGGTGTTCGTGTCTGGTTAGGCACTTTTGACACAGCCATGGAAGCTGCAAGAGGTT 
ATGACAAGGCAGCTTTTAAACTACGAGGAAGCAAAGCTATTCTTAACTTCCCACTTGAAG 
CAGGAAAGCATGAGGACTTGGGAGACAACAAGAAGACTATTTCTTTAAAAGCAAAGAGGA 
AGAGACAGGTGACGGAGGATGAAAGCCAGCTGATCAGCCGTAAAGCTGTTAAGAGGGAAG
AAGCTCAGGTTCAGGCTGATGCTTGTCCATTAACGCCATCAAGTTGGAAGGGGTTTTGGG 
ACGGAGCAGACAGTAAAGACATGGGAATATTTTCCGTGCCTCTGTTATCTCCTTGTCCAT 
CTCTTGGACACTCTCAACTCGTAGTTACTTAAGCTTCAGAGGGTCAAACTGGAAAAAATC 
AACATTGGATTGTTTTCAAAGCTTCTAGATTAGCTGATTGTAAAAAAATGTTTTACTATA 
TTCATTCATTCTTCTTAAATGCAATTCTTTCTACCCTTCC 

>G1419 Amino Acid Sequence (domain in AA coordinates: 69-137) 

MASSHQQQQEQDQSALDLITQHLLTDFPSLDTFASTIHHCTTSTLSQRKPPLATIAVPTT

APVVQENDQRHYRGVRRRPWGKYAAEIRDPNKKGVRVWLGTFDTAMEAARGYDKAAFKLR

GSKAILNFPLEAGKHEDLGDNKKTISLKAKRKRQVTEDESQLISRKAVKREEAQVQADAC 

PLTPSSWKGFWDGADSKDMGIFSVPLLSPCPSLGHSQLVVT*

>G1634 (22..855)

TTATCTCGTAGCCTTTAAACGATGGAGACTCTGCATCCACTACTCTCGCACGTGCCAACT 

TCTGACCACCGGTTTGTAGTTCAAGAGATGATGTGCTTGCAAAGCTCGAGCTGGACTAAA 

GAAGAGAACAAGAAGTTTGAGCGAGCTCTTGCTGTCTACGCTGATGACACGCCTGATCGC 

TGGTTCAAAGTTGCTGCTATGATCCCTGGAAAGACCATATCAGATGTCATGAGGCAATAC 

TCTAAGCTTGAAGAAGACCTCTTCGATATCGAAGCAGGACTTGTCCCGATCCCGGGTTAC 

CGTTCAGTTACTCCTTGTGGATTTGATCAGGTTGTGAGTCCACGTGACTTTGATGCGTAT 

CGTAAACTTCCTAATGGAGCCAGAGGATTTGATCAAGACCGTAGGAAAGGAGTTCCATGG 

ACGGAGGAAGAACACAGGAGATTCTTGTTAGGGCTTCTCAAGTATGGGAAAGGAGATTGG 

AGAAACATATCGAGGAACTTTGTGGGATCAAAAACACCAACTCAGGTTGCAAGTCATGCC 

CAAAAGTACTACCAAAGACAGCTTTCCGGTGCGAAAGACAAACGACGGCCTAGCATTCAC 

GACATCACCACCGTCAATCTTCTCAATGCCAATCTTAGCCGTCCATCGTCTGATCACGGT 

TGCTTAGTCTCAAAACAGGCCGAGCCGAAACTAGGGTTCACCGACAGGGATAATGCAGAG 

GAGGGAGTTATGTTTCTTGGTCAGAATCTATCCTCGGTCTTCTCTTCCTACGATCCTGCC 

ATTAAGTTTTCCGGAGCAAATGTTTACGGTGAAGGAGGTTACTGTATCTCACAAGATCTT 

GAAACGAGAAAATGAGAATTTTGAAATTTTAACTATTGCAACGAAACCATAATTGC 

>G1634 Amino Acid Sequence (domain in AA coordinates: 129-180)

METLHPLLSHVPTSDHRFWQEMMCLQSSSWTKEE 

IPGKTISDVMRQYSKLEEDLFDIEAGLVPIPGYRSVTPCGFDQVVSPRDFDAYRKLPNGA
RGFDQDRRKGVPWTEEEHRRFLLGLLKYGKGDWRNISRNFVGSKTPTQVASHAQKYYQRQ
LSGAKDKRRPSIHDITTVNLLNANIiSRPSSDHG^ 
QNLSSVFSSYDPAIKFSGANVYGEGGYCISQDLETRK*
>G1637 (1..954) 

ATGGTGAAGGAGACGGTGACGGTGGCGAAAACGTGCTCACACTGTGGCCATAATGGCCAT 
AACGCACGGACTTGTCTCAACGGCGTTAATAAGGCAAGTGTTAAACTGTTCGGCGTTAAT 
ATATCGTCTGATCGGATTAGGCCGCCTGAGGTAACGGCGTTAAGGAAGAGTCTTAGTTTG 
GGAAACCTTGATGCTCTTCTCGCTAACGATGAAAGTAACGGTAGCGGTGATCCTATCGCC 
GCCGTTGATGATACCGGTTATCATTCCGATGGTCAGATTCATTCCAAGAAGGGTAAAACT 
GCTCATGAGAAGAAAAAGGGGAAGCCATGGACGGAAGAAGAACATCGTAATTTCTTAATC 
GGTTT7VAAGAAACTCGGAAAAGGAGATTGGAGAGGCATTGCAAAGAG 

AGAACACCAACACAAGTCGCAAGT(^TGCTCAGAAATATTTTATTAGGTTAAACGTTAAC 
GACAAGAGAAAAAGACGTGCTAGTCTCTTTGACATCTCTCTCGAAGATCAGAAGGAGAAA 
GAGAGGAACTCTCAAGATGCTTCAACAAAGACTCCACCTAAACAACCAATAACCGGAATT 
CAACAACCGGTAGTACAAGGTCATACTCAAACCGAGATTTCGAACAGGTTTC^GAATTTA 
TCAATGGAGTATATGCCAATCTACCAACCCATACCACCTTACTACAACTTTCCACCTATT 
ATGTACCATCCAAATTATCCAATGTACTATGCCAACCCTCAAGTACCGGTTAGGTTTGTT

CATCCTTCTGGTATACCTGTTCCAAGACATATACCGATTGGTTTGCCTCTGTCTCAACCG 

AGTGAAGCTTCTAATATGACAAATAAAGACGGTTTGGATCTTCATATCGGTTTGCCTCCA 

CAAGCTACTGGAGCTTCTGACTTGACTGGTCATGGCGTTATTCATGTGAAATGA 

>G1637 Amino Acid Sequence (domain in AA coordinates: 109-173)

MVKETVTVAKTCSHCGHNGHNARTCLNGVNKASVKLFGVNISSDPIRPPEVTALRKSLSL

GNLDALLANDESNGSGDPIAAVDDTGYHSDGQIHSKKGKTAHEKKKGKPWTEEEHRNFLI 

GLNKLGKGDWRGIAKSFVSTRTPTQVASHAQKYFIRLNVNDKRKRRASLFDISLEDQKEK

ERNSQDASTKTPPKQPITGIQQPWQGHTQTEISNRFQNLSMEYMPIYQPIPPYYNPPPI 

MYHPNYPMYYANPQVPVRFVHPSGIPVPRHIPIGLPLSQPSEASNMTNKDGLDLHIGLPP 

QATGASDLTGHGVIHVK* 

>G1818 (601..1161)

TAACAAATCAAATAATTAGAGAAATAACCAAAATTTAACTTTTAGAGGGACTACAGGATT 
TGTACTTTGTACATTCATATATTATTGTTATATATCGTTTCATACATTAATTTGAACCAA 
TGTAAATTAAGTAAAATTCAATTTAACATCATGAGCAAATTCTTATTAAAATTCTCTTAA 
AATTTTGAGCAAATTATGCTTTCACATTTAACATTTGAAAACATCATTTTTAACAAGATA 
TTCAAAACTAAGTTTTGTACAGCAAAATTTTAACTTTCAATTTTATAGAGAAAAAGGTAT 
TTTTTTTTTTGTTTCATTTTTATAAGACTATTATTTGGTATATAATATACACTTTAAGTA 
AAAACAAATCTCTTTCTTTTTTCTTCTTATAATACCAACCACAAGTCTGTCAGTCACACA 
CATACAGTTAAT7VACATTAAATATTCTTAACAAACTACTAAATAGGTTGAGATTCATATA 
TGTAAAGAGATCACTTCTTAATCTTATCCTACCATATCTTATATACGCTTAATTTTCCTT
TATATATGCAAACCTCCACATAAAAATATCTCAAACCCAAACACTTCAAACAAAAAAAAA 
ATGGAGAACAACAACAACAACCACCAACAGCCACCGAAAGATAACGAGCAACTAAAGAGT 
TTCTGGTCAAAGGGGATGGAAGGTGACTTGAATGTCAAGAATCACGAGTTCCCCATCTCT 
CGTATCAAGAGGATAATGAAGTTTGATCCGGATGTGAGTATGATCGCTGCTGAGGCTCCA 
AATCTCTTATCTAAGGCTTGTGAAATGTTTGTCATGGACCTCACGATGCGTTCATGGCTC 
CATGCTCAAGAGAGCAACCGACTCACGATACGGAAATCTGATGTTGATGCCGTAGTGTCT 
CAAACCGTCATCTTTGATTTCTTGCGTGATGATGTCCCTAAGGACGAGGGAGAGCCCGTT 
GTCGCCGCTGCTGATCCTGTGGACGATGTTGCTGATCATGTGGCTGTGCCAGATCTTAAC 
AATGAAGAACTGCCGCCGGGAACGGTGATAGGAACTCCGGTTTGTTACGGTTTAGGAATA 
CACGCGCCACACCCGCAGATGCCTGGAGCTTGGACCGAGGAGGATGCGACTGGGGCAAAT 
GGAGGAAACGGTGGGAATTAATATTTGGATTGGGTTTTGTAACCGCTGTTGTGAGAACTT 
GAATTTCTTTTTGAGTTCTGCTTATGTTTTCAATGTTATGTTTTTTAGTTGTTGAATGTA 
TTTCTGTTGTTTTGTCCAAAAAAAAAAAAGAATGTATTTCTGTTGTTGTCTTTCAAATGA 
ATCTAATGGTTTATGAATATTGGCTTTAGATTAATTTATGCATACAAAAACACAAGGATT 
ACGGATAAAAAAGTCCTCAGTTTACCCATGGAAACATAATCTTCTAGTGATTCCTTATGA 
GAGTAGAAAAGAATCATATATTATAATCTATTTCATJUIGAGATAGGGTACTGTAAACAAG 
GATGTTTATTCGGCTATTTCTTTTTTTTTTAAT 

GTTTGCAGCTTTTTGTTAGATTACATTCTAGAGGCAACAAGATCCAGAGATCTAGCAAAA 
AAAACTTATTTTGAAACCTGAATCTATTTTAAAAATTTTCCAACTCATTTTTCGTTCTTA 
TTCTTTGTTTTCCAACGGAATTTGGCGCACAAACGATTO^ 

>G1818 Amino Acid Sequence (domain in AA coordinates: 36-113) 
MENNNNNHQQPPKDNEQLKSFWSKGM 

NLLSKACEMFVMDLTMRSWLHAQESNRLTII^DVDAWSQTVIFDFLRDDVPKDEGEPV 
VAAADPVDDVADHVAVPDLNNEELPPGTVIGTPVCYGLGIHAPHPQMPGAWTEEDATGAN 
GGNGGN* 

>G1820 (1..609) 

ATGGCTGAGAACAAGAACAACAACGGCGACAACATGAACAACGACAACCACCAGCAACCA 
CCGTCGTACTCGCAGCTGCCGCCGATGGCATCATCCAACCCTCAGTTACGTAATTACTGG 
ATTGAGCAGATGGAAACCGTCTCGGATTTCAAAAACCGTCAGCTTCCATTGGCTCGAATT 
AAGAAGATCATGT^AGGCTGATCCAGATGTGCACATGGTCTCCGCAGAGGCTCCGATCATC 
TTCGCAAAGGCTTGCGAAATGTTCATCGTTGATCTCACGATGCGGTCGTGGCTCAAAGCC 
GAGGAGAACAAACGCCACACGCTTCAGAAATCGGATATCTCCAACGCAGTGGCTAGCTCT 
TTCACCTACGATTTCCTTCTTGATGTTGTCCCTAAGGACGAGTCTATCGCCACCGCTGAT 
CCTGGCTTTGTGGCTATGCCACATCCTGACGGTGGAGGAGTACCGCAATATTATTATCCA 
CCGGGAGTGGTGATGGGAACTCCTATGGTTGGTAGTGGAATGTACGCGCCATCGCAGGCG 
TGGCCAGCAGCGGCTGGTGACGGGGAGGATGATGCTGAGGATAATGGAGGAAACGGCGGC 
GGAAATTGA

>G1820 Amino Acid Sequence (domain in AA coordinates: 70-133) 
MAENNl^GDNMNNDNHQQPPSYSQLPPMASSNPQLRNYWIEQMETVSDFKNRQLPLARI 
KKIMKADPDVHMVSAEAPIIFAKACEMFIVDLTMRSWLKAEENKRHTLQKSDISNAVASS

FTYDFLLDWPKDESIATADPGFVAMPHPDGGGVPQYYYPPGWMGTPMVGSGMYAPSQA 
WPAAAGDGEDDAEDNGGNGGGN* 

>G1903 (1..1200) 

ATGTCTAAATCTAGAGATACGGAGATAAAGTTGTTTGGGAGGACAATCACATCTCTTTTA 
GATGTGAATTGTTATGATCCGTCGTCGTTGTCCCCTGTTCACGATGTTTCTTCTGATCCA 
AGCAAGGAGGATTCGTCTTCTTCTTCATCTTCTTGTTCTCCAACTATTGGACCAATCAGG 
GTTCCGGTTA7^AAAAAGTGAGCAAGAGAGT7VACAAATTCAAAGATCCATATATATTATCC 
GATCTAAACGAACCACCAAAAGCAGTATCTGAGATTTCATCACCAAGAAGTTCCAAGAAC 
AACTGTGATCAACAGAGCGAGATCACAACAACAACTACCACAAGTACTACATCAGGAGAG 
AAATCAACGGCTCTCAAGAAACCGGACAAGCTTATTCCATGTCCTAGATGTGAAAGCGCA 
AACACCAAATTCTGTTATTACAACAACTACAACGTGAACCAGCCACGTTACTTCTGCAGG 
AACTGTCAGAGGTATTGGACAGCTGGTGGATCTATGAGGAACGTTCCTGTTGGCTCAGGT 
CGTCGCAAGAACAAAGGATGGCCTTCTTCAAACCATTACTTGCAAGTCACTTCTGAGGAT 
TGTGATAATAATAACTCGGGGACGATCCTTAGTTTCGGTTCTTCGGAGTCTTCGGTTACA
GAGACTGGTAAGCATCAGTCAGGTGATACAGCAAAGATAAGTGCTGATTCAGTTTCTCAA 
GA7^AATAAAAGCTACCAAGGGTTTCTTCCTCCGCAAGTAATGTTACCTAATAATTCTTCT 
CCTTGGCCTTACCAATGGAGTCCAACGGGTCCTAACGCTAGTTTCTACCCTGTCCCCTTC 
TACTGGGGATGCACGGTTCCGATATACCCTACCTCAGAGACTTCATCATGTTTAGGAAAA 
CGGTCAAGAGATCAAACTGAAGGAAGAATCAATGATACTAATACAACAATAACTACTACA 
AGAGCAAGATTGGTCTCAGAATCTCTTAGAATGAATATCGAAGCTAGTAAGAGCGCTGTG 
TGGTCTAAGTTACCGACAAAACCCGAGAAAAAAACGCAAGGATTCAGTTTGTTCAATGGA 
TTTGACACAAAGGGAAACAGCAACAGAAGTAGCTTGGTCTCCGAAACTTCTCACAGTCTA 
CAAGCAAACCCTGCAGCGATGTCTAGAGCTATGAACTTCAGGGAGAGCATGCAACAATAA 
>G1903 Amino Acid Sequence (domain in AA coordinates: 134-180) 

MSKSRDTEIKLFGRTITSLLDVNCYDPSSLSPVHDVSSDPSKEDSSSSSSSCSPTIGPIR 

VPVKKSEQESNKFKDPYILSDLNEPPKAVSEISSPRSS 

KSTALKKPDKLIPCPRCESANTKFCYYNNYNVNQPRYFCRNC 

RR1CNKGWPSSNHYLQVTSEDCDNNNSGTILSFGSSESSVTETGKHQSGDTAKISADSVSQ 

ENKSYQGFLPPQVMLPNNSSPWPYQWSPTGPNASFYPVPFYWGCTVPIYPTSETSSCLGK 

RSRDQTEGRINDTNTTITTTRARLVSESLRMNIEASKSAVWSKLPTKPEKKTQGFSLFNG 

FDTKGNSNRSSLVSETSHSLQANPAAMSRAMNFRESMQQ* 

>G371 (1..582) 

ATGGAGATTGAGAAGGATGAGGACGACACAACATTGGTTGATTCTGGAGGAGACTTCGAC 
TGCAACATATGTTTGGATCAGGTTCGAGACCCGGTCGTGACTTTATGTGGCCACCTGTTT 
TGTTGGCCCTGCATTGACAAGTGGACTTATGCGTCCAACAATTCAAGACAACGAGTCGAT 
CAATACGATCATAAGAGGGAACCACCAAAATGTCCGGTATGCAAATCTGATGTCTCCGAG 
GCTACGCTTGTCCCGATCTACGGACGAGGACAGAAAGCTCCCCAGTCCGGTTCAAATGTA 
CCGAGC^GACCAACTOTTCCGGTTTATGACT^^ 

GGGGAGAGTCAACGTTACATGTATAGAATGCCTGATCCGGTGATGGGTGTGGTATGCGAA 

ATGGTATACCGGAGACTATTTGGAGAGTCTTCGAGCAACATGGCACCTTACCGCGATATG 

AATGTCCGGTCTAGGCGACGGGCAATGCAGGCTGAGGAGTCATTAAGCAGAGTCTACTTG 

TTTCTACTTTGCTTCATGTTTATGTGTCTATTTCTCTTCTAA 

>G371 Amino Acid Sequence (domain in aa coordinates: 21-74) 

MEIEKDEDDTTLVDSGGDFDCNICLDQ 

QYDHKREPPKCPVCKSDVSEATLVPIYGRGQKAPQSGSNVPSRPTGPVYDLRGVGQRLGE 

GESQRYMYRMPDPVMGWCEMVYRRLFGESSSNI^YRDMNVRSRRRAMQAEE 
FLLCFMFMCLFLF* 

>G597 (255..1310)

AAAATTCTCCTGTAAAATTTAATATTATAAAAGTGGTTT 

AATTTTCATCTTTAATCTTAAATTCTGGTAACCTTAATGCGCGATCCGCTTTO 

TTTGTGAGAGAGAAGAGATCTAAAAAAATCCACAATTTTGrrCAAATCTTGGAGTTAAAT 

GCTGAATTTTAGGCCTTGTTGCTTAGATTTATGGCTTAAAGTTTCAAA 

TATGTGAGAAGAAAATGTC7VGGATCTGAGACGGGTTTAATGGCGGCGACCAGAGAATCAA 
TGCAATTTACAATGGCTCTCCACCAGCAGCAGCAACACAGTCAAGCTCAACCTCAGCAGT
CTCAGAACAGGCCATTGTCATTCGGTGGAGACGACGGAACTGCTCTTTACAAGCAGCCGA 
TGAGATCAGTATCACCACCGCAGCAGTACCAACCCAACTCAGCTGGTGAGAATTCTGTCT 
TGAACATGAACTTGCCCGGAGGTGAGTCTGGAGGCATGACTGGAACTGGAAGTGAGCCAG 
TGAAAAAGAGGAGAGGTAGACCGAGGAAATATGGGCCTGATAGTGGTGAAATGTCACTTG 
GTTTGAATCCTGGAGCTCCTTCTTTCACTGTCAGCCAACCTAGTAGCGGCGGCGATGGAG 
GAGAGAAGAAGAGAGGAAGACCTCCTGGTTCTTCTAGCAAAAGGCTCAAGCTTCAAGCTT 
TAGGCTCGACTGGAATCGGATTTACGCCTCATGTACTTACCGTGCTGGCTGGAGAGGATG 
TATCATCCAAGATAATGGCGTTAACTCATT^ATGGACCCCGTGCTGTGTGTGTCTTGTCTG 
CAAATGGAGCCATCTCCAATGTGACTCTCCGCCAGTCTGCCACATCCGGTGGAACTGTTA 
CATATGAGGGGAGATTTGAGATTCTGTCTTTATCGGGATCTTTCCATTTGCTGGAGT^ACA 
ATGGTCAAAGAAGGAGGACGGGAGGTCTAAGCGTGTCATTATCAAGTCCGGATGGTAATG 
TCCTCGGTGGCAGTGTAGCTGGTCTTCTTATAGCAGCATCACCTGTTCAGATTGTTGTTG 
GGAGTTTCTTACCAGACGGAGAAAAAGAACCAAAACAGCATGTGGGACAAATGGGACTGT 
CGTCACCCGTATTACCGCGTGTGGCCCCAACGCAGGTGCTGATGACTCCAAGTAGCCCAC 
AATCTCGAGGCACAATGAGTGAGTCATCTTGTGGAGGAGGACATGGAAGCCCTATTCATC 
AGAGCACTGGAGGACCTTACAATAACACCATTAACATGCCCTGGAAGTAGCCAAGTGATC 
TGTGTCGGCTTAAAACCAACAACTTCCCGTTATTAGAGTGATTTATTTCTACATTTGGTT 
TAGACTTTCTAGTTCTGATGGTTATTTCTACAGTTGGTTTAGACTTTCTAGTTCTGTTCA 
GACAAAAGGAGTTTGATAAATTGACCGACCTATTTTGTGTGTTTGAGGTACTTTCAGAAC 
CATAGGTGTTCAGAAATTAGAATGTTCTGTTTAAAAAA 

>G597 Amino Acid Sequence (domain in AA coordinates: 97-104, 137-144)

MSGSETGLMAATRESMQFTMALHQQQQHSQAQPQQSQNRPLSFGGDDGTALYKQPMRSVS 

PPQQYQPNSAGENSVLNMNLPGGESGGMTGTGSEPVKKRRGRPRKYGPDSGEMSLGLNPG 

APSFTVSQPSSGGDGGEKKRGRPPGSSSKRLKLQALGSTGIGFTPHVLTVLAGEDVSSKI 

MALTHNGPRAVCVLSANGAISNVTLRQSATSGGTVTYEGRFEILSLSGSFHLLENNGQRS 

RTGGLSVSLSSPDGNVLGGSVAGLLIAASPVQIWGSFLPDGEKEPKQHVGQMGLSSPVL 

PRVAPTQVLMTPSSPQSRGTMSESSCGGGHGSPIHQSTGGPYNNTINMPWK* 

>G1009 (28..1704)

AAAAAAAAAAAAAACCTATTCCCAAAGATGAAGAACAATAACAACAAATCTTCTTCTTCT 
TCTAGCTATGATTCTTCTTTGTCTCCTTCTTCTTCATCCTCCTCCCACCAGAACTGGCTC 
TCTTTCTCTCTCTCCAACAATAACAACAACTTCAATTCTTCCTCAAACCCTAATCTCACT 
TCCTCCACATCAGATCATCATCATCCTCACCCTTCTCACCTCTCTCTCTTTCAAGCTTTC 
TCCACTTCTCCAGTCGAACGGCAAGATGGGTCACCGGGAGTTTCACCCAGCGATGCCACG 
GCGGTTCTTTCCGTATACCCCGGCGGTCCTAAACTTGAGAACTTCCTCGGCGGAGGAGCC 
TCAACGACGACAACAAGACCAATGCAACAAGTGCAATCTCTTGGCGGCGTTGTCTTCTCT 
TCCGACCTACAGCCACCGCTTCATCCTCCGTCCGCCGCCGAGATCTACGACTCTGAGCTC 
AAGTCAATAGCCGCTAGCTTCCTAGGAAACTACTCCGGTGGACACTCGTCGGAGGTCTCT 
AGCGTACATAAACAACAACCGAATCCTCTAGCTGTCTCAGAGGCTTCGCCTACTCCGAAG 
AAGAACGTAGAGAGTTTTGGACAACGTACCTCGATTTATAGAGGAGTCACAAGACATAGA 
TGGACTGGAAGATACGAAGCTCATCTATGGGATAATAGTTGCCGAAGAGAAGGCCAAAGC 
AGAAAAGGAAGACAAGTTTATTTAGGTGGTTATGATAAGGAAGATAAAGCAGCTAGAGCT 
TACGACCTTGCAGCTCTTAAGTATTGGGGTCCTACAACTACGACTAATTTCCCGATATCA 
AATTACGAATCTGAACTTGAAGAAATGAAACACATGACTCGACAAGAGTTCGTTGCTTCT 
TTAAGACGGAAAAGCAGTGGATTCTCTAGGGGTGCCTCCATGTACAGAGGCGTCACTAGA 
CATCATCAGCATGGTCGATGGCAGGCACGAATTGGAAGAGTTGCAGGCAACAAAGACCTT 
TATCTTGGCACATTTAGCACTCAAGAGGAAGCTGCAGAAGCTTATGATATAGCAGCGATC 
AAATTCCGCGGTCTAAATGCAGTCACCAATTTCGACATCAGTCGATATGATGTCAAATCA 
ATTGCTAGCTGTAATCTCCCTGTGGGTGGACTAATGCCTAAACCTTCTCCAGCAACCGCA 
GCGGCTGACAAAACCGTTGATCTTTCTCCATCCGACTCTCCATCTCTAACCACACCGTCC 
CTCACGTTCAATGTGGCAAC^CCGGTCAATGACCATGGAGGAACTTTTTACCACACTGGT 
ATACCAATCAAACCAGACCCGGCTGATCATTATTGGTCCAACATCTTTGGATTCCAGGCA 
AACCCGAAAGCAGAAATGCGACCATTAGCAAACTTTGGGTCGGATCTTCATAAGCCTTCT 
CCTGGTTATGCTATAATGCCGGTAATGCAGG7^GGTGAAAACAACTTTGGTGGTAGTTTT 
GTTGGGTCTGATGGGTATAACAATCATTCCGCTGCATCGAACCCGGTCTCAGCAATTCCG 
CTGTCCTCGACAACTACAATGAGTAACGGTAACGAAGGGTATGGTGGAAACATAAACTGG 
ATTAATAACAACATTTCAAGTTCTTACCAAACTGCAAAATCAAATCTCTCTGTTTTGCAC 
ACACCGGTTTTTGGGTTGGAATGAGTATTCACATCTTAGTGAGAACTAAAATAAATATGT
AGGAAAAAAATAAGGCTCTGTTTGAAGAAATCAGATATTTTCTTCTTAGATTATTTAAGT 
AGTTTAAAAAAAATATTTTTTAAGTGTTTCACTTTTACGTTTGTCTGCTGACCACGAATT 
TTGCTGGATCTGACAGTACTAACTCTTTGTTTAATGACCTTATGGGTTCCTTTTTTACTT 

ATGAATGATTGAAGATGGAAACTGCTTGCATGTGAATAAACGAAAATCAAACNATCTTCG 
GTAACTTAAAAA 

>G1009 Amino Acid Sequence (domain in aa coordinates: 201-277, 303-371)

MKNNNNKSSSSSSYDSSLSPSSSSSSHQNWLSFSLSNNNNNFNSSSNPNLTSSTSDHHHP

HPSHLSLFQAFSTSPVERQDGSPGVSPSDATAVLSVYPGGPKLENFLGGGASTTTTRPMQ

QVQSLGGVVFSSDLQPPLHPPSAAEIYDSELKSIAASFLGNYSGGHSSEVSSVHKQQPNP

LAVSEASPTPKKNVESFGQRTSIYRGVTRHRWTGRYEAHLWDNSCRREGQSRKGRQVYLG

GYDKEDKAARAYDLAALKYWGPTTTTNFPISNYESELEEMKHMTRQEFVASLRRKSSGFS

RGASMYRGVTRHHQHGRWQARIGRVAGNKDLYLGTFSTQEEAAEAYDIAAIKFRGLNAVT

NFDISRYDVKSIASCNLPVGGLMPKPSPATAAADKTVDLSPSDSPSLTTPSLTFNVATPV

NDHGGTFYHTGIPIKPDPADHYWSNIFGFQANPKAEMRPLANFGSDLHNPSPGYAIMPVM

QEGENNFGGSFVGSDGYNNHSAASNPVSAIPLSSTTTMSNGNEGYGGNINWINNNISSSY
QTAKSNLSVLHTPVFGLE*
>G170 (1..1107) 

ATGGGGATGAAGAAGGTGAAGCTATCTTTGATAGCTAATGAAAGATCAAGGAAAACATCC
TTCATAAAGAGGAAAGACGGGATTTTTAAGAAACTCCACGAGTTGTCAACTCTGTGTGGT 
GTCCAAGCTTGTGCTCTCATCTACAGTCCATTCATACCGGTTCCAGAGTCATGGCCGTCA 
AGGGAAGGTGCTAAAAAGGTGGCTTCAAGGTTTCTGGAGATGCCGCCGACAGCCCGAACC 
AAGAAGATGATGGATCAAGAGACTTACCTTATGGAGAGGATTACCAAAGCAAAAGAGCAA 
CTAAAGAACCTGGCTGCTGAGAACCGAGAGTTACAGGTTAGACGATTTATGTTTGATTGT 
GTTGAAGGCAAAATGTCCCAGTATCATTATGATGCAAAAGACCTTCAAGATTTGCAATCT 
TGTATAAATCTATATCTCGATC^GCTTAACGGAAGGATCGAGTCCATTAAAGAAAATGGT 
GAGTCGTTGTTGTCTTCCGTCTCTCCTTTTCCTACTAGAATTGGTGTTGACGAAATTGGT 
GATGAGTCATTTTCCGACTCTCCTATTCATGCTACAACTGGGGTTGTAGATACTCTTAAT 
GCTACCAATCCTCATGTTCTTACGGGCGATATGACTCCTTTTCTTGATGCGGACGCAACT 
GCGGTAACTGCTTCCAGTAGATTTTTTGATCATATTCCATATGAAAATATGAATATGAGT 
CAAAATCTGCATGAACCGTTTCAACACCTTGTTCCTACTAACGTTTGTGATTTTTTTCAA 
AATCAGAATATGAATCAGGTTCAATACCAGGCTCCTAATAATCTGTTTAATCAGATTCAA 
CGAGAATTCTACAACATAAATTTGAATCTGAATTTGAATCTGAATTCGAATCAGTATCTG 
AATCAACAACAATCATTCATGAATCCGATGGTGGAACAACATATGAATCATGTTGGAGGG 
CGTGAAAGCATTCCTTTCGTGGACGGAAACTGCTACAACTACCATCAACTACCATCCAAT 
CAACTACCAGCCGTTGATCATGCTTCCACCAGTTACATGCCTTCCACCACCGGTGTCTAT 
GATCCTTACATCAACAATAATCTCTAA 

>G170 Amino Acid Sequence (domain in aa coordinates: 2-57) 

MGMKKVKLSLIANERSRKTSFIKRKDGIFKKLHELSTLCGVQACALIYSPFIPVPESWPS

REGAKKVASRFLEMPPTARTKKMMDQETYLMERITKAKEQLKNLAAENRELQVRRFMFDC

VEGKMSQYHYDAKDLQDLQSCINLYLDQLNGRIESIKENGESLLSSVSPFPTRIGVDEIG
DESFSDSPIHATTGVVDTLNATNPHVLTGDMTPFLDADATAVTASSRFFDHIPYENMNMS
QNLHEPFQHLVPTNVCTFFQNQNMNQVQYQAPNNLFNQIQREFYNINLNLNLNLNSNQYL

NQQQSFMNPMVEQHMNHVGGRESIPFVDGNCYNYHQLPSNQLPAVDHASTSYMPSTTGVY

DPYINNNL*

>G1768 (185..1426)

CTTCCTTTTGCTTCAGCTGCGA^^ 

TATCGACTTCCACCGAAAGATCACTTCTTAACCTACACAAGGTGTTTGTTATGAAGATCA 
GATAAATAAAAGGTCATTTGAGGATAATGGTTGATGTC 

TGTGATGGACAATGTAAGAGGTTC^TAATGTTGC^GCC^CTGCCAGAGATAGCTGAGAG 
TATCGATGATGCTATCTGCCATGAACTCTCCATGTGGCCTGATGATGCTAAAGATTTGTT 
ATTGATAGTGGAGGCAATATCAAGGGGAGACTTGAAGTTGGTACTTGTTGCTTGTGCAAA 
AGCTGTTTCTGAGAATAATCTTCTAATGGCACGATGGTGTATGGGTGAGTTGCGCGGTAT 
GGTTTCGATTTCTGGTGAGCCAATCCAGAGATTGGGAGCTTATATGTTAGAAGGGCTTGT 
TGCTAGGCTTGCTGCTTCTGGTAGTTCGATATATAAGTCTCTCCAGTCCAGAGAACCAGA 
GAGTTATGAATTTTTATCTTATGTGTATGTTCTGCATGAGGTTTGTCCATATTTCAAGTT 






TGGATACATGTCAGCGAATGGTGCGATTGCAGT^AGCAATGAAGGATGAAGAGAGGATTCA 
CATTATTGACTTCCAAATTGGACAAGGGAGCCAGTGGATAGCACTTATCCAGGCTTTTGC 
AGCTAGGCCTGGTGGGGCTCCAAATATTCGAATTACCGGAGTTGGTGATGGATCTGTCTT 
GGTTACAGTCAAGAAGAGACTAGAGAAACTTGCAAAGAAGTTTGATGTTCCATTCAGGTT 
CAATGCGGTTTCAAGGCCAAGTTGTGMGTTGAAGTGGAAAATCTTGATGTCCGAGATGG 
CGAAGCCCTTGGAGTGAACTTTGCTTACATGCTGCATCATTTGCCAGATGAGAGTGTAAG 
CATGGAAAACCACAGGGACCGGTTGCTGAGGATGGTGAAGAGTCTATCACCTAAAGTAGT 
CACTCTTGTGGAAC7VAGAATGCAACACGAACACTTCCCCTTTCCTTCCTAGGTTCCTTGA 
GACATTAAGTTATTACACGGCAATGTTCGAATCTATCGATGTTATGCTTCCGAGAAATCA 
CAAGGAAAGGATCAATATCGAGCAGCACTGCATGGCAAGGGATGTCGTCAACATCATAGC 
TTGTGAAGGAGCCGAGAGGATCGAAAGACACGAGCTTCTCGGGAAATGGAAGTCAAGGTT 
TTCCATGGCGGGTTTTGAGCCATACCCCTTGAGCTCAATCATTTCAGCCACCATTAGAGC 
CCTCTTGAGAGATTACAGCAACGGGTATGCGATTG7VAGAAAGAGATGGTGCTCTGTACCT 
TGGTTGGATGGACCGAATCTTGGTCTCATCTTGTGCATGGAAGTGAAAGAATAAACGTCT 
CCT^AGAATGTAATGCAAAAGACAGAACTGGAAGTAATAGATAGTTTTGTCTCATAACCAT 
TAATAAGGTTGAATCAAATCATATACATCCCCATGCTACAACTATTACACAGGCTCCATC 
AACAAAGAAGGGCTCTTGTTGTGTTACCTTCTCTTCCTGTAACTCTTATTTGAACCAAAT 
GGAAGTGGTTACAT 

>G1768 Amino Acid Sequence (domain in AA coordinates: 54-413) 
MDNVRGSIMLQPLPEIAESIDDAICHELSMWPDDAKDLLLIVEAISRGDLKLVLVACAKA 
VSENNLLMARWCMGELRGMVSISGEPIQRLGAYMLEGLVARLAASGSSIYKSLQSREPES
YEFLSYVYVLHEVCPYFKFGYMSANGAIAEAMKDEERIHIIDFQIGQGSQWIALIQAFAA
RPGGAPNIRITGVGDGSVLVTVKKRLEKLAKKFDVPFRFNAVSRPSCEVEVENLDVRDGE
ALGVNFAYMLHHLPDESVSMENHRDRLLRMVKSLSPKVVTLVEQECNTNTSPFLPRFLET
LSYYTAMFESIDVMLPRNHKERINIEQHCMARDVVNIIACEGAERIERHELLGKWKSRFS
MAGFEPYPLSSIISATIRALLRDYSNGYAIEERDGALYLGWMDRILVSSCAWK* 
>G185 (77..988)

ATGCAAAAATAAACATAGTAACAATACTTTAAACTATTTACACCACTTTAATCTTATTCT 
CCACTCTTTGAACGTAATGGAGAAGAACCATAGTAGTGGAGAGTGGGAGAAGATGAAGAA
CGAGATCAACGAGCTAATGATAGAAGGAAGAGACTATGCACACCAGTTTGGATCAGCTTC 
ATCTCAAGAAACACGTGAACATTTAGCCAAAAAGATTCTTCAATCTTACCACAAGTCTCT 
CACCATCATGAACTACTCCGGCGAACTTGACCAAGTTTCTCAGGGTGGAGGAAGCCCCAA 
GAGCGATGATTCCGATCAAGAACCACTTGTCATCAAGAGTTCGAAGAAGTCAATGCCAAG 
GTGGAGTTCAAAAGTCAGAATTGCCCCTGGAGCTGGTGTTGATAGAACGCTGGACGATGG 
ATTCAGTTGGAGAAAGTACGGCCAGAAGGATATTCTCGGAGCCAAATTTCCAAGAGGATA 
CTATAGATGCACGTATAGAAAGTCTCAAGGATGTGAAGCCACTAAACAAGTCCAAAGATC 
TGATGAAAATCAGATGCTCCTTGAGATCAGTTACCGAGGAATACATTCTTGCTCTCAAGC 
TGCAAATGTCGGTACAACAATGCCGATACAAAACCTCGAACCG^ 

CGGAAATCTTGACATGGTAAAGGAAAGTGTAGACAACTACAATCACCAAGCACATTTGCA 
TCACAACCTTCACTATCCATTGTCATCTACCC(^UVATCTAGAGAATAACAATGCCTATAT 
GCTTCAAATGCGAGATCAAAACATCGAATATTTTGGATCTACGAGCTTCTCTAGTGATCT 
AGGAACTAGTATCAACTACAATTTTCCAGCATCTGGCTCGGCTTCTCACTCAGCATCAAA 
CTCTCCGTCCACCGTCCCTTTGGAATCCCCGTTTGAAAGCTATGATCCAAATCATCCATA 
TGGAGGATTTGGTGGGTTCTATTCTTAGTTATCTAC^^ 

CATGACCTCTTGATTAAAGAGAGAGTTTTCATAATAGCTAATCAATTTCCTATTCAAATA
TCCGAGTTTTTTTTCTAATCATGTTTATC^^ 

GTCTATGTTGAAATAAATGGATTTGTACTCGTAGGTATGATCCTTGTTATCTAAAAAAAA 
AAAAA

>G185 Amino Acid Sequence (domain in AA coordinates: 113-172) 
MEKNHSSGEWEKMKNEINELMIEGRDYAHQFGSASSQETREHLAKKILQSYHKSLTIMNY

SGELDQVSQGGGSPKSDDSDQEPLVIKSSKKSMPRWSSKVRIAPGAGVDRTLDDGFSWRK
YGQKDILGAKFPRGYYRCTYRKSQGCEATKQVQRSDENQMLLEISYRGIHSCSQAANVGT
TMPIQNLEPNQTQEHGNLDMVKESVDNYNHQAHLHHNLHYPLSSTPNLENNNAYMLQMRD
QNIEYFGSTSFSSDLGTSINYNFPASGSASHSASNSPSTVPLESPFESYDPNHPYGGFGG 
FYS* 

>G1931 (5..592)

ATCAATGGAAGGGGTTGACAACACAAATCCTATGTTAACCCTAGAAGAAGGCGAAAACAA 
CAATCCTTTTTCTTCCTTAGATGACAAAACATTAATGATGATGGCTCCTTCGTTAATCTT

TTCGGGCGATGTAGGTCCATCTTCTTCTTCTTGTACTCCAGCAGGTTATCATCTATCTGC 

TCAGCTGGAGAACTTTCGAGGAGGTGGAGGAGAGATGGGAGGATTAGTGAGTAATAATAG 

CAATAATAGTGATCATAATAAGAATTGCAACAAAGGAAAAGGGAAGAGAACTTTGGCAAT 

GCAGAGGATAGCTTTTCATACAAGGAGTGATGATGATGTTCTTGATGATGGTTATCGTTG 

GCGAAAGTACGGTCAGATVATCTGTCAAGAACAATGCTCATCCCAGGAGCTATTATAGATG 

TACATACCACACATGCAACGTGAAGAAACAAGTGCAAAGACTGGCAAAAGATCCAAACGT 

TGTCGTAACAACCTACGAAGGTGTTCATAATCATCCTTGTGAGAAGCTCATGGAGACTCT 

TAGCCCTCTCCTTAGGCAACTTCAGTTCCTCTCAAGAGTTTCTGATCTGTAATTATTGAA 

TGTTAATTAGTGGTGTAATACATTAATTATGCTTTAATCTCTCCATTGACCCTCAATC 

>G1931 Amino Acid Sequence (domain in AA coordinates: 114-170) 

MEGVDNTNPMLTLEEGENNNPFSSLDDKTLMMMAPSLIFSGDVGPSSSSCTPAGYHLSAQ 

LENFRGGGGEMGGLVSNNSNNSDHNKNCNKGKGKRTLAMQRIAFHTRSDDDVLDDGYRWR

KYGQKSVKNNAHPRSYYRCTYHTCNVKKQVQRLAKDPNVVVTTYEGVHNHPCEKLMETLS

PLLRQLQFLSRVSDL*

>G2543 (1..2169) 

ATGAGTTTCGTCGTCGGCGTCGGCGGAAGTGGTAGTGGAAGCGGCGGAGACGGTGGTGGT 

AGTCATCATCACGACGGCTCTGAAACTGATAGGAAGAAGAAACGTTACCATCGTCACACC

GCTCAACAGATTCAACGCCTTGAATCGAGTTTCAAGGAGTGTCCTCATCCAGATGAGAAA 

CAGAGGAACCAGCTTAGCAGAG7UVTTGGGTTTGGCTCCAAGACAAATCAAGTTCTGGTTT 

CAGAACAGAAGAACTCAGCTTAAGGCTCAACATGAGAGAGCAGATAATAGTGCACTAAAG 

GCAGAGAATGATAAAATTCGTTGCGAAAACATTGCTATTAGAGAAGCTCTCAAGCATGCT 

ATATGTCCTAACTGTGGAGGTCCTCCTGTTAGTGAAGATCCTTACXTTGATGAACAAAAG 

CTTCGGATTGAAAATGCACACCTTAGAGAAGAGCTTGAAAGAATGTCTACCATTGCATCA 

AAGTACATGGGAAGACCGATATCGCAACTCTCTACGCTACATCCAATGCACATCTCACCG 

TTGGATTTGTCAATGACTAGTTTAACTGGTTGTGGACCTTTTGGTCATGGTCCTTCACTC 

GATTTTGATCTTCTTCCAGGAAGTTCTATGGCTGTTGGTCCTAATAATAATCTGCAATCT 

CAGCCTAACTTGGCTATATCAGACATGGATAAGCCTATTATGACCGGCATTGCTTTGACT 

GCAATGGAAGAATTGCTCAGGCTTCTTCAGACAAATGAACCTCTATGGACAAGAACAGAT 

GGCTGCAGAGACATTCTCAATCTTGGTAGCTATGAGAATGTTTTCCCAAGATCAAGTAAC 

CGAGGGAAGAACCAGAACTTTCGAGTCGAAGCATCAAGGTCTTCTGGTATTGTCTTCATG 

AATGCTATGGCACTTGTCGACATGTTCATGGATTGTGTCAAGTGGACAGAACTCTTTCCC 

TCTATCATTGCAGCTTCTAAAACACTTGCAGTGATTTCTTCAGGAATGGGAGGTACCCAT 

GAGGGTGCATTGCATTTGTTGTATGAAGAAATGGAAGTGCTTTCGCCTTTAGTAGCAACA 

CGCGAATTCTGCGAGCTACGCTATTGTCAACAGACTGAACAAGGAAGCTGGATAGTTGTA 

AACGTCTCATATGATCTTCCTCAGTTTGTTTCTCACTCTCAGTCCTATAGATTTCCATCT 

GGATGCTTGATTCAGGATATGCCCAATGGATATTCCAAGGTTACTTGGGTTGAACATATT 

GAAACTGAAGAAAAAGAACTGGTTCATGAGCTATACAGAGAGATTATTCACAGAGGGATT 

GCTTTTGGGGCTGATCGTTGGGTTACCACTCTCCAGAGAATGTGTGAAAGATTTGCTTCT 

CTATCGGTACCAGCGTCTTCATCTCGTGATCTCGGTGGAGTGATTCTATCACCGGAAGGG 

AAGAGAAGCATGATGAGACTTGCTCAGAGGATGATCAGCAACTACTGTTTAAGTGTCAGC 

AGATCCAACAACACACGCTCAACCGTTGTTTCGGAACTGAACGAAGTTGGAATCCGTGTG 

ACTGCACATAAGAGCCCTGAACGAT^CGGCACAGTCCTATGTGCAGCCACCACTTTCTGG 

CTTCCCAATTCTCCTCAAAATGTCTTCAATTTCCTCAAAGACGAAAGAACCCGTCCTCAG 

TGGGATGTTCTTTCAAACGGAAACGCAGTGCAAGAAGTTGCTCACATCTGAAACGGATC^ 

CATCCTGGAAACTGCATATCGGTTCTACGTGGATCCAATGCAACACATAGCAACAACATG 

CTTATTCTGCAAGAAAGCTCAACAGACTCATCAGGAGCM 

GATTTAGCAGCATTOAACATCGCAATGAGCGGTGAAGATCCTTCTTATATTCCTCTCTTG 

TCCTCAGGTTTCACAATCTCACCAGATGGAAATGGCTCAAACTCTGAACAAGGAGGAGCC 

TCGACGAGCTCAGGACGGGGATCAGCTAGCGGTTCGTTGATAACGGTTGGGTTTCAGATA 

ATGGTAAGCAATTTACCGACGGCAAAACTGAATATGGAGTCGGTGGAAACGGTTAATAAC 

CTGATAGGAACAACTGTACATCAAATTAAAACCGCCTTGAGCGGTCCTACAGCTTCT^CT 
ACAGCTTGA 

>G2543 Amino Acid Sequence (domain in AA coordinates: 31-91) 
MSFVVGVGGSGSGSGGDGGGSHHHDGSETDRKKKRYHRHTAQQIQRLESSFKECPHPDEK
QRNQLSRELGLAPRQIKFWFQNRRTQLKAQHERADNSALKAENDKIRCENIAIREALKHA
ICPNCGGPPVSEDPYFDEQKLRIENAHLREELERMSTIASKYMGRPISQLSTLHPMHISP
LDLSMTSLTGCGPFGHGPSLDFDLLPGSSMAVGPNNNLQSQPNLAISDMDKPIMTGIALT
AMEELLRLLQTNEPLWTRTDGCRDILNLGSYENVFPRSSNRGKNQNFRVEASRSSGIVFM
NAMALVDMFMDCVKWTELFPSIIAASKTLAVISSGMGGTHEGALHLLYEEMEVLSPLVAT
REFCELRYCQQTEQGSWIVVNVSYDLPQFVSHSQSYRFPSGCLIQDMPNGYSKVTWVEHI
ETEEKELVHELYREIIHRGIAFGADRWVTTLQRMCERFASLSVPASSSRDLGGVILSPEG
KRSMMRLAQRMISNYCLSVSRSNNTRSTVVSELNEVGIRVTAHKSPEPNGTVLCAATTFW
LPNSPQNVFNFLKDERTRPQWDVLSNGNAVQEVAHISNGSHPGNCISVLRGSNATHSNNM
LILQESSTDSSGAFVVYSPVDLAALNIAMSGEDPSYIPLLSSGFTISPDGNGSNSEQGGA
STSSGRASASGSLITVGFQIMVSNLPTAKLNMESVETVNNLIGTTVHQIKTALSGPTAST 
TA* 

>G264 (30..1430)

CTTGTACCAGTTTCTGATTAGATTCAACAATGAACGGCGCATTAGGTAACTCCTCCGCCT 

CCGTTAGCGGCGGAGAAGGAGCCGGAGGACCAGCGCCTTTCTTGGTGAAAACCTACGAGA 

TGGTCGACGATTCATCAACGGACCAGATCGTATCGTGGAGCGCTAACAACAACAGCTTCA 

TCGTTTGGAATCATGCCGAATTTTCACGCCTCCTTCTTCCAACCTACTTCAAACACAATA 

ACTTCTCTTCCTTCATTCGTCAGCTCAATACCTATGGGTTTAGGAAGATTGATCCAGAGA 

GGTGGGAGTTTTTGAATGATGATTTTATTAAGGATCAGAAGCATCTTCTCAAGAATATAC 

ATAGAAGGAAACCTATACACAGCCACAGTCATCCACCTGCTTCGTCGACTGATCAAGAAA 

GAGCAGTGTTGCAAGAGCAAATGGACAAGCTTTCACGTGAGAAAGCTGCAATTGAAGCTA 

AGCTTTTJ^AAGTTCAAACAACAGAAGGTTGTAGCAAAGCATCAGTTTGAAGAAATGACTG 

AGCATGTTGATGATATGGAGAATAGGCAGAAGAAGCTGCTGAATTTTTTGGAAACTGCGA 

TTCGGAATCCTACTTTTGTTAAGAATTTTGGTAAGAAAGTCGAGCAGTTGGATATTTCAG 

CTTACAACAAAAAGCGAAGGCTCCCTGAAGTTGAGCAATCAAAGCCACCTTCAGAAGATT

CTCATCTGGATAATAGTAGTGGTAGCTCGAGACGCGAGTCTGGAAACATTTTTCATCAAA 

ATTTCTCTAATAAATTGCGACTAGAGCTTTCTCCAGCTGATTCAGATATGAACATGGTTT 

CACACAGTATACAAAGTTCCAATGAAGAAGGTGCGAGTCCCAAAGGGATACTGTCAGGAG 

GTGATCCAAATACTACACTAACAAAAAGAGAAGGCCTACCATTTGCACCTGAAGCTCTAG 

AGCTTGCGGATACCGGGACATGCCCGAGGAGATTACTGTTAAATGATAATACAAGGGTGG 

AGACCTTGCAGCAGAGGCTAACTTCTTCAGAGGAGACTGATGGTAGCTTTTCATGTCATT 

TAAATCTAACCCTGGCTTCTGCTCCGTTACCGGACAAT^ACAGCTTCACAGATAGCTAAGA 

CGACTCTTAAAAGTCAGGAGTTAAACTTTAACTCAATAGAAACAAGTGCAAGTGAGAAAA 

ATCGGGGTAGACAAGAGATTGCAGTTGGAGGTAGCCAAGCAAATGCAGCTCCTCCAGCAA 

GAGTGAATGATGTATTCTGGGAACAGTTCCTAACAGAAAGGCCAGGGTCTTCAGATAATG 

AGGAGGCAAGTTCGACTTATAGAGGTAACCCATACGAAGAGCAAGAGGAGAAAAGAAACG 

GGAGTATGATGTTACGTAATACAAAGAATATCGAGCAGCTGACCTTATAAACTATTTGGA 

CGGTTACATCAACGAGAGTACGAACTGAGGTTTTGGTAAGAAGTATGGGTGAGTAAGTAA 

TGAAACATTGGACTGAAAAAGCGTAAGTAGCTTTGTTGTAAACACTTGCGTCTCTGTCTA 

CACAAGTAATTTGACTGTAAATGTAAGTGTACAGGATTTAAATTGAATAAGCA 

>G264 Amino Acid Sequence (domain in AA coordinates: 24-114) 

MNGALGNSSASVSGGEGAGGPAPFLVKTYEMVDDSSTDQIVSWSANNNSFIVWNHAEFSR 

LLLPTYFKHNNFSSFIRQLNTYGFRKIDPERWEFLNDDFIKDQKHLLKNIHRRKPIHSHS

HPPASSTDQERAVLQEQMDKLSREKAAIEAKLLKFKQQKVVAKHQFEEMTEHVDDMENRQ
KKLLNFLETAIRNPTFVKNFGKKVEQLDISAYNKKRRLPEVEQSKPPSEDSHLDNSSGSS
RRESGNIFHQNFSNKLRLELSPADSDMNMVSHSIQSSNEEGASPKGILSGGDPNTTLTKR
EGLPFAPEALELADTGTCPRRLLLNDNTRVETLQQRLTSSEETDGSFSCHLNLTLASAPL

PDKTASQIAKTTLKSQELNFNSIETSASEKNRGRQEIAVGGSQANAAPPARVNDVFWEQF
LTERPGSSDNEEASSTYRGNPYEEQEEKRNGSMMLRNTKNIEQLTL*

>G32 (101..736)

AACACACATTCCCTCTCTTCCTTCAACTAGAAAAi\AGATAGATATATCGGACATTTATTG 
ATCTGTGTATGCATAAAGGTATAGTATCATTATTAGAAAGATGAACACAACATCATCAAA 
GAGCAAGAAGAAGCAAGACGATCAGGTTGGTACAAGGTTTCTTGGGGTGAGAAGAAGGCC 
TTGGGGAAGATACGCAGCTGAGATTAGAGACCCAACTACGAAGGAGCGTCACTGGCTTGG 
CACTTTCGATACGGCGGAAGAAGCTGCCTTGGCCTACGATAGAGCTGCTCGGTCCATGCG 
TGGCACACGTGCCAGAACCAACTTTGTTTACTCAGACATGCCTCCTTCCTCATCCGTCAC 
CTCCATTGTTTCTCCTGACGATCCTCCTCCTCCTCCACCTCCTCCTGCTCCTCCTAGCAA 
TGATCCTGTCGATTACATGATGATGTTTAACCAATACTCATCCACTGACTCGCCAATGCT 
TCAGCCTCATTGTGATCAAGTGGACAGTTACATGTTTGGTGGCTCTCAATCTTCGAATTC 
TTATTGCTATTCTAATGACAGTAGTAATGAGCTGCCTCCTCTCCCGAGCGACTTGTCGAA
TTCGTGTTATAGCCAACCACAGTGGACCTGGACCGGTGACGACTACTCGTCTGAGTACGT 
ACATAGTCCAATGTTCAGCAGAATGCCTCCGGTTTCTGACTCTTTCCCTCAAGGTTTCAA 
CTACTTTGGCTCCTAATTCTTTCTCATCGTCCATATTTAATACCTTCCTCATTTGTACCT 
TTTCCTTCTTCTTCTTTTTTGGGTTTATCTATGTTTCGCCGTCCTTGATCTCTGCCTATG 
TGATCAAAGTGACTGTTTGTCATTAGTTTTTCAATAACAAGTTATCATTTGTATCTTGAA 
AAAAAAAAAAA 

>G32 Amino Acid Sequence (domain in aa coordinates: 17-84)

MNTTSSKSKKKQDDQVGTRFLGVRRRPWGRYAAEIRDPTTKERHWLGTFDTAEEAALAYD 

RAARSMRGTRARTNFVYSDMPPSSSVTSIVSPDDPPPPPPPPAPPSNDPVDYMMMFNQYS 

STDSPMLQPHCDQVDSYMFGGSQSSNSYCYSNDSSNELPPLPSDLSNSCYSQPQWTWTGD 

DYSSEYVHSPMFSRMPPVSDSFPQGFNYFGS * 

>G436 (1..2157) 

ATGGATTTTACTCGCGATGACAACTCAAGTGATGAACGGGAAAATGATGTAGACGCCAAC 
ACCAACAACCGTCACGAGAAGAAGGGTTACCATCGCCACACTAATGAACAAATTCATAGG 
CTTGAAACGTATTTCAAGGAATGTCCTCATCCAGACGAATTTCAGCGACGTCTGTTGGGT 
GAAGAACTGAATCTGAAACCAAAACAAATCAAATTTTGGTTTCAAAACAAAAGAACTCAA 
GCTAAGAGTCACAATGAAAAAGCAGACAATGCAGCGCTTAGGGCAGAAAATATTAAGATT 
AGACGTGAGAACGAATCAATGGAAGATGCACTGAATAATGTGGTTTGCCCTCCATGTGGT 
GGTCGTGGTCCTGGGAGAGAAGACCAACTTCGACATCTCCAAAAACTCCGTGCACAAAAC 
GCTTATCTCAAAGATGAGTATGAAAGAGTCTCAAACTACCTAAAACAGTACGGAGGTCAC 
TCAATGCATAACGTCGAGGCCACACCCTATCTCCATGGTCCATCAAACCATGCATCAACG
TCCAAGAACCGTCCAGCATTGTACGGAACCTCTTCTAACCGTCTCCCCGAGCCTTCAAGC 
ATATTTAGAGGACCATACACTCGTGGAAACATGAACACCACCGCACCGCCTCAGCCGCGA 
AAGCCGCTGGAAATGCAGAATTTCCAACCACTATCTCAACTGGAGAAAATTGCAATGTTG 
GAAGCAGCGGAAAAAGCGGTGTCAGAGGTTTTGAGCCTCATTCAAATGGATGATACAATG 
TGGAAAAAGTCGTCTATTGATGATAGGCTCGTCATTGATCCAGGGCTCTATGAGAAATAT 
TTTACTAAGACTAACACAAATGGTCGTCCTGAGTCTTCTAAAGATGTCGTGGTGGTTCAA 
ATGGATGCTGGAAACTTGATCGACATCTTCTTAACTGCGGAGAAATGGGCGAGGCTTTTT 
CCAACAATTGTGAACGAAGCTAAAACGATTCACGTCTTGGATTCCGTTGACCATCGAGGA 
AAAACTTTCTCAAGAGTGATTTATGAGCAACTGCACATACTGTCACCATTGGTGCCACCG 
AGGGAATTTATGATCCTAAGGACTTGCCAACAAATTGAAGACAATGTCTGGATGATTGCT 
GATGTGTCGTGTCATCTCCCAAACATTGAGTTTGATCTTTCGTTTCCCATTTGCACCAAA 
CGTCCCTCAGGTGTGCTCATTCAAGCCTTGCCCCACGGCTTCTCTAAGGTGACGTGGATA 
GAGCATGTGGTAGTGAATGATAATAGAGTGCGGCCACATAAGCTTTACAGAGACCTCTTA 
TACGGCGGCTTTGGCTACGGAGCTCGACGTTGGACCGTTACTCTTGAGAGGACGTGTGAG 
AGGCTGATTTTCTCCACCTCCGTCCCTGCCTTGCCCAACAATGACAATCCCGGAGTTGTG 
CAAACAATACGAGGCAGAAATAGCGTAATGCATTTGGGAGAAAGAATGTTGAGGAACTTT 
GCATGGATGATGAAAATGGTTAACAAACTCGACTTCTCGCCACAGTCTGAAACTAACAAC 
AGCGGAATTAGGATTGGGGTGCGGATAAACAATGAGGCGGGTCAACCGCCCGGTCTCATT 
GTCTGTGCTGGTTCATCTTTATCCCTCC 

AAGAATCTGGAGGTTCGTCACCAGTGGGACGTTCTGTGCCATGGGAATCCAGCGACTGAG 

GCTGCTCGTTTCGTCACCGGATCAAACCCAAGGAACACTGTGTCTTTTCTCGAGCCTTCA 

ATTAGGGATATTAATACTAAGCTAATGATACTCCAAGATAGCTTCAAAGATGCATTGGGA 

GGAATGGTGGCCTACGCTCCAATGGATCTAAACACCGCCTGCGCTGCCATTTCAGGCGAT 

ATCGATCCTACCACCATTCCAATCCTCCCTTCCGGTTTTATGATCTCCCGTGACGGCCGT 

CCTTCCGAGGGCGAAGCCGAGGGTGGCAGCTATACACTCCTCACCGTGGCTTTCCAGATC 

CTTGTCTCCGGTCCBAGTTACTCTCCTCATACCAACCTGGAAGTTTCTGC^COVCAGTC 

AATACCTTGATTAGCTCCACCGTTCAAAGGATCAAAGCCATGCTCAAGTGCGAATGA 

>G436 Amino Acid Sequence (domain in AA coordinates: 22-85) 

MDFTRDDNSSDERENDVDANTNNRHEKKGYHRHTNEQIHRLETYFKECPHPDEFQRRLLG 

EELNLKPKQIKFWFQNKRTQAKSHNEKADNAALRAENIKIRRENESMEDALNNVVCPPCG

GRGPGREDQLRHLQKLRAQNAYLKDEYERVSNYLKQYGGHSMHNVEATPYLHGPSNHAST
SKNRPALYGTSSNRLPEPSSIFRGPYTRGNMNTTAPPQPRKPLEMQNFQPLSQLEKIAML
EAAEKAVSEVLSLIQMDDTMWKKSSIDDRLVIDPGLYEKYFTKTNTNGRPESSKDVVVVQ
MDAGNLIDIFLTAEKWARLFPTIVNEAKTIHVLDSVDHRGKTFSRVIYEQLHILSPLVPP
REFMILRTCQQIEDNVWMIADVSCHLPNIEFDLSFPICTKRPSGVLIQALPHGFSKVTWI
EHVVVNDNRVRPHKLYRDLLYGGFGYGARRWTVTLERTCERLIFSTSVPALPNNDNPGVV
QTIRGRNSVMHLGERMLRNFAWMMKMVNKLDFSPQSETNNSGIRIGVRINNEAGQPPGLI
VCAGSSLSLPLPPVQVYDFLKNLEVRHQWDVLCHGNPATEAARFVTGSNPRNTVSFLEPS
IRDINTKLMILQDSFKDALGGMVAYAPMDLNTACAAISGDIDPTTIPILPSGFMISRDGR
PSEGEAEGGSYTLLTVAFQILVSGPSYSPDTNLEVSATTVNTLISSTVQRIKAMLKCE* 
>G556 (50..1144)

CTTTTTTGAAGCCCTTTTGACACAAAAGACCAGAACAAGTTGAAGAAATATGAATACAAC 
CTCGACACATTTTGTTCCACCGAGAAGGTTTGAAGTTTACGAGCCTCTCAACCAAATCGG 
TATGTGGGAAGAAAGTTTCAAGAACAATGGAGACATGTATACGCCTGGCTCTATCATAAT 
CCCGACTAACG7VAAAACCAGACAGCTTGTCAGAGGATACTTCTCATGGGACAGAAGGAAC 
TCCTCACAAGTTTGACCAAGAGGCTTCCACATCTAGACATCCTGATAAGATACAGAGAAG 
GCTAGCACAGAATCGAGAGGCAGCTAGGAAAAGTCGTTTGCGCAAGAAAGCTTATGTTCA 
GCAGCTAGAGACTAGCCGGTTAAAGCTAATTCATTTAGAGCAAGAACTCGATCGTGCTAG 
ACAACAGGGTTTCTATGTGGGGAACGGAGTAGATACCAATGCTCTTAGTTTCTCAGATAA 
CATGAGCTCAGGGATTGTTGCATTTGAGATGGAATATGGACATTGGGTGGAAGAACAGAA
CAGGCAAATATGTGAACTAAGAACGGTTTTACATGGACAAGTTAGTGATATAGAGCTTCG 
TTCTCTAGTCGAGAATGCCATGAAACATTACTTTCAACTCTTCCG7VATGAAGTCAGCCGC 
TGCAAAAATCGATGTTTTCTATGTCATGTCCGGAATGTGGAAAACTTCAGCAGAGCGGTT 
TTTCTTGTGGATAGGCGGATTTAGACCCTCAGAGCTTCTCAAGGTTCTGTTACCGCATTT 
TGATCCTTTGACGGATCAACAACTTTTGGATGTATGTAATCTGAGGCAATCATGTC7VACA 
ATCAGAAGATGCGTTATCCCAAGGTATGGAGAAACTGCAACATACATTAGCAGAGAGTGT 
AGCAGCCGGGAAACTTGGTGAAGGAAGTTATATTCCTCAAATGACTTGTGCTATGGAGAG 
ATTGGAGGCTTTGGTCAGCTTTGTAAATCAAGCTGATCATCTGAGACATGAGACATTGCA 
ACAGATGCATCGGATCTTAACCACGCGACAAGCGGCTAGAGGTTTGTTAGCATTAGGGGA 
GTATTTCCAAAGGCTTCGAGCTTTGAGTTCGAGTTGGGCGGCTAGGCAACGTGAACCAAC 
GTAATTAAGGTGTTTAGATGTCAAGAAAGGTTTGAGACCTTAACAATCAAGAATGGAGTT 
TGCTGGTGAGTGGATTTTTGGGTCAAGAACAAGAGCAATAACACAAGCTGCTGTGTGATG 
ATGAATCTTGTCTTGCGGCTAAAGGAAATGTTTGAGGAAAGTTGTACATATGATCAGCAA 
CGTAAAGTTTATAGCTTTTTAGAAACCAACTTTTCGATGGTTGTTCTTTTTTTTTTGTAT 
GTAATATTATAGATAAGCTTGTGGTATATATGATTTTAATGTGACATTACGAACTTGATT 
TATAACCATGGTAAAAT 

>G556 Amino Acid Sequence (domain in AA coordinates: 83-143) 

MNTTSTHFVPPRRFEVYEPLNQIGMWEESFKNNGDMYTPGSIIIPTNEKPDSLSEDTSHG

TEGTPHKFDQEASTSRHPDKIQRRLAQNREAARKSRLRKKAYVQQLETSRLKLIHLEQEL

DRARQQGFYVGNGVDTNALSFSDNMSSGIVAFEMEYGHWVEEQNRQICELRTVLHGQVSD

IELRSLVENAMKHYFQLFRMKSAAAKIDVFYVMSGMWKTSAERFFLWIGGFRPSELLKVL

LPHFDPLTDQQLLDVCNLRQSCQQSEDALSQGMEKLQHTLAESVAAGKLGEGSYIPQMTC

AMERLEALVSFVNQADHLRHETLQQMHRILTTRQAARGLLALGEYFQRLRALSSSWAARQ

REPT* 

>G1420 (39..1238)

AAAGTATCATCTCATAGATTCCATCTTTTCTCTATTACATGGAGAAGAAAAAAGAAGAGG 
ATCATCATCATCAACAACAACAACAACAACAAAAGGAGATCAAGAACACAGAGACAAAGA 
TCGAGCAAGAACAAGAACAAGAACAAAAACAAGAAATCTCTCAAGCATCATCATCATCAA 
ACATGGCGAATCTAGTTACGTCATCAGATCATCATCCGTTGGAGCTAGCTGGAAATCTCT 
CAAGCATCTTCGATACTTCATCTTTACCTTTTCCTTATTCTTATTTCGAAGATCACTCTT 
CTAATAATCCTAATTCTTTCCTAGACTTGCTCCGACAAGATCATCAGTTTGCTTCTTCCT 
CTAATTCCTCTTCTTTTTCATTCGATGCCTTTCCTCTCCCCAATAACAACAACAACACCT 
CTTTTTTTACGGATTTGCCCTTACCTCAAGCTGAGTCATCAGAAGTCGTGAACACAACAC 
CGACTTCTCCAAACTCAACCTCAGTCTCATCTTCCTCCAACGAAGCTGCAAATGATAACA 
ACAGTGGTAAAGAAGTTACTGTTAAAGATCAAGAAGAAGGAGATCAACAACAAGAGCAAA 
AGGGTACTAAGCCACAGTTGAAGGCAAAGAAGAAGAATCAAAAGAAAGCTAGAGAAGCTA 
GGTTTGCGTTTCTGACGAAGAGCGATATTGATAATCTTGACGACGGTTATAGGTGGAGAA 
AATACGGCCAAAAAGCTGTCAAAAACAGTCCTTATCCCAGAAGCTATTACCGTTGCACCA 
CAGTGGGTTGCGGAGTGAAGAAGAGAGTGGAGAGATCCTCCGATGATCCTTCGATCGTCA 
TGACAACCTACGAAGGTCAGCATACCCATCCTTTCCCCATGACGCCACGTGGACACATCG 
GAATGCTCACGTCACCAATCCTAGACCACGGTGCAACCACCGCGTCATCATCATCATTCT 
CCATCCCTCAGCCACGTTACTTGCTGACTCAACATCACCAGCCCTACAACATGTACAACA 
ACAACTCTCTAAGTATGATCAATAGAAGATCATCCGATGGCACTTTCGTAAATCCAGGTC

CATCATCATCATTCCCCGGCTTTGGTTATGATATGTCTCAAGCTTCTACTTCAACTTCTT 

CTTCCATTAGAGATCATGGATTGCTTCAAGATATTCTTCCTTCGCAGATCAGATCCGATA 

CTATTAACACTCAAACCAATGAAGAGAATAAGAAATGAAGAAGTTTTTTTTCCCGGGGCA 

ATTGTTTTTTTCTTTAGGCCGGATCCGGTAGGTAGGTTTCATGAGC 

>G1420 Amino Acid Sequence (domain in AA coordinates: 221-280) 

MEKKKEEDHHHQQQQQQQKEIKNTETKIEQEQEQEQKQEISQASSSSNMANLVTSSDHHP

LELAGNLSSIFDTSSLPFPYSYFEDHSSNNPNSFLDLLRQDHQFASSSNSSSFSFDAFPL

PNNNNNTSFFTDLPLPQAESSEVVNTTPTSPNSTSVSSSSNEAANDNNSGKEVTVKDQEE

GDQQQEQKGTKPQLKAKKKNQKKAREARFAFLTKSDIDNLDDGYRWRKYGQKAVKNSPYP 

RSYYRCTTVGCGVKKRVERSSDDPSIVMTTYEGQHTHPFPMTPRGHIGMLTSPILDHGAT 

TASSSSFSIPQPRYLLTQHHQPYNMYNNNSLSMINRRSSDGTFVNPGPSSSFPGFGYDMS 

QASTSTSSSIRDHGLLQDILPSQIRSDTINTQTNEENKK* 

>G1412 (115..1008)

CCCACGCGTCCGCCCACGCGTCCGAAACAAAAACATATAATTTGGGTTTTTAGAGTTCGA 

GTTAGAGAGAAAGATCCGTTAGCCCAGTTGAGTTTGCCACCAGGTTTTAGATTTTATCCG

ACAGATGAAGAGCTTCTTGTTCAGTATCTATGTCGGAAAGTTGCAGGCTATCATTTCTCT 

CTCCAGGTCATCGGAGACATCGATCTCTACAAGTTCGATCCTTGGGATTTGCCAAGTAAG 

GCTTTGTTTGGAGAGAAGGAATGGTATTTCTTTAGCCC7UVGAGATCGGAAATATCCGAAC 

GGGTCAAGACCCAATAGAGTAGCCGGGTCGGGTTATTGGAAAGCAACGGGTACTGACAAA 

ATTATCACGGCGGATGGTCGTCGTGTCGGGATTAAAAAAGCTCTGGTCTTTTACGCCGGA 

AAAGCTCCCAAAGGCACTAAAACCAACTGGATTATGCACGAGTATCGCTTAATAGAACAT 

TCTCGTAGCCATGGAAGCTCCAAGTTGGATGATTGGGTGTTGTGTCGAATTTACAAGAAA 

ACATCTGGATCTCAGAGACAAGCTGTTACTCCTGTTC7VAGCTTGTCGTGAAGAGCATAGC 

ACGAATGGGTCGTCATCGTCTTCTTCATCACAGCTTGACGACGTTCTTGATTCGTTCCCG 

GAGATAAAAGACCAGTCTTTTAATCTTCCTCGGATGAATTCGCTCAGGACGATTCTTAAC 

GGGAACTTTGATTGGGCTAGCTTGGCAGGTCTTAATCCAATTCCAGAGCTAGCTCCGACC 

AATGGATTACCGAGTTACGGTGGTTACGATGCGTTTCGAGCGGCGGAAGGTGAGGCGGAG 

AGTGGGCATGTGAATCGGCAGCAGAACTCGAGCGGGTTGACTCAGAGTTTCGGGTACAGC 

TCGAGTGGGTTTGGTGTTTCGGGTCAAACATTCGAGTTTAGGCAATGAGAGAGATGTGAA 

GTTACTGATGGGTGAAAAAAGTAAAAAAAAAACTTGGAGATAGTAGAGTGGCAATTGATG 

TAAATAATAGGGATTTATATGGGGCTTTTACCGATTCGGTGAGGCTTAGGATTCCCCAAA 

GGAAAAAGGCTCGACTGGGGACTAGTTTGATCCAACTTGACGGCCCCCAAATGTGTAATG 

TTTCTCAACGGAGAGAAAAATAAATGGTTACCAATATTTTTCCAAAAAAAAAAAAAAAAA 

>G1412 Amino Acid Sequence (domain in AA coordinates: 17-159) 

MGVREKDPLAQLSLPPGFRFYPTDEELLVQYLCRKVAGYHFSLQVIGDIDLYKFDPWDLP

SKALFGEKEWYFFSPRDRKYPNGSRPNRVAGSGYWKATGTDKIITADGRRVGIKKALVFY

AGKAPKGTKTNWIMHEYRLIEHSRSHGSSKLDDWVLCRIYKKTSGSQRQAVTPVQACREE

HSTNGSSSSSSSQLDDVLDSFPEIKDQSFNLPRMNSLRTILNGNFDWASLAGLNPIPELA

PTNGLPSYGGYDAFRAAEGEAESGHVNRQQNSSGLTQSFGYSSSGFGVSGQTFEFRQ* 
>G738 (1..885) 

ATGGACCATCATCAGTATCATCATCATGATCAATACCAACATCAGATGATGACTAGTACT
AACAATAATTCCTATAACACCATCGTCACAA<^^ 

GATTCAACAACAGCAACAACTATGATAATGGATGACGAGAAGAAGTTGATGACGACAATG 
AGCACTAGGCCGCAAGAACCAAGAAACTGTCCAAGATGCAACTCAAGCAACACCAAGTTT 
TGTTATTACAACAACTAC^GCTTAGCACAGCCTAGGTACTTGTGTAAGTCTTGTCGGAGA 
TATTGGACTGAAGGTGGCTCTCTCCGTAACGTCCCCGTAGGCGGAGGTTCTAGAAAGAAC 
AAGAAGCTTCCATTTCCTAATTCCTCTACTTCTTCTTCCACCAAGAACCTCCCGGATCTC 
AACCCTCCTTTCGTCCTCACATCATCAGCTT^^ 

AACAATAATGACCTCAGCCTATCCTTCTCCTCCCCTATGCAAGACAAGCGAGCTCAAGGG 
(^TTACGGTCATTTCAGTGAGCAAGTTGTGACT^GGAGGGCAGAACTGTCTTTTCCAAGOT 
CCTATGGGAATGATTCAGTTTCGTCAAGAGTATGAT 

GGGTTTTCATTAGACAGGAACGAGGAAGAGATTGGTAATCATGATAACTTCGTTGTTAAT 
GAGGAAGGAAGTAAGATGATGTATCCTTATGGAGATCATGAAGACCGTCAACAACATCAC 
CATGTGAGACACGATGATGGTAATAAGAAGAGAGAAGGTGGTTCAAGCAATGAGCTATGG 
AGCGGAATCATCCTAGGTGGTGATAGTGGTGGACCAACATGGTGA 
>G738 Amino Acid Sequence (domain in aa coordinates: 351-393)

MDHHQYHHHDQYQHQMMTSTNNNSYNTIVTTQPPPTTTTMDSTTATTMIMDDEKKLMTTM

STRPQEPRNCPRCNSSNTKFCYYNNYSLAQPRYLCKSCRRYWTEGGSLRNVPVGGGSRKN

KKLPFPNSSTSSSTKNLPDLNPPFVFTSSASSSNPSKTHQNNNDLSLSFSSPMQDKRAQG

HYGHFSEQVVTGGQNCLFQAPMGMIQFRQEYDHEHPKKNLGFSLDRNEEEIGNHDNFVVN

EEGSKMMYPYGDHEDRQQHHHVRHDDGNKKREGGSSNELWSGIILGGDSGGPTW*

>G2426 (1..1038) 

ATGGGCAGATCGCCATGTTGTGATAAGGCCGGGTTGAAGAAAGGGCCTTGGACTCCAGAA 
GAGGATCAGAAACTTTTGGCTTATATTGAAGAACATGGCCATGGAAGCTGGCGTTCTTTG 
CCTGAGAAAGCCGGTCTCCAAAGGTGTGGAAAGAGTTGCAGACTCAGATGGACTAACTAC 
CTAAGACCTGACATCAAGAGAGGCAAATTCACTGTACAAGAAGAACAAACCATCATTCAA 
CTCCACGCTCTCCTCGGAAACAGGTGGTCAGCGATTGCAACTCATTTACCAAAGAGGACA 
GACi\ACGAGATCAAGMCTACTGGAACACACACTTGAAGAAACGTCTGATCAAAATGGGG 
ATAGATCCAGTGACTCACAAGCACAAAAACGAGACTCTTTCGTCTTCCACAGGACAATCA 
AAGAACGCAGCCACGCTTAGTCATATGGCTCAATGGGAGAGTGCAAGACTCGACGCTGAA 
GCAAGGCTAGCTAGAG7VATCAAAGCTTCTCCATTTACAGCATTACCAAAACAATAACAAC 
CTTAACAAATCAGCAGCTCCTCAACAACATTGCTTCACTCAAAAAACATCAACAAACTGG 
ACTAAACCAAACCAAGGAAACGGAGACCAACAGCTTGAATCTCCGACATCGACGGTGACA 
TTCTCTGAGAATCTTCTGATGCCTTTAGGAATCCCTACGGATAGCAGCAGAAATAGAAAC 
AATAACAACAATGAGTCCTCGGCGATGATTGAATTGGCCGTATCTTCGTCAACCTCCTCC 
GATGTGAGTCTGGTCAAAGAACATGAACACGACTGGATTAGGCAGATCAACTGTGGTAGT 
GGAGGAATAGGAGAAGGATTCACGAGTCTATTGATCGGTGATTCGGTCGGCCGGGGTTTA 
CCCACCGGGAAAAACGAAGCGACGGCGGGCGTGGGGAATGAGAGTGAGTATAACTACTAT 
GAGGATAACAAGAATTACTGGAATAGCATTCTCAACTTGGTTGATTCTTCACCGTCCGAT 
TCCGCGACGATGTTCTGA 

>G2426 Amino Acid Sequence (conserved domain in AA coordinates: 14-114)
MGRSPCCDKAGLKKGPWTPEEDQKLLAYIEEHGHGSWRSLPEKAGLQRCGKSCRLRWTNY
LRPDIKRGKFTVQEEQTIIQLHALLGNRWSAIATHLPKRTDNEIKNYWNTHLKKRLIKMG
IDPVTHKHKNETLSSSTGQSKNAATLSHMAQWESARLDAEARLARESKLLHLQHYQNNNN

LNKSAAPQQHCFTQKTSTNWTKPNQGNGDQQLESPTSTVTFSENLLMPLGIPTDSSRNRN
NNNNESSAMIELAVSSSTSSDVSLVKEHEHDWIRQINCGSGGIGEGFTSLLIGDSVGRGL
PTGKNEATAGVGNESEYNYYEDNKNYWNSILNLVDSSPSDSATMF*
>G1524 (1..825)

ATGGGGAGAACTAAGGAGCAGGCAACATTAACTCGGTATCCACCCTGTCCTAGGAATCCT 
GCTAAATTCAATGATATAAACAAAGCACTCCAGGAAAAAGGATATGGTAAGGCTCTGAAA 
AGAAAACCTTGGACGGGTGTGACATGCCCTGTCTGTCTTGAGGTTCCTCACAACTCGGTC 
GTCCTCCTTTGTTCATCTTACCACAAAGGATGCCGTCCGTACATGTGTGCCACGGGAAAC 
CGTTTCTCAAATTGTCTAGAGCAGTACAAAAAGGCATATGCCAAGGATGAGAAAAGTGAC 
AAACCGCCAGAGCTATTGTGCCCGCTTTGTAGGGGTCAGGTGAAAGGCTGGACCGTTGTG 
GAAAAGGAACGTAAGTATCTGAATTCTAAGAAAAGGTCATGCATGAACGACGAGTGTTTG 
TTTTATGGAAGCTATAGACAGCTC^GAAGCATGTTAAGGAGAACCATCCGAGAGCCAAG 
CCAAGAGCGATAGACCCTGTGCTGGAGGCGiyU^TG 

AGGAGTGATGTAATCAGCACAGTCATGTCGTCAACACCTGGGGCTATGGTATTTGGAGAC 
TATGTGATTGAGCCATACAATGGTTATGATCATCAAGATGAC^GTGACGATTACAGTGAT 
TCGTCGGATGACGAAATGGAAGGTGGGGTATTCGAGCTTGGAGCATTCGACCTGGGCCGT 
CTTCAACCGCGTTCGGCTGCCATCTCAAGCCGGGGAATTCGCGGTATGATCATAAGGAAC 
CGGTGGGCTCGAAGCAGAGGTGCGAGCAGAAGGCGACAAACATAA 

>G1524 Amino Acid Sequence (conserved domain in AA coordinates: 49-110)
MGRTKEQATLTRYPPCPRNPAKFNDINKALQEKGYGKALKRKPWTGVTCPVCLEVPHNSV
VLLCSSYHKGCRPYMCATGNRFSNCLEQYKKAYAKDEKSDKPPELLCPLCRGQVKGWTVV
EKERKYLNSKKRSCMNDECLFYGSYRQ

RSDVISTVMSSTPGAMVFGDYVIEPYNGYDHQDDSDDYSDSSDDEMEGGVFELGAFDLGR
LQPRSAAISSRGIRGMIIRNRWARSRGASRRRQT*
>G1243 (1..3174) 

ATGGCGAGAAATTCGAATTCCGATGAGGCTTTCTCGTCAGAGGAGGAAGAAGAGCGGGTT 
AAGGATAATGAAGAAGAAGATGAGGAGGAGCTCGAGGCTGTTGCTCGTTCTTCTGGCTCC 
GACGATGACGAAGTAGCCGCCGCCGACGAATCACCAGTCTCCGACGGAGAGGCTGCTCCC 
GTAGAAGATGATTACGAGGACGAAGAAGATGAGGAAAAAGCTGAAATCAGCAAACGTGAG

AAAGCCAGACTTAAAGAGATGCAGAAGTTGAAGAAGCAGAAGATTCAAGAGATGCTGGAG 

TCGCAGAATGCTTCCATTGACGCGGATATGAACAATAAGGGAAAAGGGAGACTGAAGTAT 

CTTCTGCAGCAAACTGAGTTATTTGCCCACTTTGCTAAAAGTGATGGATCTTCTTCTCAG 

AAGAAGGCAAAAGGAAGGGGACGTCATGCTTCCAAAATAACTGAAGAGGAGGAAGACGAA 

GAGTATCTAAAGGAAGAAGAGGATGGCTTAACTGGATCTGGAAACACACGGTTACTCACA 

CAGCCCTCTTGTATTCAAGGGAAGATGAGAGATTACCAATTAGCTGGTTTGAACTGGCTC 

ATTCGTCTTTATGAGAATGGCATAAATGGAATTCTTGCTGATGAAATGGGTCTGGGGAAG 

ACGCTTCAAACGATTTCTTTGTTGGCATATCTTCATGAATACAGGGGAATCAATGGTCCC 

CATATGGTGGTTGCTCCAAAATCAACACTTGGTAATTGGATGAACGAAATTCGCCGGTTT 

TGTCCTGTCCTACGTGCTGTGAAGTTCCTTGGTAATCCTGAGGAGAGGAGACATATTCGA 

GAAGACCTGCTAGTTGCTGGGAAATTTGATATTTGTGTCACAAGCTTTGAGATGGCCATC 

AAAGAGAAGACAGCACTTCGTCGGTTTAGCTGGCGTTATATTATCATTGATGAAGCGCAT 

CGAATCAAGAACGAGAATTCACTCCTTTCTAAAACCATGAGACTTTTTAGCACCAATTAT 

CGGCTTCTTATCACGGGGACCCCCCTTCAGAATAATCTCCATGAACTGTGGGCTCTTCTA 

AATTTTCTTCTGCCTGAGATTTTTAGTTCAGCAGAGACTTTTGATGAATGGTTTCAAATT 

TCTGGTGAGAATGACCAGCAAGAAGTTGTGCAACAACTGCACAAGGTTCTTCGACCATTT 

CTTCTTCGAAGACTAAAGTCAGATGTTGAGAAAGGTTTGCCACCGAAGAAGGAGACCATA 

CTTAA^GTTGGTATGTCTCAGATGCAAAAGCAATACTACAAGGCTTTACTGCAGAAGGAT 

CTTGAAGCGGTTAATGCTGGTGGAGAACGCAAACGTCTGCTAAACATTGCAATGCAACTG 

CGTAAATGCTGCAATCACCCCTATCTCTTCCAGGGTGCAGAACCTGGTCCCCCATATACC 

ACAGGAGATCACCTTATAACAAATGCTGGTAAGATGGTTCTCTTGGATAAATTGCTTCCT 

AAGTTGAAAGAACGTGATTCAAGGGTGCTGATATTTTCTCAGATGACAAGACTTTTGGAT 

ATTCTTGAGGACTATTTAATGTATCGTGGTTACTTGTATTGCCGTATTGATGGAAACACT 

GGTGGTGACGAACGAGATGCCTCCATAGAAGCCTACAACAAGCCAGGAAGTGAGAAATTT 

GTTTTCTTGTTATCTACTAGAGCTGGAGGGCTTGGTATCAATCTTGCTACTGCAGATGTT 

GTGATCCTTTACGATAGTGATTGGAACCCACAAGTCGACTTGCAAGCTCAGGATCGTGCC 

CATAGGATTGGTCAAAAAAAAGAAGTTCAAGTGTTTCGATTCTGCACTGAGTCTGCTATT 

GAGGAGAAAGTGATTGAAAGAGCTTACAAGAAGTTAGCACTTGATGCTCTGGTTATTCAA 

CAAGGGAGATTGGCAGAACAGAAAAGTAAGTCTGTC^ATAAGGATGAGTTGCTTCAAATG 

GTAAGATATGGTGCTGAGATGGTGTTCAGTTCTAAAGATAGCACAATCACAGACGAGGAT 

ATTGATAGAATCATTGCCAAAGGAGAAGAGGCAACAGCTGAACTTGATGCTAAGATGAAG 

AAATTCACAGAAGATGCTATACAGTTTAAAATGGATGACAGTGCTGACTTCTATGATTTT 

GATGATGACAATAAGGATGAAAACAAGCTCGATTTTAAAAAGATTGTAAGCGACAATTGG 

AATGATCCCCCCAAGCGGGAGAGAAAGCGCAACTACTCTGAATCTGAGTACTTTAAGCAA 

ACATTGCGGCAAGGTGCTCCAGCTAAACCTAAAGAGCCTAGAATTCCGCGCATGCCCCAG 

TTGCACGATTTCCAGTTCTTTAACATTCAGAGATTGACCGAGTTGTATGAAAAGGAAGTA 

CGTTATCTCATGCAAACACATCAGAAAAATCAGTTGAAAGACACAATTGATGTTGAAGAA 

CCAGAAGGTGGGGATCCCTTAACTACTGAAGAAGTAGAAGAAAAGGAGGGATTATTGGAG 

GAGGGTTTCTCAA(^TGGAGCAGAAGAGATTTTAATACTTTCCTCAGGGCTTGTGAGAAG 

TATGGCCGCAACGACATAAAAAGCATTGCCTCTGAGATGGAAGGGAAAACAGAGGAAGAA 

GTTGAAAGATATGCCAAAGTATTTAAAGAGCGGTACAAGGAGCTGAACGACTATGATAGA 

ATCATTAAGAACATTGAGAGGGGAGAGGCAAGGATCTCTAGGAAAGACGAAATCATGAAG 

GCCATAGGGAAGAAACTGGATCGCTACAGAAACCCTTGGCTGGAACTGAAGATTCAATAT 

GGTCAGAACAAAGGCAAGCTGTACAATGAAGAGTGTGACCGTTTCATGATCTGCATGATT 

CAGAAACTTGGTTATGGGAATTGGGATGAGCTrAAAGGCAGGATTTAGGACATCGTOT 

TTCAGGTTTCACTGGTTTGTGAAATCCCGCACGAGTCAGGAACTTGCAAGAAGATGCGAC 

ACTCTGATTCGACTGATCGAGAAAGAGAACCAGGAGTTTGATGAAAGAGAGAGGCAAGCC 

nn^ AA ^ G ^ AA ^ G< " TCG CGAAGAGTG CAACACCATCAAAGCGACCTTTAGGAAG ACAA 

GCAAGTGAGAGTCCTTCATCGACGAAGAAGCGGAAGCACCTGTCGATGAGATGA 

>G1243 Amino Acid Sequence (domain in AA coordinates: 216-609)

MARNSNSDEAFSSEEEEERVKDNEEEDEEELEAVARSSGSDDDEVAAADESPVSDGEAAP

VEDDYEDEEDEEKAEISKREKARLKEMQKLKKQKIQEMLESQNASIDADMNNKGKGRLKY

LLQQTELFAHFAKSDGSSSQKKAKGRGRHASKITEEEEDEEYLKEEEDGLTGSGNTRLLT

QPSCIQGKMRDYQLAGLNWLIRLYENGINGILADEMGLGKTLQTISLLAYLHEYRGINGP

HMVVAPKSTLGNWMNEIRRFCPVLRAVKFLGNPEERRHIREDLLVAGKFDICVTSFEMAI

KEKTALRRFSWRYIIIDEAHRIKNENSLLSKTMRLFSTNYRLLITGTPLQNNLHELWALL
NFLLPEIFSSAETFDEWFQISGENDQQEVVQQLHKVLRPFLLRRLKSDVEKGLPPKKETI

LKVGMSQMQKQYYKALLQKDLEAVNAGGERKRLLNIAMQLRKCCNHPYLFQGAEPGPPYT

TGDHLITNAGKMVLLDKLLPKLKERDSRVLIFSQMTRLLDILEDYLMYRGYLYCRIDGNT

GGDERDASIEAYNKPGSEKFVFLLSTRAGGLGINLATADVVILYDSDWNPQVDLQAQDRA

HRIGQKKEVQVFRFCTESAIEEKVIERAYKKLALDALVIQQGRLAEQKSKSVNKDELLQM

VRYGAEMVFSSKDSTITDEDIDRIIAKGEEATAELDAKMKKFTEDAIQFKMDDSADFYDF

DDDNKDENKLDFKKIVSDNWNDPPKRERKRNYSESEYFKQTLRQGAPAKPKEPRIPRMPQ

LHDFQFFNIQRLTELYEKEVRYLMQTHQKNQLKDTIDVEEPEGGDPLTTEEVEEKEGLLE

EGFSTWSRRDFNTFLRACEKYGRNDIKSIASEMEGKTEEEVERYAKVFKERYKELNDYDR

IIKNIERGEARISRKDEIMKAIGKKLDRYRNPWLELKIQYGQNKGKLYNEECDRFMICMI

HKLGYGNWDELKAAFRTSSVFRFDWFVKSRTSQELARRCDTLIRLIEKENQEFDERERQA

RKEKKLAKSATPSKRPLGRQASESPSSTKKRKHLSMR*

>G631 (190..1461)

CTTCTTCTTCTTCTTCTTCTTCTTCTTCCTCCTCTCTCGTCGGATCTCTCTGATTTAGTG 
ATTTTTCAAATTTCAAGTTTTCTTCACCTTTAATTTTGTGTCTCGTTGATCTCTCTTTGG 
ACATTCTGCTTTGGATTCTGGAGGCTTCTCATTAGATCTCTATTAGTGGGTTTAGGTCAA 
GTTCTTGAAATGGATAAGGAGAAATCTCCTGCACCACCACCTAGTGGAGGTCTTCCTCCA 
CCATCGGGTCGTTACTCTGCGTTTTCACCTAATGGAAGTAGCTTTGCAATGAAAGCTGAA 
TCATCTTTTCCTCCTTTGACTCCAAGTGGAAGCAATAGCTCAGATGCTAACCGATTCAGC
CATGATATTAGCCGAATGCCGGATAATCCACCTAAGAACCTAGGCCATCGCCGAGCTCAT 
TCAGAGATTCTTACTCTTCCTGATGACTTAAGCTTTGATAGTGATCTTGGTGTGGTTGGT 
GCTGCTGATGGACCTTCTTTCTCTGATGATACTGACGAGGACTTACTCTATATGTATCTT 
GATATGGAAAAATTCAATTCTTCTGCTACATCGACTTCTCAAATGGGTGAGCCATCAGAA 
CCGACTTGGAGGAATGAATTAGCCTCGACTTCTAACCTTCAGAGTACACCCGGTAGCTCT 
AGTGAAAGACCGAGAATTAGACACCAACACAGCCAATCGATGGATGGTTCAACAACTATC 
AAGCCTGAGATGCTTATGTCAGGGAATGAAGATGTGTCTGGAGTTGACTCTAAGAAAGCC 
ATCTCTGCTGCTAAACTTTCTGAGCTTGCTCTCATTGATCCAAAACGCGCCAAGAGGATA 
TGGGCAAACAGGCAGTCTGCTGCGAGGTCAAAAGAAAGGAAGATGAGATACATTGCAGAG 
CTCGAGAGAAAAGTACAGACTTTACAAACAGAGGCCACATCTCTCTCAGCCCAGTTGACT 
CTCTTACAGAGAGATACAAATGGCCTGGGTGTTGAAAACAATGAGCTTAAACTGCGAGTA 
CAAACTATGGAGCAACAGGTCCACCTACAGGATGCTTTAAATGATGCACTAAAGGAGGAA 
GTCCAGCATCTTAAGGTATTGACGGGGCAAGGTCCATCAAATGGTACATCAATGAACTAC 
GGTTCTTTTGGATCAAACCAGCAATTCTATCCCAATAATCAGTCGATGCACACTATCTTA 
GCCGCACAACAGTTACAGCAGCTCCAGATCCAGTCACAGAAACAGCAACAACAACAACAG 
CAACACCAGCAACAACAACAGCAGCAGCAGCAGCAATTTCACTTTCAACAGCAGCAACTG 
TACCAGCTTCAGCAGCAGCAACGGCTTCAACAACAGGAACAACAAAGCGGGGCTTCAGAG 
CTAAGAAGACCCATGCCTTCTCCTGGTCAGAAAGAGAGTGTGACATCGCCTGATCGTGAA 
ACTCCCTTGACAAAAGACTGAGTCTAGACTGTGCTAATGTCCAATTTAGTAAGTTACTCT 
TGGAAAATCOTCTTTTTCATCGCAGGCTCATGGATTTGGGATTTACTGCATTATAGAGTT 
AAAAACAAGAC^GCTTAGAAGTTGCGGATTTAGAAGTTGTTAGTGAAGCTTTTGTTCTCG 
TCTGTTGGTAGTTTACAATCTTCTCTTTGTATGATCCTAAG 

>G631 Amino Acid Sequence (domain in AA coordinates: TBD) 

MDKEKSPAPPPSGGLPPPSGRYSAFSPNGSSFAMKAESSFPPLTPSGSNSSDANRFSHDI 

SRMPDNPPKNLGHRRAHSEILTLPDDLSFDSDLGVVGAADGPSFSDDTDEDLLYMYLDME

KFNSSATSTSQMGEPSEPTWRNELASTSNLQSTPGSSSERPRIRHQHSQSMDGSTTIKPE

MLMSGNEDVSGVDSKKAISAAKLSELALIDPKRAKRIWANRQSAARSKERKMRYIAELER

KVQTLQTEATSLSAQLTLLQRDTNGLGVENNELKLRVQTMEQQVHLQDALNDALKEEVQH 

LKVLTGQGPSNGTSMNYGSFGSNQQFYPNNQSMHTILAAQQLQQLQIQSQKQQQQQQQHQ

QQQQQQQQQFHFQQQQLYQLQQQQRLQQQEQQSGASELRRPMPSPGQKESVTSPDRETPL 

TKD* 

>G1909 (1..828) 

ATGGGTGGATCGATGGCGGAGAGAGCAAGGCAGGCCAACATTCCTCCACTAGCGGGACCC 
CTAAAGTGTCCTCGATGCGACTCCAGGAACACTAAGTTCTGTTACTACAACAACTATAAC 
CTCACTCAGCCTCGTCACTTCTGCAAAGGTTGCCGTCGCTACTGGACACAAGGGGGCGCC 
CTGAGiU^ACGTCCCTGTAGGTGGAGGCTGCCGGAGGAATAACAAGAAGGGCAAAAATGGA 
AATTTAAAATCTTCTTCTTCTTCGTCCAAACAGTCTTCCTCGGTCAACGCTCAAAGTCCT 
AGCTCAGGACAGCTAAGGACAAATCATCAGTTCCCTTTTTCACCAACTCTTTACAATCTC 
ACTCAACTCGGAGGTATTGGTTTGAACTTAGCCGCTACTAATGGCAACAACCT^GCTCAC
CAGATCGGTTCCAGTTTGATGATGAGCGATCTAGGGTTTCTCCATGGACGAAATACTTCA 
ACTCCGATGACGGGAAACATTCATGAAAACAACAACAATAATAACAATGAAAACAACCTA 
ATGGCATCCGTTGGATCTTTGAGCCCCTTTGCTCTCTTCGATCCAACGACGGGGCTATAC 
GCTTTCCAGAACGACGGTAATATCGGGAACAACGTTGGGATATCTGGTTCTTCTACTTCC
ATGGTTGATTCTAGGGTTTATCAGACGCCTCCGGTGAAGATGGAAGAACAACCTAATTTG 
GCTAACTTGTCTAGACCGGTCTCCGGTTTGACGTCTCCTGGGAATCAAACAAATCAGTAC 
TTTTGGCCTGGTTCGGATTTCTCGGGTCCTTCTAATGATCTCTTGTGA 

>G1909 Amino Acid Sequence (conserved domain in AA coordinates : 23-51) 
MGGSMAERARQANIPPLAGPLKCPRCDSSNTKFCYYNNYNLTQPRHFCKGCRRYWTQGGA
LRNVPVGGGCRRNNKKGKNGNLKSSSSSSKQSSSVNAQSPSSGQLRTNHQFPFSPTLYNL 
TQLGGIGLNLAATNGNNQAHQIGSSLMMSDLGFLHGRNTSTPMTGNIHENNNNNNNENNL

MASVGSLSPFALFDPTTGLYAFQNDGNIGNNVGISGSSTSMVDSRVYQTPPVKMEEQPNL 

ANLSRPVSGLTSPGNQTNQYFWPGSDFSGPSNDLL* 

>G1663 (64.. 630) 

TTCTCTCTGTGAATCCTTGTTCATCGTCACTGAAATTAGTTTACAAAATCGACGAATTCG 

GAGATGATTTTTCAGAATGTGTGCAG7LAATGAGTCCAACTTCAACGCTATAGCTTCCGAA 

TCGCGTTCCCAAACGCAGTTCGGTGTTTCGAAATCCTCCTCGAGCGGCGGCGGATGTATC 

TCCGCCAGGACTAAAGACCGTCACACGAAGGTTAACGGACGAAGCCGTCGAGTTACGATG 

CCGGCTCTCGCCGCCGCTAGGATTTTCCAGTTAACGCGTGAGCTCGGTCACAAAACTGAA 

GGAGAAACCATCGAATGGCTTCTTAGTCAAGCTGAACCGTCGATTATTGCCGCCACTGGC 

TACGGGACTAAGCTCATTTCGAATTGGGTTGATGTTGCGGCGGACGATTCCTCGTCGTCG 

TCGTCGATGACGTCGCCGCAAACGCAAACGCAAACGCCACAATCGCCGAGTTGTAGGTTG 

GATCTTTGTCAGCCAATCGGAATTCAGTATCCGGTGAATGGTTACAGTCATATGCCGTTC 

ACAGCGATGCTTTTAGAGCCGATGACCACGACGGCGGAATCTGAGGTTGAGATCGCGGAG 

GAGGAGGAACGTAGACGCCGTCACCATTAGTAAAATTAGGCTTTTGATTTAGAGTGTTAA 

AATTAGGATTTTAAAAGTTTAGGAGGTAACAGATAAGGATAATT 

>G1663 Amino Acid Sequence (domain in AA coordinates: TBD) 

MIFQNVCRNESNFNAIASESRSQTQFGVSKSSSSGGGCISARTKDRHTKVNGRSRRVTMP 

ALAAARIFQLTRELGHKTEGETIEWLLSQAEPSIIAATGYGTKLISNWVDVAADDSSSSS

SMTSPQTQTQTPQSPSCRLDLCQPIGIQYPVNGYSHMPFTAMLLEPMTTTAESEVEIAEE
EERRRRHH* 

>G1231 (103.. 870) 

CAAACCCAAATTCTCTCAGCGCCGGTCAAATACTTGTCTCTCTCTCTCTCTCTCTTTCAC 
TCTTGTCTTGTCTCCTTCGAAGCTGTTTGTTCTGTAAGAAAGATGGAAGCAGGTGGCGCG 
TACAATCCACGCACTGTTGAAGAGGTGTTTAGGGATTTTAAGGGTCGTAGAGCTGGCATG 
ATTAAGGCTTTAACCACTGATGTTCAGGAGTTTTTCCGACTTTGTGATCCCGAAAAGGAG 
AACCTTTGCCTTTACGGACATCCAAATGAGCACTGGGAAGTGAATTTGCCAGCTGAAGAG 
GTTCCTCCTGAGCTCCCAGAGCCTGTCTTGGGTATCAATTTTGCCAGAGACGGGATGGCG 
GAAAAGGATTGGTTGTCCCTTGTTGCTGTCCACAGTGATGCTTGGCTTCTTGCTGTTGCT 
TTCTTTTTTGGAGCCAGGTTTGGATTTGACAAAGCTGATAGGAAGAGG 

GTGAATGACCTCCCAACAATCTTTGAGGTTGTAGCTGGCACTGCTAAGAAACAAGGAAAA 
GATAAGTCCTCTGTTTCCAACAACAGCAGGAAC^^ 

TCTGAATCCCGTGCCAAGTTCTCAAAGCCGGAGCCCAAAGATGATGAGGAGGAGGAAGAG 
GAAGGTGTGGAAGAGGAGGATGAGGATGAGCAAGGTGAAACACAGTGTGGAGCATGTGGT 
GAGAGCTATGCAGCTGATGAGTTCTGGATTTGCTGTGACCTCTGTGAGATGTGGTTTCAT 
GGAAAGTGTGTTAAGATAACACCAGCAAGAGCTGAGCACATCAAGCAATACAAGTGCCCT 
TCTTGCAGCAACAAAAGGGCTCGTTCCTAAATTTGTTG 

CCTTTGCATATGATGATGAACAGCTTAACTGTTTGGTTTAGATC^GATTTGTCATATGGA 
TTTGGTAATTTAGGAAGACATTTTAGTTTTTTC^ 

TAACTCTTTGTTTAGGGGTAATGATCTTTTGCTCTGTTTTATGTTTGTTTAT^ 
TTCAAACTCAATCAAAAGTATTTTGGTTAGTCTTAAAA 

>G1231 Amino Acid Sequence (domain in AA coordinates: TBD) 
MEAGGAYNPRTVEEVFRDFKGRRAGMIKALTTDVQEFFRLCDPEKENLCLYGHPNEHWEV
NLPAEEVPPELPEPVLGINFARDGMAEKDWLSLVAVHSDAWLLAVAFFFGARFGFDKADR
KRLFNIWNDLPTIFEVVAGTAKKQGKDKSSVSNNSSNRSKSSSKRGSESRAKFSKPEPKD
DEEEEEEGVEEEDEDEQGETQCGACGESYAADEFWICCDLCEMWFHGKCVKITPARAEHI
KQYKCPSCSNKRARS*
>G227 (21..983)

GTACCGTCGACGATCCGGCGATGTCAAACCCGACCCGTAAGAATATGGAGAGGATTAAAG 
GTCCATGGAGTCCAGAAGAAGATGATCTGTTGCAGAGGCTTGTTCAGAAACATGGTCCGA 
GGAACTGGTCTTTGATTAGCAAATCAATCCCTGGACGTTCCGGCAAATCTTGTCGTCTCC
GGTGGTGTAACCAGCTATCTCCGGAGGTAGAGCACCGTGCTTTTTCGCAGGAAGAAGACG 
AGACGATTATTCGAGCTCACGCTCGGTTTGGTAACAAGTGGGCTACGATCTCTCGTCTTC 
TCAATGGACGAACCGATAACGCTATCAAGAATCATTGGAACTCGACGCTGAAGCGAAAAT
GCAGCGTCGAAGGGCAAAGTTGTGATTTTGGTGGTAATGGAGGGTATGATGGTAATTTAG 
GAGAAGAGCAACCGTTGAAACGTACGGCGAGTGGTGGTGGTGGTGTCTCGACTGGCTTGT 
ATATGAGTCCCGGAAGTCCATCGGGATCTGACGTCAGCGAGCAATCTAGTGGTGGTGCAC 
ACGTGTTTAAACCAACGGTTAGATCK3AG6TTACAGCGTCATCGTCTGGTGAAGATCCTC 
CAACTTATCTTAGTTTGTCTCTTCCTTGGACTGACGAGACGGTTCGAGTCAACGAGCCGG 
TTCAACTTAACCAGAATACGGTTATGGACGGTGGTTATACGGCGGAGCTGTTTCCGGTTA 
GAAAGGAAGAGCAAGTGGAAGTAGAAGAAGAAGAAGCGAAGGGGATATCTGGTGGATTCG 
GTGGTGAGTTCATGACGGTGGTTCAGGAGATGATAAGGACGGAGGTGAGGAGTTACATGG 
CGGATTTACAGCGAGGAAACGTCGGTGGTAGTAGTTCTGGCGGCGGAGGTGGCGGTTCGT 
GTATGCCACAAAGTGTAAACAGCCGTCGTGTTGGGTTTAGAGAGTTTATAGTGAACCAAA 
TCGGAATTGGGAAGATGGAGTAGGCGGCC 

>G227 Amino Acid Sequence (domain in AA coordinates: 13-112) 

MSNPTRKNMERIKGPWSPEEDDLLQRLVQKHGPRNWSLISKSIPGRSGKSCRLRWCNQLS 

PEVEHRAFSQEEDETIIRAHARFGNKWATISRLLNGRTDNAIKNHWNSTLKRKCSVEGQS

CDFGGNGGYDGNLGEEQPLKRTASGGGGVSTGLYMSPGSPSGSDVSEQSSGGAHVFKPTV 

RSEVTASSSGEDPPTYLSLSLPWTDETVRVNEPVQLNQNTVMDGGYTAELFPVRKEEQVE

VEEEEAKGISGGFGGEFMTVVQEMIRTEVRSYMADLQRGNVGGSSSGGGGGGSCMPQSVN

SRRVGFREFIVNQIGIGKME* 

>G1842 (219.. 809) 

ACTATTACATGCCTCTTCCTCGCTTCAAAACGGCACCGTTTCCACTTGTTATTATTTTTC 
TCTCTATCGTCTAACAAAAAAAAAAACTGACTTGGGATTTTTTTTCATTTGTCTAGCCCA 
AAAGAAGAAGATAGAAACGAAGAAAAAAAGCAAACACATTTTGGGTCCCCGGTGGTTAGG 
ATCAAATTAGGGCACAAACCTTATCGGAGAAAGAAGCCATGGGAAGAAGAAAAGTCGAGA 
TCAAGCGAATCGAGAACAAAAGCAGTCGACAAGTCACTTTCTCCAAACGACGCAAAGGTC 
TCATCGAAAAAGCTCGACAACTTTCAATTCTCTGTGAATCTTCCATCGCTGTTGTCGCCG 
TCTCCGGTTCCGGAAAACTCTACGACTCTGCCTCCGGTGACAACATGTCAAAGATCATTG 
ATCGTTATGAT^ATACATCATGCTGATGAACTTAAAGCCTTAGATCTTGCAGAAAAAATTC 
GGAATTATCTTCCACACAAGGAGTTACTAGAAATAGTCCAAAGCAAGCTTGAAGAATCAA 
ATGTCGATAATGTAAGTGTAGATTCTCTAATATCTATGGAGGAACAGCTCGAGACTGCTC 
TGTCAGTAATTAGAGCTAAGAAGACAGAACTAATGATGGAGGATATGAAGTCACTTCAAG 
AAAGGGAGAAGTTGCTGATAGAAGAGAACCAGATTCTGGCTAGCCAGGTGGGGAAGAAGA 
CGTTTCTGGTTATAGAAGGTGACAGAGGAATGTCACGGGAAAATGGCTCCGGCAACAAAG 
TACCGGAGACTCTTTCGCTGCTCAAGTAATCACCATCATCAACGGCTGAGCTTTCACCAT 
AAACTTACTCACAGCCTGATTCAGAAGCTTTTACAAAATTGTAAATTATAAAAAGCTGCA 
TAATAATCTCAACCTTTTTATCTTCCTCGCGCCAATGTGGAAATAAAGGTAAAACAAAAC 
GAAGCTCTTTTCTTTTATGCG7UVAGAATTGTAAAACTAAGATAAAGCTACCGATCTTTGT 
TGTACCTTAGTAGACAAATATCAGAGTTCTTGTGCTTGT 

>G1842 Amino Acid Sequence (domain in AA coordinates: 2-57) 

MGRRKVEIKRIENKSSRQVTFSKRRKGLIEKARQLSILCESSIAVVAVSGSGKLYDSASG

DNMSKIIDRYEIHHADELKALDLAEKIRNYLPHKELLEIVQSKLEESNVDNVSVDSLISM

EEQLETALSVIRAKKTELMMEDMKSLQEREKLLIEENQILASQVGKKTFLVIEGDRGMSR 

ENGSGNKVPETLSLLK* 

>G1505 (1..681) 

ATGGATGATATAGCGGAACTTGAATGGTTATCAAATTTCGTAGATGATTCTTCTTTCACG 
CCGTATTCTGCTCCGACGAATAAACCGGTTTGGTTAACCGGAAATCGGAGACATCTTGTA 
CAACCGGTTAAAGAGGAGACCTGCTTCAAATCCCAACATCCGGCCGTCAAAACCAGACCC 
AAACGAGCCAGAACCGGAGTCAGAGTCTGGTCTCATGGTTCGCAGTCGTTAACCGACTCA
TCTTCAAGCTCTACAACATCTTCGTCGTCCTCTCCTCGTCCTTCAAGCCCTCTATGGCTC 
GCCAGCGGTCAGTTTCTTGATGAGCCAATGACTAAAACACAAAAGAAGAAGAAAGTTTGG 
AAAAACGCTGGTCAGACGCAAACGCAAACGCAGACGCAGACGCGGCAGTGTGGTCATTGT
GGAGTTCAGAAAACGCCGCAGTGGAGAGCAGGACCATTAGGAGCGAAGACGTTGTGTAAT 
GCGTGTGGTGTGCGTTACAAATCGGGTCGGTTACTACCCGAATATAGACCCGCTTGTAGC 
CCAACATTTTCGAGTGAGCTTCACTCAAACCACCACAGTAAAGTCATTGAGATGCGTAGG 
AAGAAAGAGACTTCTGACGGTGCTGAAGAAACCGGTTTGAACCAGCCGGTTCAGACGGTT 
CAGGTTGTCTCGAGTTTTTGA 

>G1505 Amino Acid Sequence (domain in AA coordinates: TBD) 

MDDIAELEWLSNFVDDSSFTPYSAPTNKPVWLTGNRRHLVQPVKEETCFKSQHPAVKTRP 

KRARTGVRVWSHGSQSLTDSSSSSTTSSSSSPRPSSPLWLASGQFLDEPMTKTQKKKKVW

KNAGQTQTQTQTQTRQCGHCGVQKTPQWRAGPLGAKTLCNACGVRYKSGRLLPEYRPACS

PTFSSELHSNHHSKVIEMRRKKETSDGAEETGLNQPVQTVQVVSSF*

>G657 (1..2331) 

ATGAAGCGTGAGATGAAAGCACCTACTACTCCACTAGAGAGTCTCCAAGGTGACCTCAAA 

GGAAAACAAGGGAGGACATCTGGCCCTGCTAGACGATCTACCAAAGGACAATGGACACCT 

GAAGAGGACGAAGTCTTGTGTAAAGCTGTTGAGCGTTTTCAAGGi\AAGAACTGGAAGAAG 

ATAGCTGAATGTTTTAAGGATCGGACTGATGTTCAGTGTCTTCATAGATGGCAAAAGGTC 

TTGAACCCAGAGCTTGTGAAAGGACCGTGGTCAAAAGAGGAGGATAACACAATAATTGAC 

CTGGTTGAAAAATATGGGGCAAAGAAATGGTCTACTATATCTCAGCATTTACCTGGGCGC 

ATAGGAAAGCAATGTAGGGAAAGGTGGCATAACCATCTTAACCCTGGGATTAATAAAAAT 

GCATGGACTCAGGAAGAGGAACTGACTCTTATTCGTGCGCATCAAATTTATGGGAATAAA 

TGGGCAGAGCTTATGAAATTTTTGCCAGGAAGGTCAGATAATTCGATAAAAAATCATTGG 

AACAGCTCAGTTAAGAAGAAGTTGGATTCCTACTATGCATCAGGTCTTTTAGATCAGTGT 

CAAAGCTCGCCATTAATTGCCCTTCAGAACAAATCTATCGCTTCATCTTCCTCGTGGATG 

CACAGCAATGGAGATGAAGGTAGTTCAAGGCCAGGGGTTGATGCTGAGGAATCAGAATGC 

AGCCAAGCTTCAACTGTTTTCTCACAATCAACCAACGATTTACAAGATGAAGTTCAACGT 

GGAAATGAGGAATATTACATGCCTGAATTTCATTCAGGAACGGAGCAGCAAATCTCAAAC 

GCTGCATCTCATGCAGAACCGTACTACCCTTCCTTTAAAGATGTCAAAATTGTTGTCCCC 

GAAATTTCTTGTGAAACAGAATGTTCGAAGAAGTTTCAGAATCTTAATTGTTCTCACGAG 

CTAAGAACTACCACAGCTACGGAGGATCAATTGCCGGGTGTATCTAATGATGCTAAACAG 

GACCGTGGTCTAGAGTTATTGACCCATAACATGGACAACGGTGGAAAAAACCAAGCACTT 

CAACAAGATTTTCAAAGTTCAGTAAGATTAAGTGATCAACCTTTTTTGTCAAACTCGG 

ACAGATCCAGAAGCTCAAACTTTGATCACGGATGAGGAGTGTTGTAGGGTTCTTTTTCCA 

GATAACATGAAAGATAGCAGTACATCTTCTGGTGAGCAAGGTCGGAATATGGTTGACCCT 

CAAAACGGCAAAGGATCTCTTTGTTCTCAGGCTGCAGAAACCCATGCTCATGAAACTGGA 

AAAGTTCCAGCTTTACCGTGGCATCCTTCAAGTTCTGAGGGCCTGGCGGGTCATAATTGT 

GTCCCTTTGTTGGATTCAGACTTGAAGGACTCACTTTTACCCCGTAATGATTCCAACGCT 

CCTATACAAGGTTGTCGCCTTTTTGGAGCTACCGAATTAGAATGTAAGACTGATACAAAT 

GACGGTTTCATCGATACTTACGGACATGTAACTTCCCATGGCAATGATGATAATGGTGGT 

TTCCCAGAACAACAGGGGCTGTCATATATTCCCAAGGATTCTTTGAAGCTAGTACCTTTG 

AATAGTTTTTCTTCTCCTTCTAGAGTGAACAAGATTTATTTO 

GCTGAAAAAGACAAAGGAGCTCTTTGTTATGAACCTCCACGTTTTCCAA^TGCAGATATT 
CCTTTCTTCAGCTGTGATCTTGTACCATCAAATAGTGACTTACGGCAAGAGTACAGTCCC 
TTTGGTATCCGTCAGTTGATGATTTCTTCAATGAATTGTACAACTCCGTTAAGGTTATGG 
GATTCACCGTGTCACGATAGGAGCCCTGATGTC^TGCTTAATGATACTGCCAAAAGTTTO 
AGTGGTGCACCATCCATCTTAAAGAAGCGGCATCGAGACTTGCTTTCACCTGTGCTTGAT 
AGAAGAAAAGACAAAAAGCTTAAAAGGGCTGCGACTTCCTCCTTGGCTAATGATTTTTCG 
CGCTTAGATGTAATGCTTGATGAAGGAGATGATTGCATGACCTCTCGTCCGTCAGAGTCT 
CCTGAAGATAAAAATATATGTGCCTCCCCTTCCATAGCC^GAGATAAC^GAAATTGTG^ 
TCAGCTCGGTTATATCAAGAAATGATTCCGATAGATC 

TCAGGTGGAGTGACTTCTATGCAAAATGAAAATGGATGTAATGACGGTGGTGCTTCAGCT 

AAAAATGTAAGTCCGTCTTTGTCCTTGCATATTATCTGGTATCAGTTATAA 

>G657 Amino Acid Sequence (domain in AA coordinates: TBD) 

MKREMKAPTTPLESLQGDLKGKQGRTSGPARRSTKGQWTPEEDEVLCKAVERFQGKNWKK 
IAECFKDRTDVQCLHRWQKVLNPELVKGPWSKEEDNTIIDLVEKYGPKKWSTISQHLPGR
IGKQCRERWHNHLNPGINKNAWTQEEELTLIRAHQIYGNKWAELMKPLPGRSDNSIKNHW
NSSVKKKLDSYYASGLLDQCQSSPLIALQNKSIASSSSWMHSNGDEGSSRPGVDAEESEC 
SQASTVFSQSTNDLQDEVQRGNEEYYMPEFHSGTEQQISNAASHAEPYYPSFKDVKIVVP 
EISCETECSKKFQNLNCSHELRTTTATEDQLPGVSNDAKQDRGLELLTHNMDNGGKNQAL

QQDFQSSVRLSDQPFLSNSDTDPEAQTLITDEECCRVLFPDNMKDSSTSSGEQGRNMVDP 

QNGKGSLCSQAAETHAHETGKVPALPWHPSSSEGLAGHNCVPLLDSDLKDSLLPRNDSNA 

PIQGCRLFGATELECKTDTNDGFIDTYGHVTSHGNDDNGGFPEQQGLSYIPKDSLKLVPL 

NSFSSPSRVNKIYFPIDDKPAEKDKGALCYEPPRFPSADIPFFSCDLVPSNSDLRQEYSP 

FGIRQLMISSMNCTTPLRLWDSPCHDRSPDVMLNDTAKSFSGAPSILKKRHRDLLSPVLD 

RRKDKKLKRAATSSLANDFSRLDVMLDEGDDCMTSRPSESPEDKNICASPSIARDNRNCA 

SARLYQEMIPIDEEPKETLESGGVTSMQNENGCNDGGASAKNVSPSLSLHIIWYQL* 

>G1959 (141.. 1028) 

CGTCGACTGTCCATAAATCCGGAGCCTGACCCGACGTTTGACCCGGATCCGAAACTCCCA 
CAATCTCCATACCACCCAAATTCATCTCCCCTAAAGCTTTCTCTCACTTTCCCGGGAAAA 
TCGGCGACCAAMTTGGAAAATGTACTCAGCGATTCGCTCGCTTCCACTCGATGGTGGAC 
ACGTTGGTGGTGACTACCATGGACCTCTTGACGGAACCAATCTTCCCGGTGACGCTTGTT 
TGGTTTTAACGACTGACCCTAAACCTCGTCTCCGGTGGACAACTGAGCTTCATGAGAGAT 
TCGTTGACGCCGTTACTCAGCTCGGTGGTCCTGACAAAGCGACTCCCAAAACTATTATGA 
GAACAATGGGAGTGAAGGGTCTCACTCTCTACCACCTCAAATCACATCTTCAGAAATTCC 
GCCTAGGGAGGCAAGCTGGCAAAGAATCAACTGAGAACTCTAAAGATGCTTCTTGTGTAG
GGGAGAGTCAGGACACAGGTTCATCTTCGACATCATCAATGAGAATGGCGCAGCAGGAGC 
AGAACGAGGGTTACCAAGTCACCGAAGCTCTACGTGCTCAGATGGAAGTCCAAAGAAGAC 
TACACGATCAATTGGAGGTGCAACGGAGGCTCCAGCTGAGGATAGAGGCACAAGGAAAAT 
ACCTGCAATCGATTCTTGAAAAAGCTTGCAAGGCCTTTGACGAGCAAGCTGCTACTTTTG 
CTGGACTTGAGGCTGCTAGGGAAGAGCTATCAGAGCTAGCCATCAAAGTCTCCAATAGCT 
CTCAAGGAACATCAGTCCCGTACTTCGATGCAACAAAGATGATGATGATGCCATCGTTGT 
CAGAGCTTGCAGTAGCAATAGACAACAAAAACAACATCACAACCAACTGTTCAGTAGAAA 
GCTCTCTGACTTCCATCACACATGGGAGCTCTATATCTGCTGCATCAATGAAGAAGCGTC 
AACGTGGAGACAATTTGGGCGTAGGGTATGAATCAGGCTGGATTATGCCTAGTAGCACCA 
TTGGATAAAGTTTAGGAGAGGGAAAAAGTTCATTATGGGAAAGGTAGAGATAAGATTTAA 
CTGTTCTTTACTTGCTTTGAGGGGCCTGCGGCCGCT 

>G1959 Amino Acid Sequence (conserved domain in AA coordinates : 46- 97) 

MYSAIRSLPLDGGHVGGDYHGPLDGTNLPGDACLVLTTDPKPRLRWTTELHERFVDAVTQ 

LGGPDKATPKTIMRTMGVKGLTLYHLKSHLQKFRLGRQAGKESTENSKDASCVGESQDTG 

SSSTSSMRMAQQEQNEGYQVTEALRAQMEVQRRLHDQLEVQRRLQLRIEAQGKYLQSILE 

KACKAFDEQAATFAGLEAAREELSELAIKVSNSSQGTSVPYFDATKMMMMPSLSELAVAI 

DNKNNITTNCSVESSLTSITHGSSISAASMKKRQRGDNLGVGYESGWIMPSSTIG*

>G2180 (1..1440) 

ATGGCTCCTGTCTCGTTACCTCCAGGTTTCCGATTCCATCCAACAGACGAGGAACTAATT 
ACTTACTATCTAAAAAGAAAGATCAACGGTCTAGAAATCGAACTTGAAGTTATCGCTGAA 
GTTGATCTTTACAAGTGTGAGCCATGGGACTTACCAGGGAAGTCCTTGCTTCCGAGCAAA 
GACCAAGAATGGTACTTCTTCAGCCCACGAGACCGGAAGTATCCCAACGGCTCAAGGACA 
AACCGGGCAACTAAAGGCGGTTATTGGAAGGCTACAGGTAAAGACCGCCGAGTTAGTTGG 
AGAGACCGAGCCATAGGAACCAAGAAGACATTGGTTTACTACCGTGGGCGCGCGCCACAT 
GGCATAAGAACTGGTTGGGTCATGCACGAATATCGACTTGATGAAACAGAATGTGAGCCT 
TCTGCATACGGCATGCAGGACGCATATGCACTTTGTCGTGTGTTCAAAAAGATTGTTATT 
GAAGCTAAGCCAAGAGATCAACATCGGTCATATGTCCACGCGATGTCGAATGTGAGTGGT 
AATTGCTCATCGAGTTTTGACACTTGTOCGGATCTC 

GTTCA7^CACATTCC^CCGCGAOTTGGCAACGAGCGATTTAACTCCAACGCAATCAGC 
AACGAGGATTGGTCACAATACTACGGTTCTTCTTATAGACCGTTCCCTACTCCATATAAG 
GTTAACACAGAGATCGAATGTTGAATGTTACAACACAATATATATCTACCACCGTTGCGT 
GTAGAGAACTCTGCGTTTAGTGATTCCGATTTCTTCACGAGTATGACTCACAACAACGAC 
CATGGCGTTTTCGATGACTTTAC1TTTGCTGCAAGTAACTCCAACCACAATAATAGCGTT 
GGTGATCAAGTGATCCACGTTGGCAATTATGATGAACAATTAATAACATCTAACCGTCAT 
ATGAACCAGACTGGTTATATAAAAGAGCAGAAGATCAGATCGAGTTTGGATAATACTGAC 
GAAGATCCAGGATTTCATGGTAACAATACCAATGACAACATAGATATCGATGATTTTCTC 
TCGTTTGATATATATAACGAGGACAACGTGAATCAAATAGAAGATAATGAAGACGTGAAT 
ACAAATGAAACCCTTGATTCATCGGGATTCGAGGTGGTTGAAGAAGAAACTAGATTTAAC 
AACCAAATGCTCATCTCGACATATCAAACGACAAAGATTCTATATCACCAAGTCGTACCT 
TGTCACACGTTGAAAGTTCACGTCAATCCTATTAGTCACAATGTGGAAGAGAGAACATTG 
TTCATTGAAGAGGACAAAGATTCTTGGTTACAAAGAGCTGAGAAGATCACGAAGACAAAA

>G2180 Amino Acid Sequence (conserved domain in AA coordinates :7-156) 

MAPVSLPPGFRFHPTDEELITYYLKRKINGLEIELEVIAEVDLYKCEPWDLPGKSLLPSK 

DQEWYFFSPRDRKYPNGSRTNRATKGGYWKATGKDRRVSWRDRAIGTKKTLVYYRGRAPH 

GIRTGWVMHEYRLDETECEPSAYGMQDAYALCRVFKKIVIEAKPRDQHRSYVHAMSNVSG 

NCSSSFDTCSDLEISSTTHQVQNTFQPRFGNERFNSNAISNEDWSQYYGSSYRPFPTPYK 

VNTEIECSMLQHNIYLPPLRVENSAFSDSDFFTSMTHNNDHGVFDDFTFAASNSNHNNSV

GDQVIHVGNYDEQLITSNRHMNQTGYIKEQKIRSSLDNTDEDPGFHGNNTNDNIDIDDFL 

SFDIYNEDNVNQIEDNEDVNTNETLDSSGFEVVEEETRFNNQMLISTYQTTKILYHQVVP

CHTLKVHVNPISHNVEERTLFIEEDKDSWLQRAEKITKTKLTLFSLMAQQYYKCLAIFF*

>G1817 (1..1308)

ATGAAGGACGCAGAGAAGCGAGAGGTGATTGCATCATCATCATTACAAAGAAAGAGAAAC
AGAGGAAGAAGACTAAGGAAAAGAAGAAGAAGAAACGAGAAGCGAGTACTAATGGTTCCA 
TCATCATTACCAAACGACGTGCTAGAGGAGATCTTTTTAAGATTTCCGGTTAAAGCCCTA 
ATCCGACTCAAGTCTCTCTCGAAACAATGGAGATCGACGATCGAATCTCGCAGTTTTGAA 
GAGAGACACTTGACGATCGCTAAGAAAGCCTTCGTGGATCATCCCAAGGTCATGCTCGTA 
GGAGAAGAAGATCCCAT7VAGAGGAACCGGGATTCGTCCAGACACTGACATTGGTTTTAGG 
TTATTCTGCTTGGAATCGGCTTCTCTTCTATCCTTTACTCGTCTCAATTTCCCTCAAGGG 
TTCTTCAACTGGATCTACATATCTGAAAGCTGTGATGGCCTTTTCTGCATCCATTCCCCA 
AAATCACATTCCGTATATGTAGTGAATCCGGCTACACGGTGGCTCCGCCTACTTCCTCCG 
GCAGGGTTTCAGATTTTGATCCACAAGTTTAACCCCACTGAACGTGAGTGGAATGTAGTG 
ATGAAATCAATCTTTCATCTAGCATTCGTGAAGGCCACCGATTACAAATTAGTGTGGTTG 
TACAATTGTGATAAGTACATTGTTGATGCGTCGAGTCCAAACGTGGGAGTCACAAAGTGC 
GAGATTTTTGACTTTAGGAAAAATGCTTGGAGGTACTTGGCTTGCACTCCAAGTCATCAG 
ATATTCTATTACCAAAAGCCAGCATCTGCAAACGGGTCGGTTTATTGGTTTACAGAACCA 
TATAATGAAAGAATCGAAGTAGTGGCTTTTGATATTCAGACCGAAACATTCCGGTTGCTG 
CCTAAGATTAATCCGGCTATTGCTGGTTCAGATCCTCACCATATTGACATGTGCACTCTG 
GATAATAGTTTGTGTATGTCGAAAAGGGAGAAAGATACTATGATCCAAGATATTTGGAGG 
TTGAAACCATCAGAAGACACATGGGAAAAGATTTTTAGCATAGACTTGGTTTCCTGTCCT
TCTTCTCGGACTGAGAAGCGTGATCAATTTGATTGGAGCAAGAAGGATAGGGTTGAGCCA 
GCCACACCCGTCGCGGTTTGTAAGAATAAGAAGATCCTTCTCTCACATCGCTATTCCCGA 
GGTTTGGTAAAGTACGATCCCCTAACAAAATCTATCGATTTTTTTTCCGGACATCCTACC 
GCTTACAGAAAAGTTATTTATTTTCAAAGTTTGATATCTCATCTATAA 

>G1817 Amino Acid Sequence (conserved domain in AA coordinates : 47-331) 
MKDAEKREVIASSSLQRKRNRGRRLRKRRRRNEKRVLMVPSSLPNDVLEEIFLRFPVKAL

IRLKSLSKQWRSTIESRSFEERHLTIAKKAFVDHPKVMLVGEEDPIRGTGIRPDTDIGFR

LFCLESASLLSFTRLNFPQGFFNWIYISESCDGLFCIHSPKSHSVYVVNPATRWLRLLPP

AGFQILIHKFNPTEREWNVVMKSIFHLAFVKATDYKLVWLYNCDKYIVDASSPNVGVTKC

EIFDFRKNAWRYLACTPSHQIFYYQKPASANGSVYWFTEPYNERIEVVAFDIQTETFRLL

PKINPAIAGSDPHHIDMCTLDNSLCMSKREKDTMIQDIWRLKPSEDTWEKIFSIDLVSCP 

SSRTEKRDQFDWSKKDRVEPATPVAVCKNKKILLSHRYSRGLVKYDPLTKSIDFFSGHPT

AYRKVIYFQSLISHL* 

>G1649 (61.. 1311) 

ATTCACAAAAACCGGAAAAAAAAAAAGACAAGTAAAGAAAGCTTTGTTCAGTTTACTTCA 
ATGGAAGCAAAACCCTTAGCATCATCATCATCTGAACCAAACATGATTTCTCCATCATCA 
AACATTAAACCAAAATTATUUVGATGAAGATTATATGGAGCTGGTGTGTGAAAATGGGCAG 
ATTCTTGCAAAGAT^CGAAGACCAAAGAACAACGGTTCTTTTCAAAAGCAACGTAGGCAA 
TCTCTCCTGGATTTGTATGAGACCGAGTAGAGCGAG 

CTTGGAGACACAC^GTTGTTCCGGTGAGTC^GTCTAAGCCAC^CAAGATAAAGAAACC 
AATGAACAAATGAACAACAATAAGAAGAAGCTAAAGTCCTCCAAAATCGAATTTGAGAGA 
AATGTTTCGAAAAGCAACAAATGTGTTGAATCATCAACATTAATTGATGTTTCTGCTAAA 
GGTCCAAAGAATGTTGAAGTTACTACAGCTCCTCCTGATGAGCAATCTGCAGCTGTTGGT 
AGATCCACGGAATTGTATTTTGCTTCirCATCGAAGTTTTCTCGAGGAACTTCGAGAGAT 
CTAAGTTGTTGTTCTTTAAAGAGGAAGTATGGAGATATTGAAGAAGAAGAATCAACCTAT 
TTAAGTAATAATTCAGATGATGAATCAGATGATGCGAAGACACAAGTTCATGCGAGAACA 
AGAAAGCCGGTGACTAAAAGAAAACGAAGCACAGAAGTCCATAAGTTATATGAAAGAAAA 
CGAAGA6ATGMTTCAACAAGAAAATGCGTGCTTTGCAGGACCTACTACCAAATTGTTAC
AAGGATGATAAGGCTTCATTGTTGGATGAGGCTATCAAATATATGCGGACCCTTCAACTT 
C7VAGTTCAGATGATGAGTATGGGAAATGGATTMTAAGACCACCTACGATGTTGCC7VATG 
GGTCATTACTCTCCCATGGGTCTAGGAATGCATATGGGTGCAGCAGCAACACCAACATCA 
ATACCGCAATTCCTGCCTATGAATGTTCAAGCAACCGGTTTTCCGGGGATGAACAATGCA 
CCACCACAAATGCTAAGCTTTCTTAATCACCCAAGTGGACTAATTCCAAACACTCCTATC 
TTTTCTCCATTGGAAAATTGCTCTCAGCCATTCGTGGTGCCTTCGTGTGTTTCTCAGACT 
CAGGCTACTTCTTTTACTCAATTCCCAAAGTCTGCGTCCGCCTCAAACTTAGAAGATGCA 
ATGCAATATAGAGGAAGCAACGGTTTTAGTTATTATCGCTCGCCAAACTAATGATTTGTA 
GAAAGTTGATGTTTTCTCCAACTAACTAACTTTAAGCAAA7VAAAAATGATCGTCTACTCT 
GTGTTGTTAGTCTATGGGCTTTTGGGCCTTGATTCTTGGAACGATTTGAACTTAATTCCA 
ACTATTTTCAAAGTGGATGTACAAAGTAAAA 

>G1649 Amino Acid Sequence (conserved domain in AA coordinates : 225-295 ) 

MEAKPLASSSSEPNMISPSSNIKPKLKDEDYMELVCENGQILAKIRRPKNNGSFQKQRRQ 

SLLDLYETEYSEGFKKNIKILGDTQVVPVSQ 

NVSKSNKCVESSTLIDVSAKGPKNVEVTTAPPDEQSAAVGRSTELYFASSSKFSRGTSRD 

LSCCSLKRKYGDIEEEESTYLSNNSDDESDDAKTQVHARTRKPVTKRKRSTEVHKLYERK

RRDEFNKKMRALQDLLPNCYKDDKASLLDEAIKYMRTLQLQVQMMSMGNGLIRPPTMLPM 

GHYSPMGLGMHMGAAATPTSIPQFLPMNVQATGFPGMNNAPPQMLSFLNHPSGLIPNTPI 

FSPLENCSQPFVVPSCVSQTQATSFTQFPKSASASNLEDAMQYRGSNGFSYYRSPN*

>G2131 (69..1010)

GTCTCTCATTTTCATAATTCCATTTTCAGGATTGTCTCTCAATCTTTTATTCTTCTCATT 
CACCGGTAATGGCAAAAGTCTCTGGGAGGAGCAAGAAAACAATCGTTGACGATGAAATCA 
GCGATAAAACAGCGTCTGCGTCTGAGTCTGCGTCCATTGCCTTAACATCCAAACGCAAAC 
GTAAGTCGCCGCCTCGAAACGCTCCTCTTCAACGCAGCTCCCCTTACAGAGGCGTCACAA 
GGCATAGATGGACTGGGAGATACGAAGCGCATTTGTGGGATAAGAACAGCTGGAACGATA 
CACAGACCAAGAAAGGACGTCAAGTTTATCTAGGGGCTTACGACGAAGAAGAAGCAGCAG 
CACGTGCCTACGACTTAGCAGCATTGAAGTACTGGGGACGAGACACACTCTTGAACTTCC 
CTTTGCCGAGTTATGACGAAGACGTCAAAGAAATGGAAGGCCAATCCAAGGAAGAGTATA
TTGGATCATTGAGAAGAAAAAGTAGTGGATTTTCTCGCGGTGTATCAAAATACAGAGGCG 
TTGCAAGGCATCACCATAATGGGAGATGGGAAGCTAGAATTGGAAGGGTGTTTGGTAATA 
AATATCTATATCTTGGAACATACGCCACGCAAGAAGAAGCAGCAATCGCCTACGACATCG 
CGGCAATAGAGTACCGTGGACTTAACGCCGTTACCAATTTCGACGTCAGCCGTTATCT7VA 
ACCCTAACGCCGCCGCGGATAAAGCCGATTCCGATTCTAAGCCCATTCGAAGCCCTAGTC 
GCGAGCCCGAATCGTCGGATGATAACAAATCTCCGAAATCAGAGGAAGTAATCGAACCAT 
CTACATCGCCGGAAGTGATTCCAACTCGCCGGAGCTTCCCCGACGATATCCAGACGTATT 
TTGGGTGTCAAGATTCCGGCAAGTTAGCGACTGAGGAAGACGTAATATTCGATTGTTTCA 
ATTCTTATATAAATCCTGGCTTCTATAACGAGTTTGATTATGGACCTTAATCGTATTTTC 
TACAAGTTTTGTTTTGATTATCTACACAATACATCAATATATTCT 

>G2131 Amino Acid Sequence (conserved domain in AA coordinates : 50-186, 112-183) 
MAKVSGRSKKTIVDDEISDKTASASESASIALTSKRKRKSPPRNAPLQRSSPYRGVTRHR
WTGRYEAHLWDKNSWNDTQTKKGRQVYLGAYDEEEAAARAYDLAALKYWGRDTLLNFPLP

SYDEDVKEMEGQSKEEYIGSLRRKSSGFSRGVSKYRGVARHHHNGRWEARIGRVFGNKYL

YLGTYATQEEAAIAYDIAAIEYRGLNAVTNFDVSRYLNPNAAADKADSDSKPIRSPSREP

ESSDDNKSPKSEEVIEPSTSPEVIPTRRSFPDDIQTYFGCQDSGKLATEEDVIFDCFNSY 

INPGFYNEFDYGP* 

>G215 (1..1110) 

ATGACTCGTCGGTGTTCGCATTGTAGCAACAATGGGCACAATTCACGCACGTGTCCAACG 
CGTGGGTCTGGTTCCTCCTCCGCCGTGAAGTTATTTGGTGTGAGGTTAACGGATGGCTCG 
ATTATTAAAAAGAGTGCGAGTATGGGTAATCTCTCGGCATTGGCTGTTGCGGCGGCGGCG 
GCAACGCACCACCGTTTATCTCCGTCGTCTCCTCTGGCGACGTCAAATCTTAATGATTCG 
CCGTTATCGGATCATGCCCGATACTCTAATTTGCATCATAATGAAGGGTATTTATCTGAT
GATCCTGCTCATGGTTCTGGGTCTAGTCACCGTCGTGGTGAGAGGAAGAGAGGTGTTCCT 
TGGACTGAAGAGGAACATAGACTATTCTTAGTCGGTCTTCAGAAACTCGGGAAAGGAGAT 
TGGCGCGGTATTTCGAGAAACTATGTAACGTCAAGAACTCCTACACAAGTGGCTAGTCAT 
GCTCAAAAGTATTTTATTCGACATACTAGTTCAAGCCGCAGGAAAAGACGGTCTAGCCTC 
TTCGACATGGTTACAGATGAGATGGTAACCGATTCATCGCCAACACAGGAAGAGCAGACC 
TTAAACGGTTCCTCTCCAAGCAAGGAACCTGAAAAGAAAAGCTACCTTCCTTCACTTGAG
CTCTCACTCAATAATACCACAGAAGCTGAAGAGGTCGTAGCCACGGCGCCACGACAGGAA 
AAATCTCAAGAAGCTATAGAACCATCAAATGGTGTTTCACCAATGCTAGTCCCGGGTGGC 
TTCTTTCCTCCTTGTTTTCCAGTGACTTACACGATTTGGCTCCCTGCGTCACTTCACGGA 
ACAGAACATGCCTTAAACGCTGAGACTTCTTCTCAGCAGCATCAGGTCCTAAAACCAAAA 
CCTGGATTTGCTAAAGAACGTGTGAACATGGACGAGTTGGTCGGTATGTCTCAGCTTAGC 
ATAGGAATGGCGACAAGACACGAAACCGAAACTTCCCCTTCCCCGCTATCTTTGAGACTA 

GAGCCCTCAAGGCCATCAGCGTTTCACTCGAATGGCTCGGTTAATGGTGCAGATTTGAGT 

AAAGGCAACAGCGCGATTCAGGCTATCTAA

>G215 Amino Acid Sequence (domain in AA coordinates: TBD)

MTRRCSHCSNNGHNSRTCPTRGSGSSSAVKLFGVRLTDGSIIKKSASMGNLSALAVAAAA 

ATHHRLSPSSPLATSNLNDSPLSDHARYSNLHHNEGYLSDDPAHGSGSSHRRGERKRGVP 

WTEEEHRLFLVGLQKLGKGDWRGISRNYVTSRTPTQVASHAQKYFIRHTSSSRRKRRSSL

FDMVTDEMVTDSSPTQEEQTLNGSSPSKEPEKKSYLPSLELSLNNTTEAEEVVATAPRQE

KSQEAIEPSNGVSPMLVPGGFFPPCFPVTYTIWLPASLHGTEHALNAETSSQQHQVLKPK 

PGFAKERVNMDELVGMSQLSIGMATRHETETSPSPLSLRLEPSRPSAFHSNGSVNGADLS

KGNSAIQAI*

>G1508 (1..420)

ATGCTAGATCACAGTGAAAAGGTCTTATTGGTTGATTCAGAAACCATGAAAACAAGAGCT 
GAAGATATGATCGAACAGAACAACACTAGTGTTAACGACAAGAAGAAGACTTGTGCTGAT 
TGTGGAACCAGTAAAACTCCTCTTTGGCGTGGTGGTCCTGTTGGTCCAAAGTCGTTGTGT 
AACGCGTGTGGGATCAGAAACAGAAAGAAGAGAAGAGGAGGAACAGAAGATAATAAGAAA 
TTAAAGAAATCGAGTTCTGGCGGCGGAAACCGTAAATTTGGTGAATCGTTAAAACAGAGT 
TTGATGGATTTGGGGATAAGGAAGAGATCAACGGTGGAGAAGCAACGACAGAAGCTTGGT 
GAAGAAGAACAAGCCGCTGTGTTACTCATGGCTCTTTCTTATGGCTCTGTTTACGCTTAG 
>G1508 Amino Acid Sequence (domain in AA coordinates: 38-63) 

MLDHSEKVLLVDSETMKTRAEDMIEQNNTSVMDKKKTCADCGTSKTPLWRGGPVGPKSLC 

NACGIRNRKKRRGGTEDNKKLKKSSSGGGNRKFGESLKQSLMDLGIRKRSTVEKQRQKLG 
EEEQAAVLLMALSYGSVYA* 

>G2110 (36..1622)

GAGAGCTAATAAAAAATTTATCAAAGAAGACTAATATGGAGAAGGACGATTTCTTGAGGA 
GTGGTCATGGAAGAGAAGAAAGCCATGATGAGATGAGAAAACTTGATTCATCTCACGATT 
ATTCTCATCAAGAACACGACCATATTATAAGATCCAAGTTGGACTCAACTAAAGTCGAAA 
TGGATGAGGCTAAAGAGGAAAATCGAAGACTAAAGTCATCATTGAGTAAAATCAAGAAAG 
A^GACATCCTTCAAACACAATACAACCAAWAATGGCCAAACATAACGAACCAACCA 
AGTTCCAATCAAAAGGGCATCATCAAGACAAAGGCGAAGATGAAGACAGAGAAAAAGTTA 
ACGAACGTGAAGAACTTGTCTCGTTGAGCCTAGGCAGACGGTTAAATTCAGAGGTTCCAA 

TGAGTAATCCTAATGAGAAGTTAGAGATTGATCATAATCAAGAAACCATGTCGTTGGAGA 
^AGTAACAATAATAAGATCAGATCAOU^TAGTTTTGGGTTTAAGAATGATGGAGATG 

TGAGATCAAGATGTGAGACACCAACGATGAACGACGGATCTCAATGGAGGAAATATGGCC 

AGAAAATAGCTAJ^GGCAATCCATGTCCCCGAGCTTACTATCGTTGCACCATTGCAGCT^ 

CTOGTCCAGTAAGAAAACAGGTGCAAAGATGTTCAGAAGATATGTCTATACTTATCTCAA 

CGTACGAAGGAACACATAACCATCCACTTCCCATGTCAGCAACTGCCATGGCCTCTCCCA 
OTCCGCTGCCGCCTCCATGCTTC^^ 

AtotcATGGCCTTAACTTCTCTCTTTCCGGCAACAACATCACTCCAAAACCTA^CTC 
A^CCTCCAATCCCCTTCTTCTTCTGGCCATCCGACCGTCACTCTCGACCTCA^AACCt 
CCTCCTCGTCGCAGCAACCGTTCTTATCAATGCTC^TAGATTCAGCTCTCCTC^S 

atgtctcacgatctaatagttatccttcaaccaatctcaacttttcaaacaacacca^ 
catogatgaattggggtggtggtggtaatcccagtgatcaataccgtgcagcSacSca 
acattaacacccatcagcaatcaccttaccacaaaatcattcaaacccgaaSc^^^ 
catctttcgatccgtttggaagatcatcttcatcacattctccacWta^tc 

ATATCGGAATCAAGAACATCATCAGTCACCAAGTGCCATCTTTAC^cSJSiS 
AGGCAATCACC^CAC^TCCAAGTTTCCAATCGGCTTTCGCGACAG^ 

TGGGCGGCGArrTAAAGAirGATCACAATGTGACTAGAAATGAAGCTSSSIS 
AAAGAGAATTGTTATATATATGTTCTTATATACTCAGTACATTGGTAAATGGGTTTAGAC

TTTCACTAGTTTCCTAGTTCATCTATATATTGGTTGTTTAATCACAAGTTTATTTTGTTG 

TTGGAGTTTATGGAACTAATGTGTACATATGAAACTTTAGAACGAATAAATAAAACTTGG 
AATTCCTTTTTAAAAAAAAAAAAAAAAA 

>G2110 Amino Acid Sequence (conserved domain in AA coordinates:239-298) 

MEKDDFLRSGHGREESHDEMRKLDSSHDDSHQEHDHIIRSKLDSTKVEMDEAKEENRRLK

SSLSKIKKDFDILQTQYNQLMAKHHEPTKFQSKGHHQDKGEDEDREKVNEREELVSLSLG

RRLNSEVPSGSNKEEKNKDVEEAEGDRNYDDNEKSSIQGLSMGIEYKALSNPNEKLEIDH 

NQETMSLEISNNNKIRSQNSFGFKNDGDDHEDEDEILPQNLVKKTRVSVRSRCETPTMND 

GCQWRKYGQKIAKGNPCPRAYYRCTIAASCPVRKQVQRCSEDMSILISTYEGTHNHPLPM 

SATAMASATSAAASMLLSGASSSSSAAADLHGLNFSLSGNNITPKPKTHFLQSPSSSGHP 

TVTLDLTTSSSSQQPFLSMLNRFSSPPSNVSRSNSYPSTNLNFSNNTNTLMNWGGGGNPS 

DQYRAAYGNINTHQQSPYHKIIQTRTAGSSFDPFGRSSSSHSPQINLDHIGIKNIISHQV 

PSLPAETIKAITTDPSFQSALATALSSIMGGDLKIDHNVTRNEAEKSP*

>G2442 (71.. 997) 

TCGACCAATTTAGACCATTCCAAATTCGTCGTCCTTTTCTCTGTGTAGTCTAATTATATA 
TTACAAGTAGATGAATTGGTTACCTGAAGCTGAAGCTGAGGAGCACTTGAAAGGTATTCT 
CTCTGGTGATTTCTTTGATGGTCTCACCAATCACCTTGATTGCCCACTTGAAGACATCGA 
TTCCACCAATGGTGAGGGAGATTGGGTCGCCAGGTTTCAAGACCTTGAGCCTCCTCCCTT 
GGATATGTTCCCTGCTTTGCCTTCTGACCTCACCTCTTGTCCCAAGGGCGCCGCTCGTGT 
GCGGATTCCCAACAACATGATTCCTGCTTTGAAGCAGTCCTGTTCTTCTGAAGCCTTGTC 
CGGCATTAATAGCACTCCCCACCAATCTTCAGCTCCTCCTGATATCAAAGTTTCATATCT 
ATTTCAGTCTCTAACTCCAGTGTCAGTTCTCGAGAACAGTTATGGTTCTCTCTCCACCCA 
AAACTCCGGATCTCAGAGATTGGCTTTCCCTGTGAAAGGCATGAGAAGCAAGCGCAGACG 
CCCCACAACAGTGAGACTTAGCTACCTTTTCCCCTTTGAACCCAGAAAGTCAACTCCGGG 
TGAATCAGTAACCGAGGGTTACTATTCTTCTGAGCAACATGCCAAGAAGAAGCGCAAGAT 
TCATCTGATCACCCACACCGAGTCTTCCACTTTGGAGTCAAGTAAGTCGGATGGGATAGT 
CCGGATATGCACTCATTGTGAGACAATCACGACCCCACAGTGGAGGCAAGGACCCAGTGG 
ACCCAAGACCCTCTGCAACGCTTGCGGAGTCCGGTTCAAATCTGGTCGCCTAGTTCCAGA 
ATACCGGCCAGCCTCAAGCCCGACCTTCATCCCATCTGTGCATTCAAACTCACACAGGAA 
GATCATTGAGATGAGAAAGAAGGACGACGAGTTTGATACCAGCATGATTCGCAGTGATAT 
CCAGAAGGTAAAGCAGGGGAGGAAGAAAATGGTATAAAAGTA 

>G2442 Amino Acid Sequence (domain in aa coordinates: 220-246) 

MNWLPEAEAEEHLKGILSGDFFDGLTNHLDCPLEDIDSTNGEGDWVARFQDLEPPPLDMF 

PALPSDLTSCPKGAARVRIPNKMIPALKQSCSSEALSGINSTPHQSSAPPDIKVSYLFQS 

LTPVSVLENSYGSLSTQNSGSQRLAFPVKGMRSKRRRPTTVRLSYLFPFEPRKSTPGESV 

TEGYYSSEQHAKKKRKIHLITHTESSTLESSKSDGIVRICTHCETITTPQWRQGPSGPKT 

LCNACGVRFKSGRLVPEYRPASSPTFIPSVHSNSHRKIIEMRKKDDEFDTSMIRSDIQKV 
KQGRKKMV* 

>G1051 (66.. 1031) 

CCTGTAAATTCAGATTTGCTTTCTTTGGTAATCTTTTGGATCAAGATCCATCTATTTTTT 
CTTCAATGGCACAACTCCCTCCTAAAATCCCCAACATGACACAACATTGGCCTGATTTCT 
CTTCCCAAAAGCTCTCTCCTTTCTCTACCCCAACCGCAACCGCTGTCGCCACCGCTACAA 
CCACCGTACAAAACCCCTCATGGGTCGACGAATTCCTCGACTTCTCAGCGTCTCGCCGTG 
GCAACCACCGTCGTTCCATCAGCGACTCTATCGCATTCCTCGAAGCTCCAACAGTCAGCA 
TCGAAGACCACCAATTCGACAGGTTCGATGACGAACAGTTCATGTCGATGTTCACCGACG 
ACGACAACCTTCATAGCAATCCTTCCCATATCAACAACAAAAATAACAATGTGGGGCCCA 
CGGGATCTTCCTCGAACACATCCACGCCGTCCAATAGCTTCAACGACGATAACAAAGAAT 
TACCACCGTCCGATCATAACATGAACAATAATATCAACAACAACTATAACGATGAAGTCC 
AAAGCCAATGCAAGATGGAGCCAGAAGATGGTACGGCGTCGAATAACAATTCCGGTGATA 
GCTCCGGCAACCGGATTCTCGATCCCAAAAGGGTTAAGAGAATATTAGCAAATCGGCAAT 
CAGCACAGAGATCAAGGGTGAGGAAACTGCAATACATATCAGAGCTCGAACGTAGCGTCA 
CTTCGTTGCAGGCGGAAGTGTCAGTGTTATCGCCAAGAGTTGCATTCTTGGATCATCAAC 
GTTTGCTTCTTAACGTTGACAACAGCGCTCTCAAGCAACGAATCGCTGCTTTATCTCAAG 
ACAAGCTTTTCAAAGACGCACATCAAGAAGCATTGAAGAGAGAAATAGAGAGACTTCGAC 
AAGTGTATAATCAACAAAGCCTCACGAATGTGGAAAATGCAAATCATTTATCGGCGACCG 
GAGCCGGTGCTACTCCGGCCGTCGACATCAAGTCGTCCGTTGAAACAGAGCAGCTCCTCA 
ATGTCTCATAAATTAACCATCATGCATCATCATCAACATTTCTCTCTTTTAGCTTCTTGG
CAAAAGTTCTTGACTATAAAATCTCTTTCGGGTAAGAAATTCAGGAGATATACATTTTTT 
ATTCTAATCACATTGTTTTTAAGTTGTGATGAATTCAGTTTGATGTATCTTATTTATTTT 
GTTTATGTCGTCTTTTTTTCTTGGGGTTGATGGAAGGGAATCATCAATTGTTGTTTGTAC 
AAAGAACTAGTTGAATTTTTTTTTTTTTTTT 

>G1051 Amino Acid Sequence (domain in AA coordinates 189-250) 

MAQLPPKIPNMTQHWPDFSSQKLSPFSTPTATAVATATTTVQNPSWVDEFLDFSASRRGN

HRRSISDSIAFLEAPTVSIEDHQFDRFDDEQFMSMFTDDDNLHSNPSHINNKNNNVGPTG 

SSSNTSTPSNSFNDDNKELPPSDHNMNNNINNNYNDEVQSQCKMEPEDGTASNNNSGDSS 

GNRILDPKRVKRILANRQSAQRSRVRKLQYISELERSVTSLQAEVSVLSPRVAFLDHQRL 

LLNVDNSALKQRIAALSQDKLFKDAHQEALKREIERLRQVYNQQSLTNVENANHLSATGA

GATPAVDIKSSVETEQLLNVS*

>G1052 (138..1127)

TGATCATCTAAAACTTTCAATTTCTCTCTTGATCCTCACTTGAATTTTTTGTTGTTTCTC 
TCAAATCTTTGATCCTTTCCTTTGTTTTTCATTTGACCTCTTACAAAAAAATCTGGTGTG 
CCATTAAATCTTTATTAATGGCACAACTTCCTCCGAAAATCCCAACCATGACGACGCCAA 
ATTGGCCTGACTTCTCCTCCCAGAAACTCCCTTCCATAGCCGCAACGGCGGCAGCCGCAG 
CAACCGCTGGACCTCAACAACAAAACCCTTCATGGATGGATGAGTTTCTCGACTTCTCAG 
CGACTCGCCGTGGGACTCACCGTCGTTCTAT7\AGCGACTCCATTGCTTTCCTTGAACCAC 
CTTCCTCCGGCGTCGGAAACCACCACTTCGATAGGTTTGACGACGAGCAATTCATGTCCA 
TGTTCAACGACGACGTACACAACAATAACCACAATCATCATCATCATCACAGCATCAACG 
GCAATGTGGGTCCCACGCGTTCATCCTCCAACACCTCCACGCCGTCCGATCATAATAGCC 
TTAGCGACGACGACAACAACAAAGAAGCACCACCGTCCGATCATGATCATCACATGGACA 
ATAATGTAGCCAATCAAAACAACGCCGCCGGTAACAATTACAACGAATCAGACGAGGTCC 
AAAGCCAGTGCAAGACGGAGCCACAAGATGGTCCGTCGGCGAATCAAAACTCCGGTGGAA 
GCTCCGGTAATCGTATTCACGACCCTAAAAGGGTAAAAAGAATTTTAGCAAATAGGCAAT 
CAGCACAGAGATCAAGGGTGAGGAAATTGCAATACATATCAGAGCTTGAAAGGAGCGTTA 
CTTCATTGCAGACTGAAGTGTCAGTGTTATCGCCAAGAGTTGCGTTTTTGGATCATCAGC 
GATTGCTTCTCAACGTCGACAATAGTGCTATCAAGCAACGAATCGCAGCTTTAGCACAAG 
ATAAGATTTTCAAAGACGCTCATCAAGAAGCATTGAAGAGAGAAATAGAGAGACTTCGAC 
AAGTATATCATCAACAAAGCCTCAAGAAGATGGAGAATAATGTCTCCGATCAATCTCCGG 
CCGATATCAAACCGTCCGTTGAGAAGGAACAGCTCCTCAATGTCTAAAGCTGTTCGTTCA 
CTAAGATCTTTCTTTTCATGGCGAAAAGATTCTTGACTATAAAACCTCTTTGTGTCAAGA 
AATTAATTTATCAAAGAAGATGGCCTTTTTTATTTGATCTAATCACATTTTTTTAAGTTG 
TGATGAATTTGCTTTTGATGTATCTGTTTTTTTTTTTTTTTTTT

>G1052 Amino Acid Sequence (domain in AA coordinates 201-261) 
MAQLPPKIPTMTTPNWPDFSSQKLPSIAATAAAAATAGPQQQNPSWMDEFLDFSATRRGT 
HRRSISDSIAFLEPPSSGVGNHHFDRFDDEQFMSMFNDDVHNNNHNHHHHHSINGNVGPT
RSSSNTSTPSDHNSLSDDDNNKKAPPSDHDHHMDNNVANQNNAAGNNYNESDEVQSQCKT

EPQDGPSANQNSGGSSGNRIHDPKRVKRILANRQSAQRSRVRKLQYISELERSVTSLQTE
VSVLSPRVAFLDHQRLLLNVDNSAIKQRIAALAQDKIFKDAHQEALKREIERLRQVYHQQ
SLKKMENNVSDQSPADIKPSVEKEQLLNV* 
>G1079 (1..1995) 

ATGGGTTGTGCTGCTTCAAGAATTGATAATGAAGAAAAGGTTTTAGTGTC 
AAGAGGCTAATGAAAAAGTTATTAGGGTTCAGGGGAGAATTTGCAGATGCACAGTTGGCT 
TATCTTAGAGCTTTGAGGAACACTGGTGTTACTCTTAGGCAATTCACTGAGTCTGAGACC 
TTGGAGCTTGAAAACACTAGTTATGGTTTAAGTTTGCCTTTGCCTC 

ACATTGCCTCCTTCACCTCCACCACCTCCTCCATTTAGCCCGGATTTGAGAAATCCTGAG 
ACTAGTCATGACTTGGCTGATGAGGAGGAAGAGGGTGAAAATGATGGTGGTAATGATGGA 
AGTGGTGCAGCTCCTCCGCCTCCATTGCCGAATTCTTGGAACATTTGGAACCCTTTTGAG
TC^CTTGAGCTGCATAGTCATCCAAATGGTGACAATGTAGTTA(^CAAGTTGAACTGAAG 
AAGAAACAACAAATTCAGCAAGCTGAAGAGGAAGATTGGGCGGAGACGAAGTCTCAATTT 
GAGGAAGAAGATCAGCAACAAGAAGCAGGAGGTACTTGCCTTGATTTGAGTGTTCATCAA 
ATAGAGGCTGTTAGTGGCTGTAACATGAAGAAGCCACGTCGTCTGAAGTTTAAGCTGGGA 
GAAGTTATGGACGGTAACTCATCTATGACAAGCTGCTCCGGTAAAGATCTTGAGAAAACT 
CATGTGACTGATTGTAGAATCAGGAGGACCTTAGAAGGAATCATCAGAGAGTTGGATGAT 
TATTTTCTTAAAGCATCGGGTTGCGAGAAGGAGATAGCTGTGATAGTAGACATCAACAGT 
AGGGATACTGTTGATCCTTTCAGGTACCAGGAAACAAGAAGGAAGAGAAGCAGCTCGGCA

AAGGTATTCAGTGCATTGTCATGGAGTTGGTCTTCAAAGTCTCTTCAGTTGGGCAAAGAT 

GCTACAACAAGCGGGACTGTTGAACCCTGTAGGCCTGGAGCTCACTGCAGCACACTTGAG 

AAGCTATACACAGCTGAGAAGAAACTTTACCAGCTAGTCAGAAACAAAGAGATTGCCAAA 

GTGGAGCATGAGAGGAAGTCTGCATTACTGCAAAAGCAAGATGGGGAAACCTATGATTTG 

AGCAAAATGGAGAAAGCACGCTTGTCTTTGGAGAGTTTGGAAACCGAGATACAGCGTCTA 

GAAGATTCCATAACTACAACACGCTCATGTTTGCTTAACTTGATCAATGATGAGCTGTAT 

CCGCAGCTAGTTGCTTTAACTTCAGGGCTAGCACAGATGTGGAAAACAATGCTCAAGTGT 

CATCAAGTTCAAATTCATATATCCCAGCAACTGAACCATCTTCCGGATTACCCGAGTATA 

GATCTCAGTTCGGAATACAAACGCCAGGCGGTTAATGAACTAGAGACCGAGGTTACTTGC 

TGGTACAATAGCTTTTGCAAGTTAGTAAATTCCCAGCGAGAATACGTGAAAACACTCTGT 

ACGTGGATCCAACTTACTGATCGCCTCTCTAACGAAGACAACCAAAGAAGTAGCTTGCCT 

GTTGCTGCTCGTAAGCTCTGCAAAGAGTGGCAGCTTGAATACAACCTGCGTAGGAAATGC 

AATAAACTTGAGAGGAGGCTTGAGAAAGAGCTAATTTCACTGGCTGAGATTGAAAGAAGG 

CTCGAGGGGATTTTAGCAATGGAAGAGGAGGAAGTAAGCTCAACGAGTTTGGGCTCTAAG 

CATCCGTTGTCAATCAAACAAGCCAAGATCGAAGCCTTGAGAAAACGAGTGGATATTGAG 

AAAACTAAGTACTTAAACTCGGTCGAGGTTAGTAAGAGAATGACACTAGACAACCTCAAA 

TCAAGCCTTCCCAATGTCTTTCAGATGTTGACTGCTCTAGCTAATGTCTTTGCCAATGGG 

TTTGAATCCGTTAATGGCCAAACCGGTACAGATGTTTCCGACACATCCCAACATTCCGAT 
GAATCTCAACCCTAA 

>G1079 Amino Acid Sequence (conserved domain in AA coordinates: 1-50) 

MGCAASRIDNEEKVLVCRQRKRLMKKLLGFRGEPADAQLAYLRALRNTGVTLRQFTESET 

LELENTSYGLSLPLPPSPPPTLPPSPPPPPPFSPDLRNPETSHDLADEEEEGENDGGNDG

SGAAPPPPLPNSWNIWNPFESLELHSHPNGDNVVTQVELKKKQQIQQAEEEDWAETKSQF

EEEDEQQEAGGTCLDLSVHQIEAVSGCNMKKPRRLKFKLGEVMDGNSSMTSCSGKDLEKT

HVTDCRIRRTLEGIIRELDDYFLKASGCEKEIAVIVDINSRDTVDPFRYQETRRKRSSSA 

KVFSALSWSWSSKSLQLGKDATTSGTVEPCRPGAHCSTLEKLYTAEKKLYQLVRNKEIAK

VEHERKSALLQKQDGETYDLSKMEKARLSLESLETEIQRLEDSITTTRSCLLNLINDELY 

PQLVALTSGLAQMWKTMLKCHQVQIHISQQLNHLPDYPSIDLSSEYKRQAVNELETEVTC

WYNSFCKLVNSQREYVKTLCTWIQLTDRLSNEDNQRSSLPVAARKLCKEWQLEYNLRRKC 

NKLERRLEKELISLAEIERRLEGILAMEEEEVSSTSLGSKHPLSIKQAKIEALRKRVDIE 

KTKYLNSVEVSKRMTLDNLKSSLPNVFQMLTALANVFANGFESVNGQTGTDVSDTSQHSD 
ESQP* 

>G1335 (56..667)

TTTTTTTTTAAAAGATTTAGAGAGAAAAGTGAGTTATTAAGAGATTCCAATCAAAATGAG 

CGGAGACAACGGCGGTGGTGAGAGGCGCAAAGGCTCCGTCAAGTGGTTTGATACCCAGAA 

GGGTTTCGGCTTCATCACTCCTGACGACGGTGGCGACGATCTCTTCGTTCACCAGTCCTC 

CATCAGATCTGAGGGTTTCCGTAGCCTCGCTGCCGAAGAAGCCGTAGAGTTCGAGGTTGA 

GATCGACAACAACAACCGTCCCAAGGCCATCGATGTTTCTGGACCCGACGGCGCTCCCGT 

CCAAGGAAACAGCGGTGGTGGTTCATCTGGCGGACGCGGCGGTTTCGGTGGAGGAAGAGG 

AGGTGGACGCGGATCTGGAGGTGGATACGGCGGTGGCGGTGGTGGATACGGAGGAAGAGG 

AGGTGGTGGTCGAGGAGGCAGCGACTGCTACAAGTGTGGTGAGCCCGGTCACATGGCGAG 

AGACTGTTCTGAAGGCGGTGGAGGTTACGGAGGAGGCGGCGGTGGCTACGGAGGTGGAGG 

CGGATACGGCGGAGGAGGTGGTGGTTACGGAGGTGGTGGCCGTGGAGGTGGTGGCGGCGG 

GGGAAGCTGCTACAGCTGTGGCGAGTCGGGACATTTCGCCAGGGATTGCACCAGCGGTGG

ACGTTAAAACCAACGCCGGTTACGCGGTGGAGAAGAGTGAGTTGGTTATCTCACAAGTGA 

TCGGTTCTTTCTCCCGCCGCCTTCTATCTCTCTATTATCCACTTTTTGCTTATTATGATG 
GATCTCTATOTTGOTAGTTGGTTTTra 

GTTITGCTACTTATGGTTGGTTTTATTTATGGTACTTGTGATATGGGTGAMTGCTCTAC 
^TGTOCTCTGTTTCAAGTC^^ 

>G1335 Amino Acid Sequence (domain in AA coordinates: 24-43, 131-144, 185-203)

MSGDNGGGERRKGSVKWFDTQKGFGFITPDDGGDDLFVHQSSIRSEGFRSLAAEEAVEFE

VEIDNNNRPKAIDVSGPDGAPVQGNSGGGSSGGRGGFGGGRGGGRGSGGGYGGGGGGYGG

RGGGGRGGSDCYKCGEPGHMARDCSEGGGGYGGGGGGYGGGGGYGGGGGGYGGGGRGGGG 
GGGSCYSCGESGHFARDCTSGGR* 

>G157 (31..621)
GGGCATAACCCTTATCGGAGATTTGAAGCCATGGGAAGAAGAAAAATCGAGATCAAGCGA

ATCGAGAACAAAAGCAGTCGACAAGTCACTTTCTCCAAACGACGCAATGGTCTCATCGAC 

AAAGCTCGACAACTTTCGATTCTCTGTGAATCCTCCGTCGCTGTTGTCGTCGTATCTGCC 

TCCGGAAAACTCTATGACTCTTCCTCCGGTGACGACATTTCCAAGATCATTGATCGTTAT 

GAAATACAACATGCTGATGAACTTAGAGCCTTAGATCTTGAAGAAAAAATTCAGAATTAT 

CTTCCACACAAGGAGTTACTAGAAACAGTCCAAAGCAAGCTTGAAGAACCAAATGTCGAT 

AATGTAAGTGTAGATTCTCTAATTTCTCTGGAGGAACAACTTGAGACTGCTCTGTCCGTA 

AGTAGAGCTAGGAAGGCAGAACTGATGATGGAGTATATCGAGTCCCTTAAAGAAAAGGAG 

AAATTGCTGAGAGAAGAGAACCAGGTTCTGGCTAGCCAGATGGGAAAGAATACGTTGCTG

GCAACAGATGATGAGAGAGGAATGTTTCCGGGAAGTAGCTCCGGCAACAAAATACCGGAG 

ACTCTCCCGCTGCTCAATTAGCCACCATCATCAACGGCTGAGTTTTCACCTTAAACTCAA 

AGCCTGATTCATAATTAAGAGAATAAATTTGTATATTATAAAAAGCTGTGTAATCTCAAA 

CCTTTTATCTTCCTCTAGTGTGGAATTTAAGGTCAAAAAGAAAACGAGAAAGTATGGATC 

AGTGTTGTACCTCCTTCGGAGACAAGATCAGAGTTTGTGTGTTTGTGTCTGAATGTACGG 

ATTGGATTTTTAAAGTTGTGCTTTCTTTCTTCAAAAAAAAAAA 

>G157 Amino Acid Sequence (domain in AA coordinates: 2-57) 

MGRRKIEIKRIENKSSRQVTFSKRRNGLIDKARQLSILCESSVAVVVVSASGKLYDSSSG 

DDISKIIDRYEIQHADELRALDLEEKIQNYLPHKELLETVQSKLEEPNVDNVSVDSLISL

EEQLETALSVSRARKAELMMEYIESLKEKEKLLREENQVLASQMGKNTLLATDDERGMFP 
GSSSGNKIPETLPLLN*

>G1895 (1..954) 

ATGAATAACCAATCTGTTACTGACAATACAAGTCTTAAGCTGTCATCTAATCTTAACAAC 

GAGTCAAAAGAAACATCTGAGAACAGTGATGACCAACACAGCGAGATCACAACAATTACA 

TCGGAAGAAGAGAAAACAACTGAACTGAAGAAACCAGACAAGATTCTTCCATGTCCGAGA 

TGCAACAGCGCAGACACCAAATTCTGTTACTACAACAACTACAACGTTAACCAGCCACGT 

CACTTCTGTAGAAAATGCCAGAGGTATTGGACCGCTGGTGGATCCATGAGGATCGTCCCG 

GTTGGCTCAGGCCGTCGCAAGAACAAGGGATGGGTTTCTTCAGACCAGTACCTGCACATC 

ACTTCCGAGGATACTGACAATTACAATAGCTCCTCAACAAAGATTCTAAGCTTCGAGTCT 

TCGGACTCTTTGGTAACTGAGAGGCCTAAGCATCAATCAAACGAAGTGAAGATAAACGCT 

GAACCTGTTTCACAAGAACCCAACAACTTCCAAGGGTTACTTCCTCCCCAAGCATCCCCT 

GTTTCGCCTCCTTGGCCTTACCAATACCCTCCAAACCCTAGTTTCTACCACATGCCCGTC 

TACTGGGGCTGCGCGATACCGGTTTGGTCTACCCTCGACACTTCTACATGTCTTGGGAAA 

AGGACAAGAGACGAAACTTCTCATGAAACTGTTAAAGAGAGTAAAAATGCTTTTGAGAGA 

ACAAGCTTGCTTTTGGAATCTCAGAGCATCAAAAATGAAACAAGTATGGCTACAAATAAC 

CATGTGTGGTATCCAGTACCGATGACCCGCGAGAAGACACAAGAATTCAGCTTTTTCAGT 

AATGGAGCTGAAACAAAGAGCAGCAACAACAGATTCGTCCCTGAAACGTATCTTAACCTG 

CAAGCAAACCCTGCAGCCATGGCAAGATCTATGAACTTCAGAGAGAGCATATAA 

>G1895 Amino Acid Sequence (domain in AA coordinates: 55-110) 

MNNQSVTDNTSLKLSSNLNNESKETSENSDDQHSEITTITSEEEKTTELKKPDKILPCPR
CNSADTKFCYYNNYNVNQPRHFCRKCQRYWTAGGSMRIVPVGSGRRKNKGWVSSDQYLHI

TSEDTDNYNSSSTKILSFESSDSLVTERPKHQSNEVKINAEPVSQEPNNFQGLLPPQASP
VSPPWPYQYPPNPSFYHMPVYWGCAIPVWSTLDTSTCLGKRTRDETSHETVKESKNAFER
TSLLLESQSIKNETSMATNNHVWYPVPMTREKTQEFSFFSNGAETKSSNNRFVPETYLNL
QANPAAMARSMNFRESI*
>G1900 (1..897) 

ATGCTGGAAACTAAAGATCCTGCGATAAAGCTCTTTGGTAT^ 

GTTTTAGAGGTTGCTGATGAAGAAGAAGAAAAGAACCAAAACAAGACATTAACTGATCAA 
TCGGAGAAAGAC^AAACCCTAAAGAAACC^CCAAGATTCTTCCATGTCCAAGATGCAAC 
AGCATGGAGACTAAGTTCTGTTACTACAACAACTACAACGT^ 

TGTAAAGCTTGTCAGAGATATTGGACCTCAGGTGGGACCATGAGAAGTGTTCCAATCGGA 
GCAGGACGGCGCAAGAACAAGAACAACTCACGAACTTCACATO 

TCCGAAACAAATGGTCCGGTCCTTAGTTTCAGCCTCGGAGATGATCAAAAGGTCTCGAGT 
AATAGGTTTGGTAATCAAAAGCTAGTTGCTAGGATAGAGAACAATGACGAGCGCTCTAAT 
AACAACACTTCGAACGGTTTGAATTGTTTTCCGGGAGTTTCGTGGCCGTAC^^ 

CCTGCGTTTTACCCGGTTTACCCTTATTGGAGCATGCCAGTGTTGTCTTCTCCGGTAAGT 
TCAAGTCCTACTTCTACTCTTGGTAAGCATTCGAGAGACGAAGACGAGACGGTGAAGCAA
AAACAGAGGAATGGATCTGTATTGGTTCCAAAGACTTTGAGAATTGATGATCCTAATGAA 
GCTGCAAAGAGTTCGATATGGACAACACTTGGGATCAAGAACGAAGTTATGTTCAATGGG
TTTGGTTCGAAGAAAGAGGTTAAGCTCAGTAACAAAGAAGAAACAGAGACCTCACTTGTT 
CTTTGTGCAAACCCTGCTGCGTTATCAAGATCAATCAATTTCCATGAGCAGATGTGA 
>G1900 Amino Acid Sequence (domain in AA coordinates: 54-106) 

MLETKDPAIKLFGMKIPFPTVLEVADEEEEKNQNKTLTDQSEKDKTLKKPTKILPCPRCN 

SMETKFCYYNNYNVNQPRHFCKACQRYWTSGGTMRSVPIGAGRRKNKNNSPTSHYHHVTI 

SETNGPVLSFSLGDDQKVSSNRFGNQKLVARIENNDERSNNNTSNGLNCFPGVSWPYTWN 

PAFYPVYPYWSMPVLSSPVSSSPTSTLGKHSRDEDETVKQKQRNGSVLVPKTLRIDDPNE 

AAKSSIWTTLGIKNEVMFNGFGSKKEVKLSNKEETETSLVLCANPAALSRSINFHEQM*
>G2007 (1..861)

ATGGGAAGGCAGCCATGTTGTGACAAGCTCATGGTGAAGAAGGGGCCGTGGACGGCGGAG 
GAAGACAAGAAACTGATAAACTTTATCTTGACCAACGGCCACTGTTGCTGGAGGGCTTTG
CCGAAGCTGGCCGGTCTCCGTCGCTGTGGGAAGAGCTGCCGTCTACGGTGGACCAATTAT 
CTCCGACCTGACTTGAAGAGAGGTCTTCTCTCCGACGCCGAGGAACAGCTTGTCATCGAC 
CTTCATGCTCTTCTCGGCAACAGATGGTCCAAGATCGCTGCAAGATTACCAGGAAGAACA 
GACAACGAAATAAAAAATCATTGGAATACTCATATCAAGAAGAAGCTCCTTAAGATGGAA 
ATCGATCCTTCGACCCATCAACCTTTAAACAAAGTATTTACCGATACAAACTTAGTCGAT 
AAATCTGAAACTTCATCGAAAGCCGACAATGTAAATGATAATAAAATCGTAGAGATCGAT 
GGGACAACGACAAATACAATAGATGATAGCATTATCACTCATCAAAATAGTTCAAATGAT 
GATTATGAATTACTTGGTGATATAATTCATAATTATGGAGATTTATTTAATATTCTATGG 
ACCAACGATGAACCTCCTCTAGTCGATGATGCATCATGGAGCAATCATAACGTTGGTATT 
GGAGGAACAGCTGCAGTTGCAGCCTCAGACAAGAACAACACTGCTGCCGAGGAAGATTTC 
CCGGAAAGATCATTTGAAAAACAGAACGGCGAAAGTTGGATGTTCTTGGATTATTGCCAA 

GAATTTGGTGTTGAAGATTTTGGGTTCGAGTGTTACCATGGTTTTGGTCAAAGCTCCATG 
AAGACGGGTCACAAGGACTAG 

>G2007 Amino Acid Sequence (domain in AA coordinates: TBD)

MGRQPCCDKLMVKKGPWTAEEDKKLINFILTNGHCCWRALPKLAGLRRCGKSCRLRWTNY 
LRPDLKRGLLSDAEEQLVIDLHALLGNRWSKIAARLPGRTDNEIKNHWNTHIKKKLLKME

IDPSTHQPLNKVFTDTNLVDKSETSSKADNVNDNKIVEIDGTTTNTIDDSIITHQNSSND
DYELLGDIIHNYGDLFNILWTNDEPPLVDDASWSNHNVGIGGTAAVAASDKNNTAAEEDF 
PERSFEKQNGESWMFLDYCQEFGVEDFGFECYHGFGQSSMKTGHKD* 
>G214 (238..2064)

TGAGATTTCTCCATTTCCGTAGCTTCTGGTCTCTTTTCTTTGTTTCATTGATCAAAAGCA 

AATCACTTCTTCTTCTTCTTCTTCTCGATTTCTTACTGTTTTCTTATCCAACGAAATCTG 

GAATTAAAAATGGAATCTTTATCGAATCCAAGCTGATTTTGTTTCTTTCATTGAATCATC 

TCTCTAAAGTGGAATTTTGTAAAGAGAAGATCTGAAGTTGTGTAGAGGAGCTTAGTGATG 

GAGACAAATTCGTCTGGAGAAGATCTGGTTATTAAGACTCGGAAGCCATATACGATAACA 

AAGCAACGTGAAAGGTGGACTGAGGAAGAACATAATAGATTCATTGAAGCTTTGAGGCTT 

TATGGTAGAGCATGGCAGAAGATTGAAGAACATGTAGCAACAAAAACTGCTGTCCAGATA 

AGAAGTCACGCTCAGAAATTTTTCTCCAAGGTAGAGAAAGAGGCTGAAGCTAAAGGTGTA 

GCTATGGGTCAAGCGCTAGACATAGCTATTCCTCCTCCACGGCCTAAGCGTAAACCAAAC 

AATCCTTATCCTCGAAAGACGGGAAGTGGAACGATCCTTATGTCAAAAACGGGTGTGAAT 

GATGGAAAAGAGTCCCTTGGATCAGAAAAAGTGTCGCATCCTGAGATGGCCAATGAAGAT 

CGACAACAATCAAAGCCTGAAGAGAAAACTCTGCAGGAAGACAACTGTTCAGATTGTTTC 

ACTCATCAGTATCTCTCTGCTGCATCCTCCATGAATAAAAGTTGTATAGAGACATCAAAC 

GCAAGCACTTTCCGCGAGTTCTTGCCTTCACGGGAAGAGGGAAGTCAGAATAACAGGGTA 

AGAAAGGAGTCAAACTCAGATTTGAATGCAAAATCTCTGGAAAACGGTAATGAGCAAGGA 

CCTCAGACTTATCCGATGCATATCCCTGTGCTAGTGCCATTGGGGAGCTCAATAACAAGT 

TCTCTATCACATCCTCCTTCAGAGCCAGATAGTCATCCCCACACAGTTGCAGGAGATTAT 

CAGTCGTTTCCTAATCATATAATGTCAACCCTTTTACAAACACCGGCTCTTTATACTGCC 

GCAACTTTCGCCTCATCATTTTGGCCTCCCGATTCTAGTGGTGGCTCACCTGTTCCAGGG 

AACTCACCTCCGAATCTGGCTGCCATGGCCGCAGCCACTGTTGCAGCTGCTAGTGCTTGG 

TGGGCTGCCAATGGATTATTACCTTTATGTGCTCCTCTTAGTTCAGGTGGTTTCACTAGT 

CATCCTCCATCTACTTTTGGACCATCATGTGATGTAGAGTACACAAAAGCAAGCACTTTA 

CAACATGGTTCTGTGCAGAGCCGAGAGCAAGAACACTCCGAGGCATCAAAGGCTCGATCT 

TCACTGGACTCAGAGGATGTTGAAAATAAGAGTAAACCAGTTTGTCATGAGCAGCCTTCT 

GCAACACCTGAGAGTGATGCAAAGGGTTCAGATGGAGCAGGAGACAGAAAACAAGTTGAC 
CGGTCCTCGTGTGGCTCAAACACTCCGTCGAGTAGTGATGATGTTGAGGCGGATGCATCA
GAAAGGCAAGAGGATGGCACCAATGGTGAGGTGAAAGAAACGAATGAAGACACTAATAAA 
CCTCAAACTTCAGAGTCCAATGCACGCCGCAGTAGAATCAGCTCCAATATAACCGATCCA 
TGGAAGTCTGTGTCTGACGAGGGTCGAATTGCCTTCCAAGCTCTCTTCTCCAGAGAGGTA 
TTGCCGCAAAGTTTTACATATCGAGAAGAACACAGAGAGGAAGAACAACAACAACAAGAA 
CAAAGATATCCAATGGCACTTGATCTTAACTTCACAGCTCAGTTAACACCAGTTGATGAT 
CAAGAGGAGAAGAGAAACACAGGATTTCTTGGAATCGGATTAGATGCTTCAAAGCTAATG 
AGTAGAGGAAGAACAGGTTTTAAACCATACAAAAGATGTTCCATGGAAGCCAAAGAAAGT 
AGAATCCTCAACAACAATCCTATCATTCATGTGGAACAGAAAGATCCCAAACGGATGCGG 
TTGGAAACTCAAGCTTCCACATGAGACTCTATTTTCATCTGATCTGTTGTTTGTACTCTG 
TTTTTMGTTTTCAAGACCACTGCTACATTTTCTTTTTCTTTTGAGGCCTTTGTATTTGT 
TTCCTTGTCCATAGTCTTCCTGTAACATTTGACTCTGTATTATTCAACAAATCATAAACT 
GTTTAATCTTTTTTTTTCCA 

>G214 Amino Acid Sequence (domain in AA coordinates: 22-71)

METNSSGEDLVIKTRKPYTITKQRERWTEEEHNRFIEALRLYGRAWQKIEEHVATKTAVQ 

IRSHAQKFFSKVEKEAEAKGVAMGQALDIAIPPPRPKRKPNNPYPRKTGSGTILMSKTGV

NDGKESLGSEKVSHPEMANEDRQQSKPEEKTLQEDNCSDCFTHQYLSAASSMNKSCIETS

NASTFREFLPSREEGSQNNRVRKESNSDLNAKSLENGNEQGPQTYPMHIPVLVPLGSSIT 

SSLSHPPSEPDSHPHTVAGDYQSFPNHIMSTLLQTPALYTAATFASSFWPPDSSGGSPVP 

GNSPPNLAAMAAATVAAASAWWAANGLLPLCAPLSSGGFTSHPPSTFGPSCDVEYTKAST 

LQHGSVQSREQEHSEASKARSSLDSEDVENKSKPVCHEQPSATPESDAKGSDGAGDRKQV 

DRSSCGSNTPSSSDDVEADASERQEDGTNGEVKETNEDTNKPQTSESNARRSRISSNITD 

PWKSVSDEGRIAFQALFSREVLPQSFTYREEHREEEQQQQEQRYPMALDLNFTAQLTPVD 

DQEEKRNTGFLGIGLDASKLMSRGRTGFKPYKRCSMEAKESRILNNNPIIHVEQKDPKRM

RLETQAST* 

>G2155 (63..740)

CTCATATATACCAACCAAACCTCTCTCTGCATCTTTATTAACACAAAATTCCAAAAGATT 
ATATGTTGTCGAAGCTCCCTACACAGCGACACTTGCACCTCTCTCCCTCCTCTCCCTCCA
TGGAAACCGTCGGGCGTCCACGTGGCAGACCTCGAGGTTCCAAAAACAAACCTAAAGCTC 
CAATCTTTGTCACCATTGACCCTCCTATGAGTCCTTACATCCTCGAAGTGCCATCCGGAA 
ACGATGTCGTTGAAGCCCTAAACCGTTTCTGCCGCGGTAAAGCCATCGGCTTTTGCGTCC 
TCAGTGGCTCAGGCTCCGTTGCTGATGTCACTTTGCGTCAGCCTTCTCCGGCAGCTCCTG 
GCTCAACCATTACTTTCCACGGAAAGTTCGATCTTCTCTCTGTCTCCGCCACTTTCCTCC 
CTCCTCTACCTCCTACCTCCTTGTCCCCTCCCGTCTCCAATTTCTTCACCGTCTCTCTCG 
CCGGACCTCAGGGGAAAGTCATCGGTGGATTCGTCGCTGGTCCTCTCGTTGCCGCCGGAA 
CTGTTTACTTCGTCGCCACTAGTTTCAAGAACCCTTCCTATCACCGGTTACCTGCTACGG 
AGGAAGAGCAAAGAAACTCGGCGGAAGGGGAAGAGGAGGGACAATCGCCGCCGGTCTCTG 
GAGGTGGTGGAGAGTCGATGTACGTGGGTGGCTCTGATGTCATTTGGGATCCCAACGCCA 
AAGCTCCATCGCCGTACTGACCACAAATC 

TAGATCATCAAGAATCAACAAAAAGATTGCATTTTTAGATTCTTTGTAATATCATAATTG 
ACTCACTCTTTAATCTCTCTATCACTTCTTCTTO 

CATATTTGTAGTTTGATTTGACTATCCCCAAGTTTTGTATTTTATCATACAAA 

CTGTCTCTAATGGTTGTTTTTTCGTTTGTATAATCTTATGCATO 

GAGATTGAATGTATAATATAATGGTTTAAT 

>G2155 Amino Acid Sequence (domain in AA coordinates: 18-38)

MLSKLPTQRHLHLSPSSPSMETVGRPRGRPRGSKNKPKAPIFVTIDPPMSPYILEVPSGN 

DVVEALNRFCRGKAIGFCVLSGSGSVADVTLRQPSPAAPGSTITFHGKFDLLSVSATFLP 

PLPPTSLSPPVSNPFTVSLAGPQGKVIGGFVAGPLVAAGTVYFVATSFKNPSYHRLPATE 

EEQRNSAEGEEEGQSPPVSGGGGESMYVGGSDVIWDPNAKAPSPY* 

>G234 (106..1035)

CACAACATCATACCCACCAACATATATAATGTTGATCATAGAGAGATAAACAGAGGCCGC 
TATCAAGAACAAGACTAAGAACAAGACTTCACTAGGAGTACAAGTATGGGAAGAGCACCG 
TGTTGTGACAAAGCAAACGTGAAGAAAGGGCCTTGGTCTCCTGAGGAAGATGCAAAACTC 
AAATCTTACATTGAAAATAGTGGCACCGGAGGCAATTGGAT^ 

GGTTTAAAGAGATGTGGAAAGAGTTGCAGGCTGAGGTGGCTTAACTATCTTAGACCAAA 

ATCAAACATGGTGGCTTCTCTGAGGAAGAAGAAAACATCATTTGTAGCCTTTACCTTACA 

ATTGGTAGCAGGTGGTCTATAATCGCTGCTCAATTGCCGGGACGAACAGACAACGATATA 
AAAAACTATTGGAACACGAGGCTCAAGAAGAAACTCATTAACAAACAACGCAAGGAGCTT

CAAGAAGCTTGTATGGAGCAGCAAGAGATGATGGTGATGATGAAGAGACAACACCAACAA 

CAACAAATCCAAACTTCTTTTATGATGAGACAAGACCAAACAATGTTCACATGGCCACTA 

CATCATCATAATGTTCAAGTTCCAGCTCTTTTCAGAATCAAACCAACTCGTTTTGCGACC 

AAGAAGATGTTAAGCCAGTGCTCATCAAGAACATGGTCAAGATCGAAGATCAAGAACTGG 

AGAAAACAAACCTCATCATCATCAAGATTCAATGACAACGCTTTTGATCATCTCTCTTTC 

TCTCAACTCTTGTTAGATCCTAATCATAACCACTTAGGATCAGGAGAGGGTTTCTCCATG 

AACTCTATCTTGAGCGCCAACACAAACTCTCCATTGCTTAACACAAGTAATGATAATCAG 

TGGTTCGGGAATTTCCAGGCCGAAACCGTAAACTTGTTCTCAGGAGCCTCCACAAGTACT 

TCGGCAGATCAAAGCACTATAAGTTGGGAAGACATAAGCTCTCTTGTTTATTCTGATTCA 

AAGCAATTTTTTTAATTATAATAATATATTATTCTTAAGATGAAACGTACATCATTATTA 

TTAATTGGGGGTACGTAACGTATATATGGAATAACGATCTAGTTTGTTTAAATTTAAAA 

>G234 Amino Acid Sequence (domain in AA coordinates: 14-115) 

MGRAPCCDKANVKKGPWSPEEDAKLKSYIENSGTGGNWIALPQKIGLKRCGKSCRLRWLN 

YLRPNIKHGGFSEEEENIICSLYLTIGSRWSIIAAQLPGRTDNDIKNYWNTRLKKKLINK

QRKELQEACMEQQEMMVMMKRQHQQQQIQTSFMMRQDQTMFTWPLHHHNVQVPALFRIKP

TRFATKKMLSQCSSRTWSRSKIKNWRKQTSSSSRFNDNAFDHLSFSQLLLDPNHNHLGSG

EGFSMNSILSANTNSPLLNTSNDNQWFGNFQAETVNLFSGASTSTSADQSTISWEDISSL
VYSDSKQFF* 

>G361 (54..647)

TCTGTCTCTCTCTCTCTCTTTGTAAATATACATATATAGATAAGCTCACATATATGGCGA 

CTGAAACATCTTCTTTGAAGCTCTTCGGTATAAACCTACTTGAAACGACGTCGGTTCAAA 

ACCAGTCATCGGAACCAAGACCCGGATCCGGATCAGGATCCGAGTCACGTAAGTACGAGT 

GTCAATACTGTTGTAGAGAGTTTGCTAACTCTCAAGCTCTTGGTGGTCACCAAAACGCTC 

ACAAGAAAGAGCGTCAGCTTCTTAAACGTGCACAGATGTTAGCTACTCGTGGTTTGCCAC

GTCATCATAATTTTCACCCTCATACCAATCCGCTTCTCTCCGCCTTCGCGCCGCTGCCTC 

ACCTCCTCTCTCAGCCGCATCCTCCGGCGCATATGATGCTCTCTCCTTCTTCTTCGAGTT 

CTAAGTGGCTTTACGGTGAACACATGTCGTCACAAAACGCCGTTGGGTACTTTCATGGTG 

GAAGGGGACTTTACGGAGGTGGCATGGAGTCTATGGCCGGAGAAGTAAAGACTCATGGTG 

GTTCTTTGCCGGAGATGAGGAGGTTCGCCGGAGATAGTGATCGGAGTAGCGGAATTAAGT 

TAGAGAATGGTATTGGGCTGGACCTCCATTTAAGCCTTGGGCCATGAATGATTATAATTT 

TGGCCCAGTAAAGATCTGTAAAATACTACTAGGATTTCATTTTTATAGAGTATGTTTTTT 

TCCTTAATTTCGGTTGAAATTGGTGAATATTTTTATCTCTTACTTACCAAATCTCATATT 

TCTATGTATGCGTTTGCTTTCACTTTTTTTTTTTATATAATTCTTCTTGTAAAAAATGCA 

ATGTGAGTTTTCTTCCCTATCATTCTGTCAAGCTTTGGTTCAATTATTTAGTAATCGAAT 
AATATAGGAATAGTGTTGAAAG 

>G361 Amino Acid Sequence (domain in AA coordinates: 43-63) 

MATETSSLKLFGINLLETTSVQNQSSEPRPGSGSGSESRKYECQYCCREFANSQALGGHQ 

NAHKKERQLLKRAQMLATRGLPRHHNFHPHTNPLLSAFAPLPHLLSQPHPPPHMMLSPSS 

SSSKWLYGEHMSSQNAVGYFHGGRGLYGGGMESMAGEVKTHGGSLPEMRRFAGDSDRSSG 
IKLENGIGLDLHLSLGP*

>G562 (137..1285)

ATTTGAATTTCTGGGTTTCTCTCTGTTTAAGCTTCTTCTTCTTCATCTTCTGCTTACGTT 
TCTTCTTCAAGGAGCTTTCGGATTCTTGTAGAAAGAGTCATTGTTCTCTTGAGTGGGAAA 
CCTTGAAACCATTCCTATGGGAAATAGCAGCGAGGAACCAAAGCCTCCTACCAAATCAGA 
TAAACCATCTTCACCCCCGGTGGATCAAACAAATGTTCATGTCTACCCTGATTGGGCAGC 
TATGCAGGC^TATTATGGTCCAAGAGTAGCAATGCCTCCTTATTACAATTCAGCTATGGC 
TGCATCTGGTCATCCTCCTCCTCCTTACATGTGGAATCCTCAGCATATGATGTCACCATC
TGGAGCACCCTATGCTGCTGTTTATCCTCATGGAGGAGGAGTTTACGCTCATCCCGGTAT 
TCCCATGGGATCACTGCCTCAAGGTCAAAAGGATCCACCTTTAACAACTCCGGGGACGCT
TTTGAGCATCGACACTCCTACTAAATCTACAGGGAACACAGACAATGGATTGATGAAGAA 
GCTGAAAGAGTTTGATGGGCTTGCTATGTCTCTAGGAAATGGGAATCCTGAAAATGGTGC 
AGATGAACATAAACGATCACGGAACAGCTCAGAAACTGATGGTTCTACTGATGGAAGTGA 
TGGGAATACAACTGGGGCAGATGAACCGAAACTTAAAAGAAGTCGAGAGGGAACTCCAAC 
AAAAGATGGGAAAC^TTGGTTC^GCTAGCTCATTTCATTCTGTTTCTCCGTCAAGTGG 
TGATACCGGCGTAAAACTCATTCAAGGATCTGGAGCTATACTCTCTCCTGGTGTAAGTGC 
AAATTCCAACCCCTTCATGTC^CAATCTTTAGCCATGGTTCOTCCTGAAACTTGGCTO 
GAACGAGAGAGAACTGAAACGGGAGCGAAGGAAACAGTCTAATAGAGAATCTGCTAGAAG

GTCAAGATTAAGGAAACAGGCCGAGACAGAAGAACTTGCTAGGAAAGTGGAAGCCTTGAC

AGCCGAAAACATGGCATTAAGATCTGAACTAAACCAACTTAATGAGAAATCTGATAAACT 

AAGAGGAGCAAATGCAACCTTGTTGGACAAACTGAAATGCTCGGAACCCGAAAAGAGAGT 

CCCCGCAAATATGTTGTCTAGAGTTAAGAACTCAGGAGCTGGAGATAAGAACAAGAACCA 

AGGAGACAATGATTCTAACTCTACAAGCAAATTCCATCAACTGCTCGATACGAAGCCTCG 

AGCTAAAGCAGTAGCTGCAGGCTGAATCGATGGTAATTCATGTCGATTTCTACTTAATTT 

GTCGACATAAACAAAGAAAATAAGTGCTACTAATTTCAGAAAAACTTGATAGATAGATAG

TATAGTAGAGAGAGAGAGAGAGAGAGAGGTGTGATGATTATTGATCTATAAATTTTCGGA 

GAGAGAGAGGGAGAAAGAGAAACTTTTCCTCCAGATGAAAATTTGGTGTTATGGTTTGTT 

ACTGTTAATATAGAGAGGCTTTTCTTTTTTTATAAAATGGCTTCCTTTGTTGCA 

>G562 Amino Acid Sequence (domain in AA coordinates: 253-315) 

MGNSSEEPKPPTKSDKPSSPPVDQTNVHVYPDWAAMQAYYGPRVAMPPYYNSAMAASGHP

PPPYMWNPQHMMSPSGAPYAAVYPHGGGVYAHPGIPMGSLPQGQKDPPLTTPGTLLSIDT 

PTKSTGNTDNGLMKKLKEFDGLAMSLGNGNPENGADEHKRSRNSSETDGSTDGSDGNTTG 

ADEPKLKRSREGTPTKDGKQLVQASSFHSVSPSSGDTGVKLIQGSGAILSPGVSANSNPF 

MSQSLAMVPPETWLQNERELKRERRKQSNRESARRSRLRKQAETEELARKVEALTAENMA

LRSELNQLNEKSDKLRGANATLLDKLKCSEPEKRVPANMLSRVKNSGAGDKNKNQGDNDS
NSTSKFHQLLDTKPRAKAVAAG* 

>G591 (88..1020)

GTAAATCTCTCTTTGAAGGTTCCTAACTCGTTAATCGTAACTCACAGTGACTCGTTCGAG 

TCAAAGTCTCTGTCTTTAGCTCAAACCATGGCTAGTAACAACCCTCACGACAACCTTTCT 

GACCAAACTCCTTCTGATGATTTCTTCGAGCAAATCCTCGGCCTTCCTAACTTCTCAGCC 

TCTTCTGCCGCCGGTTTATCTGGAGTTGACGGAGGATTAGGTGGTGGAGCACCGCCTATG 

ATGCTGCAGTTGGGTTCCGGAGAAGAAGGAAGTCACATGGGTGGCTTAGGAGGAAGTGGA 

CCAACTGGGTTTCACAATCAGATGTTTCCTTTGGGGTTAAGTCTTGATCAAGGGAAAGGA 

CCTGGGTTTCTTAGACCTGAAGGAGGACATGGAAGTGGGAAAAGATTCTCAGATGATGTT 

GTTGATAATCGATGTTCTTCTATGAAACCTGTTTTCCACGGGCAGCCTATGCAACAGCCA 

CCTCGATCGGCCCCACATCAGCCTACTTCAATCCGTCCCAGGGTTCGAGCTAGGCGTGGT 

CAGGCTACTGATCCACATAGCATCGCTGAGCGGCTACGTAGAGAAAGAATAGCAGAACGG 

ATCAGGGCGCTGCAGGAACTTGTACCTACTGTGAACAAGACCGATAGAGCTGCTATGATC 

GATGAGATTGTCGATTATGTAAAGTTTCTCAGGCTCCAAGTCAAGGTTTTGAGCATGAAC 

CGACTTGGTGGAGCCGGTGCGGTTGCTCCACTTGTTACTGATATGCCTCTTTCATCATCA 

GTTGAGGATGAAACGGGTGAGGGTGGAAGGACTCCGCAACCAGCGTGGGAGAAATGGTCT 

AACGATGGGACTGAACGTCAAGTGGCTAAACTGATGGAAGAGAACGTTGGAGCCGCGATG 

CAGCTTCTTCAATCAAAGGCTCTTTGTATGATGCCAATCTCATTGGCAATGGCAATTTAC 

CATTCTCAACCTCCGGATACATCTTCAGTGGTCAAGCCTGAGAACAATCCTCCACAGTAG 
GATTTCTGCAATAAAGAGTTTGTACAGCT^ 

GCTCTAATGACTCTGGTTTCTTCTCTCCTCTCTCACCGACTTGAAAGGTAAAAAAGTGAA 
AAAGGCTTTGTAGATGGAATCAATGTAGGATTTGCAGTAGAGGGCAAAAAAATGTCATAT 
AGCTCAATTGATCAAGTCTTAAAAAAAAAAAAAAAAAAAA 

>G591 Amino Acid Sequence (domain in AA coordinates: 143-240) 
MASNNPHDNLSDQTPSDDFFEQILGLPNFSASSAAGLSGVDGGLGGGAPPMMLQLGSGEE 
GSHMGGLGGSGPTGFHNQMFPLGLSLDQGKGPGFLRPEGGHGSGKRFSDDVVDNRCSSMK
PVFHGQPMQQPPPSAPHQPTSIRPRVRARRGQATDPHSIAERLRRERIAERIRALQELVP
TVNKTDRAAMIDEIVDYVKFLRLQVKVLSMNRLGGAGAVAPLVTDMPLSSSVEDETGEGG

RTPQPAWEKWSNDGTERQVAKLMEENVGAAMQLLQSKALCTMPISLAMAIYHSQPPDTSS
VVKPENNPPQ*

>G8 (247..1596)

AAAAAAAAATATCCGTCTCACTCTCTCGCCGCCGGTAACATTTCCCGGCGACAAAACTTC 
TCTACTCTCACCATTCCTCCATCGTAATCTC 

CGATCATCTCGAGCTCTTCGTGAGAGATTATGTGATTATGTAATCGTTGTTGCTGTAG^ 

GACGATCTCTAACAACTGATTCCTTCATCATCACCTTCGCTAGATTTGTAATTCT 

CTTGAGATGTTGGATCTTAACCTCAACGCTGATTCTCCCGAGTCGACTCAGTACGGTGGT 

GACTCATACTTAGATCGGCAGACATCAGACAACTCCGCCGGGAATCGAGTGGAAGAGTCC 

GGTACATCGACGTCGTCAGTTATCAATGCCGATGGAGACGAAGACTCTTGCTCTACTCGA 

GCTTTCACTCTCTVGTTTCGATATTTTAAAAGTCGGAAGTAGTAGCGGCGGAGAC^ 
CCCGCCGCTTCAGCTTCCGTTACTAAAGAGTTTTTTCCGGTGAGTGGAGACTGTGGACAT
CTACGAGATGTTGAAGGATCATCAAGCTCTAGAAACTGGATAGATCTTTCTTTTGACCGT 
ATTGGTGACGGAGAAACGAAATTGGTAACTCCGGTTCCGACTCCGGCTCCGGTTCCGGCT 
CAGGTTAAAAAGAGTCGGAGAGGACCAAGGTCTAGAAGTTCACAGTATAGAGGAGTTACT 
TTTTATAGAAGAACTGGTCGATGGGAGTCACATATTTGGGATTGTGGGAAACAAGTTTAT
TTAGGTGGTTTCGACACTGCTCATGCTGCAGCTAGAGCTTATGATCGAGCTGCTATTAAA 
TTTAGAGGTGTTGATGCTGATATCAACTTTACTCTTGGTGATTATGAGGAAGATATGAAA 
CAGGTACAAAACTTGAGTAAGGAAGAGTTTGTGCATATACTGCGTAGACAGAGCACGGGG 
TTTTCGCGGGGGAGTTCGAAGTATCGAGGGGTTACGTTACACAAATGTGGTAGATGGGAA 
GCTAGGATGGGGCAGTTTCTTGGTAAAAAGGCTTATGACAAGGCTGCAATCAACACTAAT 
GGTAGAGAAGCAGTCACGAACTTCGAGATGAGTTCATACCAAAATGAGATTAACTCTGAG 
AGCAATAACTCTGAGATTGACCTCAACTTGGGAATCTCTTTATCGACCGGTAATGCGCCA 
AAGCAAAATGGGAGGCTCTTTCACTTCCCTTCTAATACTTATGAAACTCAGCGTGGAGTT 
AGCTTGAGGATAGATAACGAATACATGGGAAAGCCGGTGAATACACCTCTTCCTTATGGA 
TCCTCGGATCATCGCCTTTACTGGAACGGAGCATGCCCGAGTTATAATAATCCCGCCGAG 
GGAAGAGCAACAGAAAAGAGAAGTGAAGCTGAAGGGATGATGAGTAACTGGGGATGGCAG
AGACCGGGGCAAACAAGCGCCGTGAGACCGCAGCCACCGGGACCACAACCACCACCATTG 
TTCTCAGTTGCAGCAGCATCATCAGGATTCTCACATTTCCGGCCACAACCTCCCAATGAC 
AATGCAACACGTGGTTACTTTTATCCACACCCTTAACTTGTAAGGGGACATATGAGAGTT 
TTTTTACCATCTCTCTCTCTCTCAACACTCTAGTCCCCTTTCAAAAATGTCATTTGGGTT 
TTAGATTTTTCACATACAATGATCAATTTTTCC 

>G8 Amino Acid Sequence (domain in AA coordinates: 151-217, 243-296) 

MLDLNLNADSPESTQYGGDSYLDRQTSDNSAGNRVEESGTSTSSVINADGDEDSCSTRAF 

TLSFDILKVGSSSGGDESPAASASVTKEFFPVSGDCGHLRDVEGSSSSRNWIDLSFDRIG

DGETKLVTPVPTPAPVPAQVKKSRRGPRSRSSQYRGVTFYRRTGRWESHIWDCGKQVYLG 

GFDTAHAAARAYDRAAIKFRGVDADINFTLGDYEEDMKQVQNLSKEEFVHILRRQSTGFS 

RGSSKYRGVTLHKCGRWEARMGQFLGKKAYDKAAINTNGREAVTNFEMSSYQNEINSESM 

NSEIDLNLGISLSTGNAPKQNGRLFHFPSNTYETQRGVSLRIDNEYMGKPVNTPLPYGSS

DHRLYWNGACPSYNNPAEGRATEKRSEAEGMMSNWGWQRPGQTSAVRPQPPGPQPPPLFS

VAAASSGFSHFRPQPPNDNATRGYFYPHP* 

>G859 (162..752)

GATTTGTCATTTTTTGTCTAGCCAAAAAAAAAAAAAAAAAAGGAGAGAGAGAGAGAGAGA
GAGAGAGAGAGAAACGAAGAAAAAAAAAGAAGCAAAAAACATTGTGGGTCTCCGGTGATT 
AGGATCAAATTAGGGCACCAGCCTTATCGGAGGAAGAAGCCATGGGTAGAAAAAAAGTCG
AGATCAAGCGAATCGAGAACAAAAGTAGTCGACAAGTCACTTTCTCCAAACGACGCAATG 
GTCTCATCGAGAAAGCTCGACAACTTTCAATTCTCTGTGAATCTTCCATCGCTGTTCTCG 
TCGTCTCCGGCTCCGGAAAACTCTACAAGTCTGCCTCCGGTGACAACATGTCAAAGATCA 
TTGATCGTTACGAAATACATCATGCTGATGAACTTGAAGCCTTAGATCTTGCAGAAAAAA 
CTCGGAATTATCTGCCACTCAAAGAGTTACTAGAAATAGTCCAAAGCAAGCTTGAAGAAT
CAAATGTCGATAATGCAAGTGTGGATACTTTAATTTCTCTGGAGGAACAGCTCGAGACTG 
CTCTGTCCGTAACTAGAGCTAGGAAGACAGAACTAATGATGGGGGAAGTGAAGTCCCTTC 
AAAAAACGGAGAACTTGCTGAGAGAAGAGAACCAGACTTTGGCTAGCCAGGTGGGGAAGA
AGACGTTTCTGGTTATAGAAGGTGACAGAGGAATGTCATGGGAAAATGGCTCCGGCAACA 
AAGTACGGGAGACTCTTCCGCTGCTCAAGTAATCACCATCATCAACGGCTGAGCTTTCAC 
CTTAAACTTACAGCCTGATTCAGAAGTTTTTACAAATTTGTAAATTATAAAAAGCTTCAT
AATAATCTGAACCTTTTTATCTTCCTCGCGCCAATGTGGAAATTAAGGTTAAAAATAAAA 
TAAAACAGAAGCTCATGCGAAAGAATTGTAAAACTAAGATAAAGCTATAGTAGATCTTTA 
TTGTACCTTCGTAGACGATATAAGATTTATTCGTGTGTTTGTCTTCCCCTCNAAAAAAAA 
AAAAAAAAAAAAAAAA 

>G859 Amino Acid Sequence (domain in AA coordinates: TBD)
MGRKKVEIKRIENKSSRQVTFSKRRNGLIEKARQLSILCESSIAVLVVSGSGKLYKSASG
DNMSKIIDRYEIHHADELEALDLAEKTRNYLPLKELLEIVQSKLEESNVDNASVDTLISL

EEQLETALSVTRARKTELMMGEVKSLQKTENLLREENQTLASQVGKKTFLVIEGDRGMSW 

ENGSGNKVRETLPLLK* 

>G878 (197..1738)

CAAAAAAAATCTCTCCCATTAAAAGACTGCCCAAAGTU^TATTTTATACAAAATGAAAGA 
GAGAAACACGACACGAATTTTGTATAATTAAGATTACACAAAAAAAAGTGTTAGAAAGAG 
AAATATCTTCTTCTTTTTTCTGTGTGAGTTGGGTTTGTTAAAGTTTTATCCTTTTTGTTC

TCAAAATCAAGAATCGATGGCGGAGAAGGAAGAAAAAGAACCATCGAAGTTAAAATCATC

CACCGGAGTTTCACGGCCAACGATTTCACTACCTCCTCGACCGTTTGGTGAAATGTTTTT 

TAGCGGTGGCGTTGGATTTAGTCCTGGACCAATGACTCTCGTCTCAAATTTATTCTCTGA 

TCCTGATGAGTTCAAGTCTTTCTCTCAGCTTTTAGCTGGAGCTATGGCTTCTCCGGCGGC 

AGCTGCTGTTGCCGCCGCTGCTGTGGTTGCTACTGCTCATCATCAGACACCTGTGAGCTC 

TGTCGGTGATGGCGGTGGAAGCGGTGGTGATGTTGACCCGAGGTTTAAGCAGAGTAGACC 

AACGGGATTGATGATAACTCAACCACCGGGGATGTTTACTGTACCGCCGGGGTTAAGTCC 

GGCTACTCTTTTGGATTCTCCGAGCTTCTTTGGTCTTTTTTCACCTCTTCAGGGAACATT 

TGGTATGACACATCAACAAGCTTTAGCACAAGTCACTGCACAAGCAGTTCAAGGCAATAA

TGTTCATATGCAGCAATCACAACAATCTGA^TATCCTTCTTCTACACAACAACAACAACA 

ACAACAACAACAAGCTTCATTGACTGAGATTCCATCATTTTCTTCTGCACCTAGGTCTCA 

GATTCGAGCCTCGGTTCAAGAAACATCGCAGGGTCAGAGAGAGACTTCGGAAATATCTGT 

CTTTGAGCATCGGTCACAGCCTCAAAATGCTGACAAACCAGCTGATGATGGATACAACTG 

GCGGAAATATGGGCAGAAGCAAGTGAAGGGGAGCGATTTTCCTCGGAGTTATTACAAATG 

TACGCATCCAGCTTGTCCTGTCAAGAAGAAAGTGGAGAGGTCACTCGATGGACAAGTAAC 

GGAAATCATCTACAAGGGTCAACACAATCATGAGCTTCCTCAAAAGCGCGGTAACAATAA 

CGGGAGTTGTAAAAGTTCTGATATTGCAAATCAGTTTCAAACAAGTAATAGCAGTCTCAA 

CAAGAGTAAGAGGGACCAGGAAACAAGCCAAGTTACAACAACAGAGCAGATGTCTGAAGC 

AAGTGATAGCGAGGAGGTTGGGAATGCAGAGACTAGTGTGGGAGAAAGACATGAGGATGA 

GCCTGATCCCAAGCGAAGAAATACAGAAGTTCGGGTTTCAGAACCAGTTGCTTCATCGCA 

TAGAACTGTGACAGAGCCTAGGATTATTGTCCAAACGACGAGTGAAGTTGACCTCTTAGA 

TGATGGATATAGGTGGCGCAAGTATGGTCAGAAAGTAGTCAAAGGAAATCCTTATCCGAG 

GAGCTACTATAAGTGTACAACACCAGATTGCGGAGTAAGGAAACATGTAGAGAGAGCAGC 

AACTGACCCAAAAGCTGTTGTAACAACATATGAAGGTAAACATAACCATGATGTTCCAGC 

TGCTAGAACCAGCAGCCATCAGTTAAGACCAAACAATCAACACAACACCTCAACGGTTAA 

CTTCAATCATCAACAGCCTGTTGCACGTTTAAGGCTTAAAGAAGAGCAAATCACTTGACA

GAGAAGAAGAATACGACGGCGCTTGAGCTTTTGTGAGTTTAATGAATCTTCTTTTTGGTT 

AATGAACCTGTTTTTGTTGCCTCAAAACACCACAGGTTTCTCTGGACAGAATCTCTGATA 

TTACAGTTTCAAAAGGTATGTTCTTTTATTTCATGTTGGAATCTTCTGTGTAATCTTAAG 

AAGCTTTAGGAGGTAATGTAAAAAACCAGATTCAAAGTTATGCCCTTATGTGAATTCTTT 

TGTACATGGGATAAACAAAATTTACAGGTATCCTTTTTGTTCTTGTTGTAAAAAAAAAAA 
AAAA 

>G878 Amino Acid Sequence (domain in AA coordinates: 250-305, 415-475)

MAEKEEKEPSKLKSSTGVSRPTISLPPRPFGEMFFSGGVGFSPGPMTLVSNLFSDPDEFK 

SFSQLLAGAMASPAAAAVAAAAVVATAHHQTPVSSVGDGGGSGGDVDPRFKQSRPTGLMI

TQPPGMFTVPPGLSPATLLDSPSFFGLFSPLQGTFGMTHQQALAQVTAQAVQGNNVHMQQ 

SQQSEYPSSTQQQQQQQQQASLTEIPSFSSAPRSQIRASVQETSQGQRETSEISVFEHRS 

QPQNADKPADDGYNWRKYGQKQVKGSDFPRSYYKCTHPACPVKKKVERSLDGQVTEIIYK

GQHNHELPQKRGNNNGSCKSSDIANQFQTSNSSLNKSKRDQETSQVTTTEQMSEASDSEE

VGNAETSVGERHEDEPDPKRRNTEVRVSEPVASSHRTVTEPRIIVQTTSEVDLLDDGYRW 

RKYGQKVVKGNPYPRSYYKCTTPDCGVRKHVERAATDPKAVVTTYEGKHNHDVPAARTSS

HQLRPNNQHNTSTVNFNHQQPVARLRLKEEQIT* 

>G971 (131..1171)

TTTTraTTCTTCOT 

AGATTCGATCATGTTGGATCTTAACCTAAAGATCTTTTCTTCTTATAACGAAGATCAAGA 
TCGGAAAGTACCATTAATGATCTCAACCACCGGTGAAGAAGAATCTAACTCATCTTCCTC 
CTCCACAACAGACTCTGCAGCGAGAGATGCTT^ 

CGATGACCTTGTTCCTCCTCCTCCTCCTCCTCCTCATAAAGAAACAGGAGATCTCTTTCC 

GGTGGTGGCTGATGCTCGTCGGAATATAGAATTCTCCGTGGAAGACAGTCACTGGTTGAA 

TCTTTCTTCTTTACAAAGAAATACACAGAAAATGGTGAAGAAGAGCAGAAGAGGACCAAG 

GTCTCGTAGCTCCCAATATCGTGGCGTCACTTTTTACCGTCGCACCGGTCGTTGGGAATC 

TCATAT1TGGGATTGTGGAAAGCAAGTTTATTTGGGCGGGTTTGATACT 

AGCAAGGGCTTACGACCGAGCTGCTATC^AATTCCGTGGTCTCGATGCAGACATCAAT^ 

CGTCGTGGATGATTATAGGCATGACATCGATAAGATGAAGAATTTAAATAAGGTGGAGTT 

CGTGCAAACACTTAGGCGAGAGAGTGCGAGTTTCGGAAGAGGAAGTTCCAAATACAAAGG 
CTTGGCTCTTCAAAAATGCACCCAATTCAAAACTCATGATCAGATTCATCTCTTCCAAAA
CAGGGGATGGGATGCAGCAGCAATAAAATACAATGAGTTGGGAAAGGGAGAAGGAGCCAT 
GAAGTTTGGTGCCCATATCAAAGGAAATGGTCACAATGATCTTGAACTAAGTCTCGGAAT 
TTCATCATCATCGGAAAGTATAAAGTTGACAACAGGCGATTACTATAAGGGTATCAATCG 
GTCCACGATGGGTTTATACGGTAAGCAATCATCGATATTTTTACCCATGGCAACCATGAA 
ACCTCTGAAGACAGTTGCAGCATCATCAGGATTCCCTTTTATCAGCATGACAAGTTCCTC 
TTCCTCCATGTCCAATTGTTTTGATCCATAGGATCGTTCTACACTCTCTTAACTAATATA 
TATTTTTACTCTATCTGATTATTGTATACAAGGATAAAATTTGATTCTTTCCTTAATGAG 
TGAGAAATATTGGAAGTGTTAAAAAAAAAAAAAAAAAAAAAAA 

>G971 Amino Acid Sequence (conserved domain in aa coordinates: 120-186) 

MLDLNLKIFSSYNEDQDRKVPLMISTTGEEESNSSSSSTTDSAARDAFIAFGILKRDDDL 

VPPPPPPPHKETGDLFPVVADARRNIEFSVEDSHWLNLSSLQRNTQKMVKKSRRGPRSRS

SQYRGVTFYRRTGRWESHIWDCGKQVYLGGFDTAYAAARAYDRAAIKFRGLDADINFVVD 

DYRHDIDKWKNLNKVEFVQTLRRESASFGRGSSKYKGLALQKCTQFKTHDQIHLFQNRGW 

DAAAIKYNELGKGEGAMKFGAHIKGNGHNDLELSLGISSSSESIKLTTGDYYKGINRSTM

GLYGKQSSIFLPMATMKPLKTVAASSGFPFISMTSSSSSMSNCFDP* 

>G975 (58..657)

ATTACTCATCATCAAGTTCCTACTTTCTCTCTGACAAACATCACAGAGTAAGTAAGAATG 

GTACAGACGAAGAAGTTCAGAGGTGTCAGGCAACGCCATTGGGGTTCTTGGGTCGCTGAG 

ATTCGTCATCCTCTCTTGAAACGGAGGATTTGGCTAGGGACGTTCGAGACCGCAGAGGAG 

GCAGCAAGAGCATACGACGAGGCCGCCGTTTTAATGAGCGGCCGCAACGCCAAAACCAAC 

TTTCCCCTCAACAACAACAACACCGGAGAAACTTCCGAGGGCAAAACCGATATTTCAGCT 

TCGTCCACAATGTCATCCTCAACATCATCTTCATCGCTCTCTTCCATCCTCAGCGCCAAA 

CTGAGGAAATGCTGCAAGTCTCCTTCCCCATCCCTCACCTGCCTCCGTCTTGACACAGCC 

AGCTCCCATATCGGCGTCTGGCAGAAACGGGCCGGTTCAAAGTCTGACTCCAGCTGGGTC 

ATGACGGTGGAGCTAGGTCCCGCAAGCTCCTCCCAAGAGACTACTAGTAAAGCTTCACAA 

GACGCTATTCTTGCTCCGACCACTGAAGTTGAAATTGGTGGCAGCAGAGAAGAAGTATTG 

GATGAGGAAGAAAAGGTTGCTTTGCAAATGATAGAGGAGCTTCTCAATACAAACTAAATC 

TTATTTGCTTATATATATGTACCTATTTTCATTGCTGATTTACAGCCAAAATAATCAATT 

ATACCGTGTATTTTATAGATGTTTTATATTAAAAGGTTGTTAGATATA 

>G975 Amino Acid Sequence (domain in AA coordinates: 4-71) 

MVQTKKFRGVRQRHWGSWVAEIRHPLLKKRIWLGTFETAEEAARAYDEAAVLMSGRNAKT

NFPLNNNNTGETSEGKTDISASSTMSSSTSSSSLSSILSAKLRKCCKSPSPSLTCLRLDT 

ASSHIGVWQKRAGSKSDSSWVKTVELGPASSSQETTSKASQDAILAPTTEVEIGGSREEV 

LDEEEKVALQMIEELLNTN* 

>G994 (180..917)

TGTATATATAGTTAGTTAGTTGAGATAAACTTGGTTACCACTTTTGTGTGGTCTTTCTTT 
TTCTTTTTCTCCATTTTCCATTTATCGACCCCTTGGGTGTAGCTAATTACTTTCGCGATT 
TTCAAATCCAATAAAGTTTTAATTTGATGAAGCTTTTTTTAAACCATATAATATAAATAA 
TGGGTGGTCGTAAACCATGTTGTGATGAGGTTGGATTAAGAAAGGGTCCATGGACAGTGG 
AAGAAGATGGGAAACTAGTTGATTTCTTAAGGGCACGTGGCAACTGCGGTGGTGGTGGAG 
GAGGATGGTGCTGGAGAGACGTGCCAAAACTGGCGGGGCTAAGGAGGTGTGGCAAAAGTT 
GCCGTCTCCGGTGGACTAATTATCTCCGGCCAGATCTCAAGAGAGGTCTTTTTACTGAAG 
AAGAAATCCAACTAGTCATTGATCTTCATGCTCGCCTTGGCAATAGATGGTCGAAGATTG 
CAGTGGAGTTACCAGGAAGAAC^GACAACGATATCAAAAATTATTGGAACACTCATATAA
AGAGGAAGCTTATAAGAATGGGTATTGATCCAAACACACATCGTCGATTTGACCAACAAA 
AAGTCAACGAGGAGGAAACGATATTGGTCAACGATCCAAAGCCTCTGTCTGAGACCGAGG 
TATCTGTTGCTTTGAAGAATGACACGTCAGCAGTGTTATCAGGAAATCTAAACCAATTGG 
CTGACGTGGACGGTGATGATCAGCCGTGGAGCTTTCTAATGGAAAATGACGAAGGAGGAG 
GTGGCGACGCCGCCGGAGAGCTTACGATGCTATTGTCCGGTGACATTACGTCATCATGTT 
CTTCTTCGTCATCTTTGTGGATGAAGTATGGAGAATTCGGATACGAAGATTTAGAACTTG 
GATGTTTCGATGTTTAGAGATTCAAGTATGTTTAATTAGGCCGTAGGTTGATTAATCATA 
AGGTTCATTGACTTCATTCTAGAATTGTGTAGTTGGACCAGTATAAAGAATCAAAGTTAT 
GAAACATTGTAATTTGATTTCCAAATTAATCTAATGAATAAATGTGCTTTGCAAAAAAAA 
AAAAAAAAAAAAAAA 

>G994 Amino Acid Sequence (domain in AA coordinates: 14-123) 
MGGRKPCCDEVGLRKGPWTVEEDGKLVDFLRARGNCGGGGGGWCWRDVPKLAGLRRCGKS 
CRLRWTNYLRPDLKRGLFTEEEIQLVIDLHARLGNRWSKIAVELPGRTDNDIKNYWNTHI
KRKLIRMGIDPNTHRRFDQQKVNEEETILVNDPKPLSETEVSVALKNDTSAVLSGNLNQL 
ADVDGDDQPWSFLMENDEGGGGDAAGELTMLLSGDITSSCSSSSSLWMKYGEFGYEDLEL
GCFDV* 

>G2347 (81..626)

AGCCCATCCTTCAACATTGCTTCCTAACCAGAAATCCACCATCATCTTCCCACGAATACA 
ACTTAAAGCTTTACCAGAAAATGGAGGGTCAGAGAACACAACGCCGGGGTTACTTGAAAG 
ACAAGGCTACAGTCTCCAACCTTGTTGAAGAAGAAATGGAGAATGGCATGGATGGAGAAG 
AGGAGGATGGAGGAGACGAAGACAAAAGGAAGAAGGTGATGGAAAGAGTTAGAGGTCCTA 
GCACTGACCGTGTTCCATCGCGACTGTGCCAGGTCGATAGGTGCACTGTTAATTTGACTG 
AGGCCAAGCAGTATTACCGCAGACACAGAGTATGTGAAGTACATGCAAAGGCATCTGCTG 
CGACTGTTGCAGGGGTCAGGCAACGCTTTTGTCAACAATGCAGCAGGTTTCATGAGCTAC 
CAGAGTTTGATGAAGCTAAAAGAAGCTGCAGGAGGCGCTTAGCTGGACACAATGAGAGGA 
GGAGGAAGATCTCTGGTGACAGTTTTGGAGAAGGGTCAGGCCGGAGAGGGTTTAGCGGTC 
AACTGATCCAGACTCAAGAAAGAAACAGGGTAGACAGGAAACTTCCTATGACCAACTCAT 
CATTCAAGCGACCACAGATCAGATAAACCCTCCCGCTCTCTCTCTTCTGTCATCTACATA 
TGCTCTATCTACACTCTTATTAGACAAATAATGGCATCTAACAATGTCAAGAAAAGTTGG 
TCATGGTATTAAATCCTACACGGATATATAACTATAAACCTCTAGTCCCCTCTATGCTGT 
CCTGTAATGAATATCTATCCGGAAATGTATTCGCATAGTCTTGCGTCTAATAATGTTTAT 
TGATTTTGTA 

>G2347 Amino Acid Sequence (domain in AA coordinates: 60-136) 
MEGQRTQRRGYLKDKATVSNLVEEEMENGMDGEEEDGGDEDKRKKVMERVRGPSTDRVPS 
RLCQVDRCTVNLTEAKQYYRRHRVCEVHAKASAATVAGVRQRFCQQCSRFHELPEFDEAK
RSCRRRLAGHNERRRKISGDSFGEGSGRRGFSGQLIQTQERNRVDRKLPMTNSSFKRPQI
R* 

>G2010 (1..525) 

ATGGAGGGTAAGAGATCACAAGGACAAGGTTACATGAAAAAGAAGTCTTACCTTGTGGAA 

GAAGATATGGAGACTGATACGGATGAAGAAGAGGAAGTAGGTAGGGATAGAGTTAGAGGG 

TCTAGAGGTAGCATCAATCGTGGTGGCTCGTTGCGGCTTTGCCAAGTAGATAGATGCACA 

GCTGATATGAAAGAGGCAAAACTGTATCACCGGAGACACAAAGTGTGTGAAGTTCATGCA 

AAGGCATCTTCTGTCTTTCTCTC71GGACTTAACGAACGCTTTTGTCAACAATGCAGTAGG 

TTTCATGACCTCCAAGAGTTTGATGAAGCTAAGAGAAGTTGCAGGAGGCGCTTAGCTGGA 

CACAATGAGCGAAGAAGGAAGAGCTCTGGTGAGAGTACTTATGGAGAAGGATCAGGTCGG 

AGAGGAATCAATGGTCAGGTGGTGATGCAGAATCAAGAAAGATCAAGGGTAGAGATGACA 

CTTCCTATGCCAAACTCATCATTCAAGCGACCACAGATTAGATAG 

>G2010 Amino Acid Sequence (domain in AA coordinates: 53-127) 

MEGKRSQGQGYMKKKSYLVEEDMETDTDEEEEVGRDRVRGSRGSINRGGSLRLCQVDRCT 

ADMKEAKLYHRRHKVCEVHAKASSVFLSGLNQRFCQQCSRFHDLQEFDEAKRSCRRRLAG

HNERRRKSSGESTYGEGSGRRGINGQVVMQNQERSRVEMTLPMPNSSFKRPQIR*
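
As an illustration only, the relationship between a nucleotide record and its amino acid sequence can be sketched in Python. The helper name `translate_cds`, the reading of a header such as "(1..525)" as 1-based inclusive coding coordinates, and the use of the standard genetic code are assumptions made for this sketch; they are not stated in the Sequence Listing itself.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_cds(sequence, start, end):
    """Translate the region start..end (1-based, inclusive) of a DNA string.

    Whitespace is ignored. Codons containing unrecognized characters are
    reported as 'X', and stop codons as '*'.
    """
    dna = "".join(sequence.split()).upper()
    coding = dna[start - 1:end]
    protein = []
    # Walk the coding region codon by codon, ignoring a trailing partial codon.
    for i in range(0, len(coding) - len(coding) % 3, 3):
        protein.append(CODON_TABLE.get(coding[i:i + 3], "X"))
    return "".join(protein)


# First 60 nucleotides of the G2010 record above.
g2010_start = "ATGGAGGGTAAGAGATCACAAGGACAAGGTTACATGAAAAAGAAGTCTTACCTTGTGGAA"
print(translate_cds(g2010_start, 1, 60))
```

Under these assumptions the printed string is MEGKRSQGQGYMKKKSYLVE, which agrees with the first 20 residues of the G2010 amino acid sequence.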

This application claims the benefit of US Provisional Application No.
60/310,847, filed August 9, 2001, US Provisional Application No. 60/336,049, filed 
December 5, 2001, US Provisional Application No. 60/338,692, filed December 11, 
2001, and US Non-provisional Application No. 10/171,468, filed June 14, 2002, the 
entire contents of which are hereby incorporated by reference. 

FIELD OF THE INVENTION 

This invention relates to the field of plant biology. More particularly, the 
present invention pertains to compositions and methods for phenotypically modifying 
a plant. 

INTRODUCTION 

A plant's traits, such as its biochemical, developmental, or phenotypic 
characteristics, may be controlled through a number of cellular processes. One 
important way to manipulate that control is through transcription factors - proteins 
that influence the expression of a particular gene or sets of genes. Transformed and 
transgenic plants that comprise cells having altered levels of at least one selected 
transcription factor, for example, possess advantageous or desirable traits. Strategies 
for manipulating traits by altering a plant cell's transcription factor content can 
therefore result in plants and crops with commercially valuable properties. Applicants 
have identified polynucleotides encoding transcription factors, developed numerous 
transgenic plants using these polynucleotides, and have analyzed the plants for a 
variety of important traits. In so doing, applicants have identified important 
polynucleotide and polypeptide sequences for producing commercially valuable 
plants and crops as well as the methods for making them and using them. Other 
aspects and embodiments of the invention are described below and can be derived 
from the teachings of this disclosure as a whole. 

BACKGROUND OF THE INVENTION 

Transcription factors (TFs) can modulate gene expression, either increasing or 
decreasing (inducing or repressing) the rate of transcription. This modulation results 
in differential levels of gene expression at various developmental stages, in different 
tissues and cell types, and in response to different exogenous (e.g., environmental)
and endogenous stimuli throughout the life cycle of the organism. 



Because transcription factors are key controlling elements of biological 
pathways, altering the expression levels of one or more transcription factors can 
change entire biological pathways in an organism. For example, manipulation of the 
levels of selected transcription factors may result in increased expression of 
economically useful proteins or metabolic chemicals in plants, or may improve other
agriculturally relevant characteristics. Conversely, blocked or reduced expression of a 
transcription factor may reduce biosynthesis of unwanted compounds or remove an 
undesirable trait. Therefore, manipulating transcription factor levels in a plant offers 
tremendous potential in agricultural biotechnology for modifying a plant's traits. 

The present invention provides novel transcription factors useful for
modifying a plant's phenotype in desirable ways. 

SUMMARY OF THE INVENTION 

In a first aspect, the invention relates to a recombinant polynucleotide 
comprising a nucleotide sequence selected from the group consisting of: (a) a 
nucleotide sequence encoding a polypeptide comprising a polypeptide sequence 
selected from those of the Sequence Listing, SEQ ID NOs:2 to 2N, where N = 2-561, 
or those listed in Table 4, or a complementary nucleotide sequence thereof; (b) a 
nucleotide sequence encoding a polypeptide comprising a variant of a polypeptide of 
(a) having one or more, or between 1 and about 5, or between 1 and about 10, or 
between 1 and about 30, conservative amino acid substitutions; (c) a nucleotide 
sequence comprising a sequence selected from those of SEQ ID NOs:1 to (2N - 1),
where N = 2-561, or those included in Table 4, or a complementary nucleotide 
sequence thereof; (d) a nucleotide sequence comprising silent substitutions in a 
nucleotide sequence of (c); (e) a nucleotide sequence which hybridizes under stringent 
conditions over substantially the entire length of a nucleotide sequence of one or more 
of: (a), (b), (c), or (d); (f) a nucleotide sequence comprising at least 10 or 15, or at 
least about 20, or at least about 30 consecutive nucleotides of a sequence of any of 
(a)-(e), or at least 10 or 15, or at least about 20, or at least about 30 consecutive 
nucleotides outside of a region encoding a conserved domain of any of (a)-(e); (g) a 
nucleotide sequence comprising a subsequence or fragment of any of (a)-(f), which
subsequence or fragment encodes a polypeptide having a biological activity that 
modifies a plant's characteristic, functions as a transcription factor, or alters the level 
of transcription of a gene or transgene in a cell; (h) a nucleotide sequence having at 
least 31% sequence identity to a nucleotide sequence of any of (a)-(g); (i) a
nucleotide sequence having at least 60%, or at least 70%, or at least 80%, or at least
90%, or at least 95% sequence identity to a nucleotide sequence of any of (a)-(g) or a
10 or 15 nucleotide, or at least about 20, or at least about 30 nucleotide region of a 
sequence of (a)-(g) that is outside of a region encoding a conserved domain; (j) a 
nucleotide sequence that encodes a polypeptide having at least 31% sequence identity 
to a polypeptide listed in Table 4, or the Sequence Listing; (k) a nucleotide sequence 
which encodes a polypeptide having at least 60%, or at least 70%, or at least 80%, or
at least 90%, or at least 95% sequence identity to a polypeptide listed in Table 4, or
the Sequence Listing; and (l) a nucleotide sequence that encodes a conserved domain
of a polypeptide having at least 85%, or at least 90%, or at least 95%, or at least 98% 
sequence identity to a conserved domain of a polypeptide listed in Table 4, or the 
Sequence Listing. The recombinant polynucleotide may further comprise a 
constitutive, inducible, or tissue-specific promoter operably linked to the nucleotide 
sequence. The invention also relates to compositions comprising at least two of the 
above-described polynucleotides. 

In a second aspect, the invention comprises an isolated or recombinant 
polypeptide comprising a subsequence of at least about 10, or at least about 15, or at 
least about 20, or at least about 30 contiguous amino acids encoded by the 
recombinant or isolated polynucleotide described above, or comprising a subsequence 
of at least about 8, or at least about 12, or at least about 15, or at least about 20, or at 
least about 30 contiguous amino acids outside a conserved domain. 

In a third aspect, the invention comprises an isolated or recombinant 
polynucleotide that encodes a polypeptide that is a paralog of the isolated polypeptide 
described above. In one aspect, the invention is a paralog which, when expressed in
Arabidopsis, modifies a trait of the Arabidopsis plant.

In a fourth aspect, the invention comprises an isolated or recombinant
polynucleotide that encodes a polypeptide that is an ortholog of the isolated 
polypeptide described above. In one aspect, the invention is an ortholog which, when
expressed in Arabidopsis, modifies a trait of the Arabidopsis plant. 

In a fifth aspect, the invention comprises an isolated polypeptide that is a 
paralog of the isolated polypeptide described above. In one aspect, the invention is an 
paralog which, when expressed in Arabidopsis, modifies a trait of the Arabidopsis 
plant. 

In a sixth aspect, the invention comprises an isolated polypeptide that is an
ortholog of the isolated polypeptide described above. In one aspect, the invention is 
an ortholog which, when expressed in Arabidopsis, modifies a trait of the 
Arabidopsis plant. 

The present invention also encompasses transcription factor variants. A 
preferred transcription factor variant is one having at least 40% amino acid sequence 
identity, a more preferred transcription factor variant is one having at least 50% amino
acid sequence identity and a most preferred transcription factor variant is one having 
at least 65% amino acid sequence identity to the transcription factor amino acid 
sequence SEQ ID NOs:2 to 2N, where N = 2-561, and which contains at least one 
functional or structural characteristic of the transcription factor amino acid sequence. 
Sequences having lesser degrees of identity but comparable biological activity are 
considered to be equivalents. 
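
The identity tiers named above can be illustrated with a short Python sketch. It assumes the two amino acid sequences have already been aligned to equal length with "-" as the gap character, and it takes percent identity as matching columns divided by aligned columns; the text does not fix a particular alignment method or denominator at this point, so both choices, like the function names, are assumptions of the sketch.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two pre-aligned sequences of equal length.

    '-' marks a gap. Columns where both sequences have a gap are skipped;
    a column counts as identical only when both residues match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(aligned_a.upper(), aligned_b.upper())
               if not (a == "-" and b == "-")]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def variant_tier(identity):
    """Map a percent identity to the tiers named in the text (40/50/65%)."""
    if identity >= 65.0:
        return "most preferred"
    if identity >= 50.0:
        return "more preferred"
    if identity >= 40.0:
        return "preferred"
    return "below the stated thresholds"


# Two short illustrative peptides, already aligned without gaps.
print(variant_tier(percent_identity("MEGKRSQG", "MEGQRTQR")))
```

Here five of eight columns match (62.5%), so the sketch reports "more preferred".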

In another aspect, the invention is a transgenic plant comprising one or more 
of the above-described isolated or recombinant polynucleotides. In yet another 
aspect, the invention is a plant with altered expression levels of a polynucleotide 
described above or a plant with altered expression or activity levels of an above- 
described polypeptide. Further, the invention is a plant lacking a nucleotide sequence 
encoding a polypeptide described above or substantially lacking a polypeptide 
described above. The plant may be any plant, including, but not limited to, 
Arabidopsis, mustard, soybean, wheat, corn, potato, cotton, rice, oilseed rape, 
sunflower, alfalfa, sugarcane, turf, banana, blackberry, blueberry, strawberry, 
raspberry, cantaloupe, carrot, cauliflower, coffee, cucumber, eggplant, grapes,
honeydew, lettuce, mango, melon, onion, papaya, peas, peppers, pineapple, pumpkin, 
spinach, squash, sweet corn, tobacco, tomato, watermelon, rosaceous fruits, vegetable 
brassicas, and mint or other labiates. In yet another aspect, the invention is an
isolated plant material of a plant, including, but not limited to, plant tissue, fruit, seed, 
plant cell, embryo, protoplast, pollen, and the like. In yet another aspect, the 
invention is a transgenic plant tissue culture of regenerable cells, including, but not 
limited to, embryos, meristematic cells, microspores, protoplast, pollen, and the like. 

In yet another aspect the invention is a transgenic plant comprising one or 
more of the above described polynucleotides wherein the encoded polypeptide is 
expressed and regulates transcription of a gene. 

In a further aspect the invention provides a method of using the polynucleotide 
composition to breed a progeny plant from a transgenic plant including crossing 
plants, producing seeds from transgenic plants, and methods of breeding using 
transgenic plants, the method comprising transforming a plant with the polynucleotide 
composition to create a transgenic plant, crossing the transgenic plant with another 
plant, selecting seed, and growing the progeny plant from the seed. 

In a further aspect, the invention provides a progeny plant derived from a 
parental plant wherein said progeny plant exhibits at least three fold greater 
messenger RNA levels than said parental plant, wherein the messenger RNA encodes 
a DNA-binding protein which is capable of binding to a DNA regulatory sequence 
and inducing expression of a plant trait gene, wherein the progeny plant is 
characterized by a change in the plant trait compared to said parental plant. In yet a 
further aspect, the progeny plant exhibits at least ten fold greater messenger RNA 
levels compared to said parental plant. In yet a further aspect, the progeny plant 
exhibits at least fifty fold greater messenger RNA levels compared to said parental 
plant. 
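
The three-, ten-, and fifty-fold thresholds can be expressed as a simple ratio check. The following Python sketch assumes both mRNA levels are measured in the same arbitrary units; the function names and the sample values are hypothetical.

```python
def fold_change(progeny_level, parental_level):
    """Ratio of progeny to parental mRNA level (same arbitrary units)."""
    if parental_level <= 0:
        raise ValueError("parental level must be positive")
    return progeny_level / parental_level


def highest_threshold_met(fold, thresholds=(3, 10, 50)):
    """Return the largest stated threshold the fold change reaches, or None."""
    met = [t for t in thresholds if fold >= t]
    return max(met) if met else None


# Hypothetical expression measurements in arbitrary units.
print(highest_threshold_met(fold_change(120.0, 10.0)))
```

With these sample values the ratio is 12, so the ten-fold threshold is the highest one met.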

In a further aspect, the invention relates to a cloning or expression vector 
comprising the isolated or recombinant polynucleotide described above or cells 
comprising the cloning or expression vector. 

In yet a further aspect, the invention relates to a composition produced by
incubating a polynucleotide of the invention with a nuclease; a restriction enzyme; a
polymerase; a polymerase and a primer; a cloning vector; or with a cell.

Furthermore, the invention relates to a method for producing a plant having a 
modified trait. The method comprises altering the expression of an isolated or 
recombinant polynucleotide of the invention or altering the expression or activity of a 
polypeptide of the invention in a plant to produce a modified plant, and selecting the 
modified plant for a modified trait. In one aspect, the plant is a monocot plant. In 
another aspect, the plant is a dicot plant. In another aspect the recombinant 
polynucleotide is from a dicot plant and the plant is a monocot plant. In yet another 
aspect the recombinant polynucleotide is from a monocot plant and the plant is a dicot 
plant. In yet another aspect the recombinant polynucleotide is from a monocot plant 
and the plant is a monocot plant. In yet another aspect the recombinant 
polynucleotide is from a dicot plant and the plant is a dicot plant. 

In another aspect, the invention is a transgenic plant comprising an isolated or 
recombinant polynucleotide encoding a polypeptide wherein the polypeptide is 
selected from the group consisting of SEQ ID NOs: 2 to 2N, where N = 2-561. In yet
another aspect, the invention is a plant with altered expression levels of a polypeptide 
described above or a plant with altered expression or activity levels of an above- 
described polypeptide. Further, the invention is a plant lacking a polynucleotide 
sequence encoding a polypeptide described above or substantially lacking a 
polypeptide described above. The plant may be any plant, including, but not limited 
to, Arabidopsis, mustard, soybean, wheat, corn, potato, cotton, rice, oilseed rape, 
sunflower, alfalfa, sugarcane, turf, banana, blackberry, blueberry, strawberry, 
raspberry, cantaloupe, carrot, cauliflower, coffee, cucumber, eggplant, grapes, 
honeydew, lettuce, mango, melon, onion, papaya, peas, peppers, pineapple, pumpkin, 
spinach, squash, sweet corn, tobacco, tomato, watermelon, rosaceous fruits, vegetable 
brassicas, and mint or other labiates. In yet another aspect, the invention is an
isolated plant material of a plant, including, but not limited to, plant tissue, fruit, seed, 
plant cell, embryo, protoplast, pollen, and the like. In yet another aspect, the 
invention is a transgenic plant tissue culture of regenerable cells, including, but not
limited to, embryos, meristematic cells, microspores, protoplast, pollen, and the like. 

In another aspect, the invention relates to a method of identifying a factor that 
is modulated by or interacts with a polypeptide encoded by a polynucleotide of the 
invention. The method comprises expressing a polypeptide encoded by the 
polynucleotide in a plant; and identifying at least one factor that is modulated by or 
interacts with the polypeptide. In one embodiment the method for identifying 
modulating or interacting factors is by detecting binding by the polypeptide to a 
promoter sequence, or by detecting interactions between an additional protein and the 
polypeptide in a yeast two-hybrid system, or by detecting expression of a factor by
hybridization to a microarray, subtractive hybridization, or differential display.

In yet another aspect, the invention is a method of identifying a molecule that 
modulates activity or expression of a polynucleotide or polypeptide of interest. The 
method comprises placing the molecule in contact with a plant comprising the 
polynucleotide or polypeptide encoded by the polynucleotide of the invention and 
monitoring one or more of the expression level of the polynucleotide in the plant, the 
expression level of the polypeptide in the plant, and modulation of an activity of the 
polypeptide in the plant.

In yet another aspect, the invention relates to an integrated system, computer 
or computer readable medium comprising one or more character strings 
corresponding to a polynucleotide of the invention, or to a polypeptide encoded by the 
polynucleotide. The integrated system, computer or computer readable medium may 
comprise a link between one or more sequence strings and a modified plant trait.

In yet another aspect, the invention is a method for identifying a sequence 
similar or homologous to one or more polynucleotides of the invention, or one or 
more polypeptides encoded by the polynucleotides. The method comprises providing 
a sequence database, and querying the sequence database with one or more target 
sequences corresponding to the one or more polynucleotides or to the one or more 
polypeptides to identify one or more sequence members of the database that display 
sequence similarity or homology to one or more of the one or more target sequences. 
The method may further comprise linking one or more of the
polynucleotides of the invention, or encoded polypeptides, to a modified plant 
phenotype. 
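
The database query described above can be sketched in Python. This is a deliberately simple stand-in that ranks entries by shared k-mers (Jaccard similarity); it is not BLAST and does not reproduce the tblastx searches summarized in Table 5. The function names, the `k` and `min_score` parameters, and the toy database are all hypothetical.

```python
def kmers(sequence, k=3):
    """Set of all overlapping k-length substrings of a sequence."""
    seq = sequence.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def query_database(target, database, k=3, min_score=0.3):
    """Rank database entries by Jaccard similarity of k-mer sets to the target.

    `database` maps an identifier to a sequence string. Entries scoring
    below `min_score` are dropped; the rest are returned best first.
    """
    target_kmers = kmers(target, k)
    hits = []
    for name, seq in database.items():
        entry_kmers = kmers(seq, k)
        union = target_kmers | entry_kmers
        score = len(target_kmers & entry_kmers) / len(union) if union else 0.0
        if score >= min_score:
            hits.append((name, round(score, 3)))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical toy database; identifiers and sequences are illustrative only.
toy_db = {
    "entry_A": "RLCQVDRCTADMKEAKLYHRRHKVCEVHAKAS",
    "entry_B": "MGGRKPCCDEVGLRKGPWTVEEDGKLVDFLRA",
}
print(query_database("RLCQVDRCTVNLTEAKQYYRRHRVCEVHAKAS", toy_db))
```

Each returned identifier could then be linked to a trait annotation, which is the optional further step the method mentions; a production search would use an alignment-based tool rather than this k-mer heuristic.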

BRIEF DESCRIPTION OF THE SEQUENCE LISTING, 
TABLES, AND FIGURE 

The Sequence Listing provides exemplary polynucleotide and polypeptide 
sequences of the invention. The traits associated with the use of the sequences are 
included in the Examples. 

Diskette 1 is a read-only memory computer-readable diskette and contains a
copy of the Sequence Listing in ASCII text format. The Sequence Listing is named 
"SEQLIST514442002041" and is 929 kilobytes in size. The copy of the Sequence 
Listing on the diskette is hereby incorporated by reference in its entirety. 

Table 4 shows the polynucleotides and polypeptides identified by SEQ ID 
NO; Mendel Gene ID No.; conserved domain of the polypeptide; and if the 
polynucleotide was tested in a transgenic assay. The first column shows the 
polynucleotide SEQ ID NO; the second column shows the Mendel Gene ID No., GID; 
the third column shows the trait(s) resulting from the knock out or overexpression of 
the polynucleotide in the transgenic plant; the fourth column shows the category of 
the trait; the fifth column shows the transcription factor family to which the 
polynucleotide belongs; the sixth column ("Comment"), includes specific effects and 
utilities conferred by the polynucleotide of the first column; the seventh column 
shows the SEQ ID NO of the polypeptide encoded by the polynucleotide; and the 
eighth column shows the amino acid residue positions of the conserved domain in
amino acid (AA) co-ordinates. 

Table 5 lists a summary of orthologous and homologous sequences identified 
using BLAST (tblastx program). The first column shows the polynucleotide sequence 
identifier (SEQ ID NO), the second column shows the corresponding cDNA identifier 
(Gene ID), the third column shows the orthologous or homologous polynucleotide 
GenBank Accession Number (Test Sequence ID), the fourth column shows the 
calculated probability value that the sequence identity is due to chance (Smallest Sum
Probability), the fifth column shows the plant species from which the test sequence 
was isolated (Test Sequence Species), and the sixth column shows the orthologous or 
homologous test sequence GenBank annotation (Test Sequence GenBank 
Annotation). 

Figure 1 shows a phylogenic tree of related plant families adapted from Daly 
et al. (2001 Plant Physiology 127:1328-1333).

DETAILED DESCRIPTION OF EXEMPLARY EMBODIMENTS

In an important aspect, the present invention relates to polynucleotides and 
polypeptides, e.g. for modifying phenotypes of plants. Throughout this disclosure, 
various information sources are referred to and/or are specifically incorporated. The 
information sources include scientific journal articles, patent documents, textbooks, 
and World Wide Web browser-inactive page addresses, for example. While the 
reference to these information sources clearly indicates that they can be used by one 
of skill in the art, applicants specifically incorporate each and every one of the 
information sources cited herein, in their entirety, whether or not a specific mention of 
"incorporation by reference" is noted. The contents and teachings of each and every 
one of the information sources can be relied on and used to make and use
embodiments of the invention. 

It must be noted that as used herein and in the appended claims, the singular 
forms "a," "an," and "the" include plural reference unless the context clearly dictates 
otherwise. Thus, for example, a reference to "a plant" includes a plurality of such 
plants, and a reference to "a stress" is a reference to one or more stresses and 
equivalents thereof known to those skilled in the art, and so forth. 

The polynucleotide sequences of the invention encode polypeptides that are 
members of well-known transcription factor families, including plant transcription 
factor families, as disclosed in Table 4. Generally, the transcription factors encoded 
by the present sequences are involved in cell differentiation and proliferation and the 
regulation of growth. Accordingly, one skilled in the art would recognize that by 
expressing the present sequences in a plant, one may change the expression of 
autologous genes or induce the expression of introduced genes. By affecting the
expression of similar autologous sequences in a plant that have the biological activity 
of the present sequences, or by introducing the present sequences into a plant, one 
may alter a plant's phenotype to one with improved traits. The sequences of the 
invention may also be used to transform a plant and introduce desirable traits not 
found in the wild-type cultivar or strain. Plants may then be selected for those that 
produce the most desirable degree of over- or underexpression of target genes of 
interest and coincident trait improvement. 

The sequences of the present invention may be from any species, particularly 
plant species, in a naturally occurring form or from any source whether natural, 
synthetic, semi-synthetic or recombinant. The sequences of the invention may also 
include fragments of the present amino acid sequences. In this context, a "fragment" 
refers to a fragment of a polypeptide sequence which is at least 5 to about 15 amino 
acids in length, most preferably at least 14 amino acids, and which retains some
biological activity of a transcription factor. Where "amino acid sequence" is recited to 
refer to an amino acid sequence of a naturally occurring protein molecule, "amino 
acid sequence" and like terms are not meant to limit the amino acid sequence to the 
complete native amino acid sequence associated with the recited protein molecule. 

As one of ordinary skill in the art recognizes, transcription factors can be 
identified by the presence of a region or domain of structural similarity or identity to a 
specific consensus sequence or the presence of a specific consensus DNA-binding site 
or DNA-binding site motif (see, for example, Riechmann et al. (2000) Science
290:2105-2110). The plant transcription factors may belong to one of the following
transcription factor families: the AP2 (APETALA2) domain transcription factor
family (Riechmann and Meyerowitz (1998) Biol. Chem. 379:633-646); the MYB
transcription factor family (Martin and Paz-Ares (1997) Trends Genet. 13:67-73); the
MADS domain transcription factor family (Riechmann and Meyerowitz (1997) Biol.
Chem. 378:1079-1101); the WRKY protein family (Ishiguro and Nakamura (1994)
Mol. Gen. Genet. 244:563-571); the ankyrin-repeat protein family (Zhang et al.
(1992) Plant Cell 4:1575-1588); the zinc finger protein (Z) family (Klug and Schwabe
(1995) FASEB J. 9:597-604); the homeobox (HB) protein family (Buerglin in
Guidebook to the Homeobox Genes, Duboule (ed.) (1994) Oxford University Press);
the CAAT-element binding proteins (Forsburg and Guarente (1989) Genes Dev.
3:1166-1178); the squamosa promoter binding proteins (SPB) (Klein et al. (1996)
Mol. Gen. Genet. 250:7-16); the NAM protein family (Souer et al. (1996) Cell
85:159-170); the IAA/AUX proteins (Rouse et al. (1998) Science 279:1371-1373);
the HLH/MYC protein family (Littlewood et al. (1994) Prot. Profile 1:639-709); the
DNA-binding protein (DBP) family (Tucker et al. (1994) EMBO J. 13:2994-3002);
the bZIP family of transcription factors (Foster et al. (1994) FASEB J. 8:192-200); the
Box P-binding protein (the BPF-1) family (da Costa e Silva et al. (1993) Plant J.
4:125-135); the high mobility group (HMG) family (Bustin and Reeves (1996) Prog.
Nucl. Acids Res. Mol. Biol. 54:35-100); the scarecrow (SCR) family (Di Laurenzio et
al. (1996) Cell 86:423-433); the GF14 family (Wu et al. (1997) Plant Physiol.
114:1421-1431); the polycomb (PCOMB) family (Kennison (1995) Annu. Rev. Genet.
29:289-303); the teosinte branched (TEO) family (Luo et al. (1996) Nature
383:794-799); the ABI3 family (Giraudat et al. (1992) Plant Cell 4:1251-1261); the triple helix
(TH) family (Dehesh et al. (1990) Science 250:1397-1399); the EIL family (Chao et
al. (1997) Cell 89:1133-44); the AT-HOOK family (Reeves and Nissen (1990) J. Biol.
Chem. 265:8573-8582); the S1FA family (Zhou et al. (1995) Nucleic Acids Res.
23:1165-1169); the bZIPT2 family (Lu and Ferl (1995) Plant Physiol. 109:723); the
YABBY family (Bowman et al. (1999) Development 126:2387-96); the PAZ family
(Bohmert et al. (1998) EMBO J. 17:170-80); a family of miscellaneous (MISC)
transcription factors including the DPBF family (Kim et al. (1997) Plant J.
11:1237-1251) and the SPF1 family (Ishiguro and Nakamura (1994) Mol. Gen. Genet.
244:563-571); the golden (GLD) family (Hall et al. (1998) Plant Cell 10:925-936),
the TUBBY family (Boggon et al. (1999) Science 286:2119-2125), the heat shock
family (Wu C (1995) Annu. Rev. Cell Dev. Biol. 11:441-469), the ENBP family
(Christiansen et al. (1996) Plant Mol. Biol. 32:809-821), the RING-zinc family (Jensen
et al. (1998) FEBS Letters 436:283-287), the PDBP family (Janik et al. (1989)
Virology 168:320-329), the PCF family (Cubas et al. (1999) Plant J. 18:215-22), the
SRS (SHI-related) family (Fridborg et al. (1999) Plant Cell 11:1019-1032), the CPP
(cysteine-rich polycomb-like) family (Cvitanich et al. (2000) Proc. Natl. Acad. Sci.
USA 97:8163-8168), the ARF (auxin response factor) family (Ulmasov et al.
(1999) Proc. Natl. Acad. Sci. USA 96:5844-5849), the SWI/SNF family
(Collingwood et al. J. Mol. End. 23:255-275), the ACBF family (Seguin et al. (1997)
Plant Mol. Biol. 35:281-291), the PCGL (CG-1 like) family (da Costa e Silva et al.
(1994) Plant Mol. Biol. 25:921-924), the ARID family (Vazquez et al. (1999)
Development 126:733-42), the Jumonji family (Balciunas et al. (2000) Trends
Biochem. Sci. 25:274-276), the bZIP-NIN family (Schauser et al. (1999) Nature
402:191-195), the E2F family (Kaelin et al. (1992) Cell 70:351-364) and the GRF-like
family (Knaap et al. (2000) Plant Physiol. 122:695-704). As indicated by any part of
the list above and as known in the art, transcription factors have been sometimes 
categorized by class, family, and sub-family according to their structural content and 
consensus DNA-binding site motif, for example. Many of the classes and many of the 
families and sub-families are listed here. However, the inclusion of one sub-family 
and not another, or the inclusion of one family and not another, does not mean that the 
invention does not encompass polynucleotides or polypeptides of a certain family or 
sub-family. The list provided here is merely an example of the types of transcription 
factors and the knowledge available concerning the consensus sequences and 
consensus DNA-binding site motifs that help define them as known to those of skill in 
the art (each of the references noted above is specifically incorporated herein by
reference). A transcription factor may include, but is not limited to, any polypeptide 
that can activate or repress transcription of a single gene or a number of genes. This 
polypeptide group includes, but is not limited to, DNA-binding proteins, DNA- 
binding protein binding proteins, protein kinases, protein phosphatases, GTP-binding 
proteins, and receptors, and the like. 

In addition to methods for modifying a plant phenotype by employing one or 
more polynucleotides and polypeptides of the invention described herein, the 
polynucleotides and polypeptides of the invention have a variety of additional uses. 
These uses include their use in the recombinant production (i.e., expression) of 
proteins; as regulators of plant gene expression, as diagnostic probes for the presence 
of complementary or partially complementary nucleic acids (including for detection 
of natural coding nucleic acids); as substrates for further reactions, e.g., mutation 
reactions, PCR reactions, or the like; as substrates for cloning e.g., including digestion 
or ligation reactions; and for identifying exogenous or endogenous modulators of the 
transcription factors.

A "polynucleotide" is a nucleic acid sequence comprising a
plurality of polymerized nucleotides, e.g., at least about 15 consecutive polymerized
nucleotides, optionally at least about 30 consecutive nucleotides, or at least about 50
consecutive nucleotides. In many instances, a polynucleotide comprises a nucleotide
sequence encoding a polypeptide (or protein) or a domain or fragment thereof.
Additionally, the polynucleotide may comprise a promoter, an intron, an enhancer 
region, a polyadenylation site, a translation initiation site, 5' or 3' untranslated
regions, a reporter gene, a selectable marker, or the like. The polynucleotide can be 
single stranded or double stranded DNA or RNA. The polynucleotide optionally 
comprises modified bases or a modified backbone. The polynucleotide can be, e.g., 
genomic DNA or RNA, a transcript (such as an mRNA), a cDNA, a PCR product, a 
cloned DNA, a synthetic DNA or RNA, or the like. The polynucleotide can comprise 
a sequence in either sense or antisense orientations. 
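
To make the sense and antisense orientations concrete, the following Python sketch derives the antisense strand of a DNA string as its reverse complement. The function name is hypothetical, and the sketch handles only the four unambiguous DNA bases.

```python
# Translation table for the four unambiguous DNA bases (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string, read 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


# First 12 nucleotides of the G2010 coding sequence as a sense-strand example.
print(reverse_complement("ATGGAGGGTAAG"))
```

The printed antisense string is CTTACCCTCCAT; applying the function twice returns the original sense sequence.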

A "recombinant polynucleotide" is a polynucleotide that is not in its native 
state, e.g., the polynucleotide comprises a nucleotide sequence not found in nature, or 
the polynucleotide is in a context other than that in which it is naturally found, e.g., 
separated from nucleotide sequences with which it typically is in proximity in nature, 
or adjacent (or contiguous with) nucleotide sequences with which it typically is not in 
proximity. For example, the sequence at issue can be cloned into a vector, or 
otherwise recombined with one or more additional nucleic acid. 

An "isolated polynucleotide" is a polynucleotide, whether naturally occurring
or recombinant, that is present outside the cell in which it is typically found in nature,
whether purified or not. Optionally, an isolated polynucleotide is subject to one or
more enrichment or purification procedures, e.g., cell lysis, extraction, centrifugation,
precipitation, or the like. 

A "polypeptide" is an amino acid sequence comprising a plurality of 
consecutive polymerized amino acid residues, e.g., at least about 15 consecutive
polymerized amino acid residues, optionally at least about 30 consecutive
polymerized amino acid residues, or at least about 50 consecutive polymerized amino
acid residues. In many instances, a polypeptide comprises a polymerized amino acid
residue sequence that is a transcription factor or a domain or portion or fragment
thereof. Additionally, the polypeptide may comprise 1) a localization domain, 2) an
activation domain, 3) a repression domain, 4) an oligomerization domain or 5) a
DNA-binding domain, or the like. The polypeptide optionally comprises modified
amino acid residues, naturally occurring amino acid residues not encoded by a codon,
or non-naturally occurring amino acid residues.



A "recombinant polypeptide" is a polypeptide produced by translation of a 
recombinant polynucleotide. A "synthetic polypeptide" is a polypeptide created by 
consecutive polymerization of isolated amino acid residues using methods well 
known in the art. An "isolated polypeptide," whether a naturally occurring or a 
recombinant polypeptide, is more enriched in (or out of) a cell than the polypeptide in 
its natural state in a wild type cell, e.g., more than about 5% enriched, more than 
about 10% enriched, or more than about 20%, or more than about 50%, or more, 
enriched, i.e., alternatively denoted: 105%, 110%, 120%, 150% or more, enriched 
relative to wild type standardized at 100%. Such an enrichment is not the result of a 
natural response of a wild type plant. Alternatively, or additionally, the isolated 
polypeptide is separated from other cellular components with which it is typically 
associated, e.g., by any of the various protein purification methods herein. 

"Identity" or "similarity" refers to sequence similarity between two 
polynucleotide sequences or between two polypeptide sequences, with identity being 
a more strict comparison. The phrases "percent identity" and "% identity" refer to the 
percentage of sequence similarity found in a comparison of two or more 
polynucleotide sequences or two or more polypeptide sequences. Identity or 
similarity can be determined by comparing a position in each sequence that may be 
aligned for purposes of comparison. When a position in the compared sequence is 
occupied by the same nucleotide base or amino acid, then the molecules are identical 
at that position. A degree of similarity or identity between polynucleotide sequences 
is a function of the number of identical or matching nucleotides at positions shared by 
the polynucleotide sequences. A degree of identity of polypeptide sequences is a 
function of the number of identical amino acids at positions shared by the polypeptide 
sequences. A degree of homology or similarity of polypeptide sequences is a function
of the number of similar (i.e., structurally related) amino acids at positions shared by the
polypeptide sequences.
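
As a minimal sketch of the percent identity calculation described above, the following Python function counts identical residues at positions shared by two sequences. It assumes the sequences have already been aligned to equal length with "-" as the gap character, and it takes the positions at which neither sequence has a gap as the denominator. That choice of denominator is an assumption made for this example; alignment programs may use other conventions. The name percent_identity is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of shared (non-gap) aligned positions occupied by the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    shared = 0      # positions where neither sequence has a gap
    identical = 0   # shared positions holding the same base or amino acid
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue
        shared += 1
        if x == y:
            identical += 1
    if shared == 0:
        return 0.0
    return 100.0 * identical / shared


print(percent_identity("ACGT", "ACGA"))    # 75.0
print(percent_identity("AC-GT", "ACTGA"))  # 75.0 (the gapped column is ignored)
```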

"Altered" nucleic acid sequences encoding polypeptide include those 
sequences with deletions, insertions, or substitutions of different nucleotides, resulting
in a polynucleotide encoding a polypeptide with at least one functional characteristic
of the polypeptide. Included within this definition are polymorphisms that may or 
may not be readily detectable using a particular oligonucleotide probe of the 
polynucleotide encoding polypeptide, and improper or unexpected hybridization to 
allelic variants, with a locus other than the normal chromosomal locus for the 
polynucleotide sequence encoding polypeptide. The encoded polypeptide
may also be "altered", and may contain deletions, insertions, or substitutions of amino 
acid residues that produce a silent change and result in a functionally equivalent 
polypeptide. Deliberate amino acid substitutions may be made on the basis of 
similarity in residue side chain chemistry, including, but not limited to, polarity, 
charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic nature of 
the residues, as long as the biological activity of polypeptide is retained. For 
example, negatively charged amino acids may include aspartic acid and glutamic acid, 
positively charged amino acids may include lysine and arginine, and amino acids with 
uncharged polar head groups having similar hydrophilicity values may include 
leucine, isoleucine, and valine; glycine and alanine; asparagine and glutamine; serine 
and threonine; and phenylalanine and tyrosine. Alignments between different 
polypeptide sequences may be used to calculate "percentage sequence similarity". 
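
The substitution groups named in the preceding paragraph can be expressed as a small lookup, sketched here in Python. The groups are taken directly from the text above; the scoring rule (a position counts as similar if the residues are identical or fall in the same group) and the names is_conservative and percent_similarity are assumptions made for this example, not a scoring scheme prescribed by this disclosure.

```python
# Groups of residues treated as interchangeable, as listed in the text
# (one-letter codes): Asp/Glu, Lys/Arg, Leu/Ile/Val, Gly/Ala, Asn/Gln, Ser/Thr, Phe/Tyr.
CONSERVATIVE_GROUPS = [
    set("DE"), set("KR"), set("LIV"), set("GA"), set("NQ"), set("ST"), set("FY"),
]


def is_conservative(a: str, b: str) -> bool:
    """True if the two residues are identical or belong to the same group."""
    a, b = a.upper(), b.upper()
    if a == b:
        return True
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


def percent_similarity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of shared (non-gap) aligned positions that are identical or conservative."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(aligned_a, aligned_b) if gap not in (x, y)]
    if not pairs:
        return 0.0
    similar = sum(1 for x, y in pairs if is_conservative(x, y))
    return 100.0 * similar / len(pairs)


print(percent_similarity("DKLS", "ERIA"))  # 75.0
```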

The term "plant" includes whole plants, shoot vegetative organs/structures
(e.g., leaves, stems and tubers), roots, flowers and floral organs/structures (e.g.,
bracts, sepals, petals, stamens, carpels, anthers and ovules), seed (including embryo,
endosperm, and seed coat) and fruit (the mature ovary), plant tissue (e.g., vascular
tissue, ground tissue, and the like) and cells (e.g., guard cells, egg cells, and the like),
and progeny of same. The class of plants that can be used in the method of the 
invention is generally as broad as the class of higher and lower plants amenable to 
transformation techniques, including angiosperms (monocotyledonous and 
dicotyledonous plants), gymnosperms, ferns, horsetails, psilophytes, lycophytes, 
bryophytes, and multicellular algae. (See for example, Figure 1, adapted from Daly et 
al. 2001 Plant Physiology 127:1328-1333; and see also Tudge, C., The Variety of
Life, Oxford University Press, New York, 2000, pp. 547-606.) 

A "transgenic plant" refers to a plant that contains genetic material not found
in a wild type plant of the same species, variety or cultivar. The genetic material may
include a transgene, an insertional mutagenesis event (such as by transposon or
T-DNA insertional mutagenesis), an activation tagging sequence, a mutated sequence, a
homologous recombination event or a sequence modified by chimeraplasty.
Typically, the foreign genetic material has been introduced into the plant by human 
manipulation, but any method can be used as one of skill in the art recognizes. 

A transgenic plant may contain an expression vector or cassette. The 
expression cassette typically comprises a polypeptide-encoding sequence operably 
linked (i.e., under regulatory control of) to appropriate inducible or constitutive 
regulatory sequences that allow for the expression of polypeptide. The expression 
cassette can be introduced into a plant by transformation or by breeding after 
transformation of a parent plant. A plant refers to a whole plant as well as to a plant 
part, such as seed, fruit, leaf, or root, plant tissue, plant cells or any other plant 
material, e.g., a plant explant, as well as to progeny thereof, and to in vitro systems 
that mimic biochemical or cellular components or processes in a cell. 

"Ectopic expression or altered expression" in reference to a polynucleotide 
indicates that the pattern of expression in, e.g., a transgenic plant or plant tissue, is 
different from the expression pattern in a wild type plant or a reference plant of the 
same species. The pattern of expression may also be compared with a reference 
expression pattern in a wild type plant of the same species. For example, the 
polynucleotide or polypeptide is expressed in a cell or tissue type other than a cell or 
tissue type in which the sequence is expressed in the wild type plant, or by expression 
at a time other than at the time the sequence is expressed in the wild type plant, or by 
a response to different inducible agents, such as hormones or environmental signals, 
or at different expression levels (either higher or lower) compared with those found in 
a wild type plant. The term also refers to altered expression patterns that are produced
by lowering the levels of expression to below the detection level or completely 
abolishing expression. The resulting expression pattern can be transient or stable, 
constitutive or inducible. In reference to a polypeptide, the term "ectopic expression 
or altered expression" further may relate to altered activity levels resulting from the 
interactions of the polypeptides with exogenous or endogenous modulators or from 
interactions with factors or as a result of the chemical modification of the 
polypeptides.

A "fragment" or "domain," with respect to a polypeptide, refers to a
subsequence of the polypeptide. In some cases, the fragment or domain is a
subsequence of the polypeptide which performs at least one biological function of the 
intact polypeptide in substantially the same manner, or to a similar extent, as does the 
intact polypeptide. For example, a polypeptide fragment can comprise a recognizable 
structural motif or functional domain such as a DNA-binding site or domain that 
binds to a DNA promoter region, an activation domain, or a domain for
protein-protein interactions. Fragments can vary in size from as few as 6 amino acids to the
full length of the intact polypeptide, but are preferably at least about 30 amino acids in 
length and more preferably at least about 60 amino acids in length. In reference to a 
polynucleotide sequence, "a fragment" refers to any subsequence of a polynucleotide, 
typically, of at least about 15 consecutive nucleotides, preferably at least about 30 
nucleotides, more preferably at least about 50 nucleotides, of any of the sequences 
provided herein. 

The invention also encompasses production of DNA sequences that encode
transcription factors and transcription factor derivatives, or fragments thereof, entirely 
by synthetic chemistry. After production, the synthetic sequence may be inserted into 
any of the many available expression vectors and cell systems using reagents well 
known in the art. Moreover, synthetic chemistry may be used to introduce mutations 
into a sequence encoding transcription factors or any fragment thereof. 

A "conserved domain", with respect to a polypeptide, refers to a domain 
within a transcription factor family which exhibits a higher degree of sequence 
homology, such as at least 65% sequence identity including conservative 
substitutions, and preferably at least 80% sequence identity, and more preferably at 
least 85%, or at least about 86%, or at least about 87%, or at least about 88%, or at 
least about 90%, or at least about 95%, or at least about 98% amino acid residue 
sequence identity of a polypeptide of consecutive amino acid residues. A fragment or 
domain can be referred to as outside a consensus sequence or outside a consensus 
DNA-binding site that is known to exist or that exists for a particular transcription 
factor class, family, or sub-family. In this case, the fragment or domain will not 
include the exact amino acids of a consensus sequence or consensus DNA-binding
site of a transcription factor class, family or sub-family, or the exact amino acids of a
particular transcription factor consensus sequence or consensus DNA-binding site. 
Furthermore, a particular fragment, region, or domain of a polypeptide, or a 
polynucleotide encoding a polypeptide, can be "outside a conserved domain" if all the 
amino acids of the fragment, region, or domain fall outside of a defined conserved 
domain(s) for a polypeptide or protein. The conserved domains for each of 
polypeptides of SEQ ID NOs:2 - 2N, where N = 2-561, are listed in Table 4 as 
described in Example VII. Also, many of the polypeptides of Table 4 have conserved 
domains specifically indicated by start and stop sites. A comparison of the regions of 
the polypeptides in SEQ ID NOs:2 - 2N, where N = 2-561, or of those in Table 4, 
allows one of skill in the art to identify conserved domain(s) for any of the 
polypeptides listed or referred to in this disclosure, including those in Table 4. 

A "trait" refers to a physiological, morphological, biochemical, or physical 
characteristic of a plant or particular plant material or cell. In some instances, this 
characteristic is visible to the human eye, such as seed or plant size, or can be 
measured by biochemical techniques, such as detecting the protein, starch, or oil 
content of seed or leaves, or by observation of a metabolic or physiological process, 
e.g., by measuring uptake of carbon dioxide, or by the observation of the expression
level of a gene or genes, e.g., by employing Northern analysis, RT-PCR, microarray 
gene expression assays, or reporter gene expression systems, or by agricultural 
observations such as stress tolerance, yield, or pathogen tolerance. Any technique can 
be used to measure the amount of, comparative level of, or difference in any selected 
chemical compound or macromolecule in the transgenic plants, however. 

"Trait modification" refers to a detectable difference in a characteristic in a 
plant ectopically expressing a polynucleotide or polypeptide of the present invention 
relative to a plant not doing so, such as a wild type plant. In some cases, the trait 
modification can be evaluated quantitatively. For example, the trait modification can 
entail at least about a 2% increase or decrease in an observed trait (difference), at least 
a 5% difference, at least about a 10% difference, at least about a 20% difference, at 
least about a 30%, at least about a 50%, at least about a 70%, or at least about a 100%, 
or an even greater difference compared with a wild type plant. It is known that there 
can be a natural variation in the modified trait. Therefore, the trait modification
observed entails a change of the normal distribution of the trait in the plants compared
with the distribution observed in wild type plants.
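
The quantitative evaluation described above can be sketched as a percent difference between the mean trait value of the modified plants and that of wild type plants. This is a minimal illustration only: the function names, the default 10% threshold, and the use of a simple comparison of means are assumptions made for this example. Because natural variation exists, a fuller analysis would compare the two distributions rather than the means alone.

```python
from statistics import mean


def percent_difference(transgenic_values, wild_type_values) -> float:
    """Signed percent change of the transgenic mean relative to the wild type mean."""
    wt = mean(wild_type_values)
    if wt == 0:
        raise ValueError("wild type mean must be non-zero")
    return 100.0 * (mean(transgenic_values) - wt) / wt


def meets_threshold(transgenic_values, wild_type_values, threshold_pct=10.0) -> bool:
    """True if the increase or decrease is at least threshold_pct percent."""
    return abs(percent_difference(transgenic_values, wild_type_values)) >= threshold_pct


# Hypothetical measurements (arbitrary units), for illustration only.
print(percent_difference([12.0, 12.0], [10.0, 10.0]))     # 20.0
print(meets_threshold([9.0], [10.0], threshold_pct=10.0))  # True (a 10% decrease)
```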



I. Traits Which May Be Modified 

Trait modifications of particular interest include those to seed (such as embryo 
or endosperm), fruit, root, flower, leaf, stem, shoot, seedling or the like, including: 
enhanced tolerance to environmental conditions including freezing, chilling, heat, 
drought, water saturation, radiation and ozone; improved tolerance to microbial, 
fungal or viral diseases; improved tolerance to pest infestations, including nematodes, 
mollicutes, parasitic higher plants or the like; decreased herbicide sensitivity; 
improved tolerance of heavy metals or enhanced ability to take up heavy metals; 
improved growth under poor photoconditions (e.g., low light and/or short day length), 
or changes in expression levels of genes of interest. Other phenotypes that can be
modified relate to the production of plant metabolites, such as variations in the 
production of taxol, tocopherol, tocotrienol, sterols, phytosterols, vitamins, wax 
monomers, anti-oxidants, amino acids, lignins, cellulose, tannins, prenyllipids (such 
as chlorophylls and carotenoids), glucosinolates, and terpenoids, enhanced or 
compositionally altered protein or oil production (especially in seeds), or modified 
sugar (insoluble or soluble) and/or starch composition. Physical plant characteristics 
that can be modified include cell development (such as the number of trichomes), fruit 
and seed size and number, yields of plant parts such as stems, leaves, inflorescences,
and roots, the stability of the seeds during storage, characteristics of the seed pod
(e.g., susceptibility to shattering), root hair length and quantity, internode distances, or 
the quality of seed coat. Plant growth characteristics that can be modified include 
growth rate, germination rate of seeds, vigor of plants and seedlings, leaf and flower 
senescence, male sterility, apomixis, flowering time, flower abscission, rate of 
nitrogen uptake, osmotic sensitivity to soluble sugar concentrations, biomass or 
transpiration characteristics, as well as plant architecture characteristics such as apical 
dominance, branching patterns, number of organs, organ identity, organ shape or size. 

II. Transcription Factors Modify Expression Of Endogenous Genes 

Expression of genes which encode transcription factors that modify expression
of endogenous genes, polynucleotides, and proteins is well known in the art. In
addition, transgenic plants comprising isolated polynucleotides encoding transcription
factors may also modify expression of endogenous genes, polynucleotides, and
proteins. Examples include Peng et al. (1997, Genes and Development 11:3194-3205)
and Peng et al. (1999, Nature, 400:256-261). In addition, many others have
demonstrated that an Arabidopsis transcription factor expressed in an exogenous plant 
species elicits the same or very similar phenotypic response. See, for example, Fu et 
al. (2001, Plant Cell 13:1791-1802); Nandi et al. (2000, Curr. Biol. 10:215-218); 
Coupland (1995, Nature 377:482-483); and Weigel and Nilsson (1995, Nature 
377:482-500). 

In another example, Mandel et al. (1992, Cell 71:133-143) and Suzuki et al.
(2001, Plant J. 28:409-418) teach that a transcription factor expressed in another plant
species elicits the same or very similar phenotypic response of the endogenous
sequence, as often predicted in earlier studies of Arabidopsis transcription factors in
Arabidopsis (see Mandel et al., 1992, supra; Suzuki et al., 2001, supra).

Other examples include Müller et al. (2001, Plant J. 28:169-179); Kim et al.
(2001, Plant J. 25:247-259); Kyozuka and Shimamoto (2002, Plant Cell Physiol.
43:130-135); Boss and Thomas (2002, Nature, 416:847-850); He et al. (2000,
Transgenic Res. 9:223-227); and Robson et al. (2001, Plant J. 28:619-631).

In yet another example, Gilmour et al. (1998, Plant J. 16:433-442) teach an 
Arabidopsis AP2 transcription factor, CBF1, which, when overexpressed in transgenic 
plants, increases plant freezing tolerance. Jaglo et al. (2001, Plant Physiol.
127:910-917) further identified sequences in Brassica napus which encode CBF-like genes and
found that transcripts for these genes accumulated rapidly in response to low temperature.
Transcripts encoding CBF-like proteins were also found to accumulate rapidly in 
response to low temperature in wheat, as well as in tomato. An alignment of the CBF 
proteins from Arabidopsis, B. napus, wheat, rye, and tomato revealed the presence of 
conserved amino acid sequences, PKK/RPAGRxKFxETRHP and DSAWR, that 
bracket the AP2/EREBP DNA binding domains of the proteins and distinguish them 
from other members of the AP2/EREBP protein family. (See Jaglo et al., supra.) 
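
For illustration, the two conserved sequences noted above can be written as regular expressions and used to scan a protein sequence. The sketch below assumes that "K/R" denotes either lysine or arginine at that position and that "x" denotes any residue; the function name cbf_signature_spans is hypothetical, and the test sequence is made up for this example and is not a real CBF protein.

```python
import re

# PKK/RPAGRxKFxETRHP: "K/R" read as K or R, "x" read as any single residue (assumption).
N_MOTIF = re.compile(r"PK[KR]PAGR.KF.ETRHP")
C_MOTIF = re.compile(r"DSAWR")


def cbf_signature_spans(protein: str):
    """Return ((start, end), (start, end)) for the two motifs, or None.

    The second motif must occur after the first, so that the pair can
    bracket an intervening region such as a DNA binding domain.
    """
    n_match = N_MOTIF.search(protein)
    if n_match is None:
        return None
    c_match = C_MOTIF.search(protein, n_match.end())
    if c_match is None:
        return None
    return (n_match.span(), c_match.span())


# Made-up sequence: 2 residues, first motif, 12 filler residues, second motif, 1 residue.
toy = "MM" + "PKRPAGRTKFRETRHP" + "G" * 12 + "DSAWR" + "L"
print(cbf_signature_spans(toy))  # ((2, 18), (30, 35))
```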

III. Polypeptides and Polynucleotides of the Invention

The present invention provides, among other things, transcription factors
(TFs), and transcription factor homologue polypeptides, and isolated or recombinant 
polynucleotides encoding the polypeptides, or novel variant polypeptides or 
polynucleotides encoding novel variants of transcription factors derived from the 
specific sequences provided here. These polypeptides and polynucleotides may be 
employed to modify a plant's characteristic. 

Exemplary polynucleotides encoding the polypeptides of the invention were 
identified in the Arabidopsis thaliana GenBank database using publicly available 
sequence analysis programs and parameters. Sequences initially identified were then 
further characterized to identify sequences comprising specified sequence strings 
corresponding to sequence motifs present in families of known transcription factors. 
In addition, further exemplary polynucleotides encoding the polypeptides of the 
invention were identified in the plant GenBank database using publicly available 
sequence analysis programs and parameters. Sequences initially identified were then 
further characterized to identify sequences comprising specified sequence strings 
corresponding to sequence motifs present in families of known transcription factors. 
Polynucleotide sequences meeting such criteria were confirmed as transcription 
factors. 

Additional polynucleotides of the invention were identified by screening 
Arabidopsis thaliana and/or other plant cDNA libraries with probes corresponding to 
known transcription factors under low stringency hybridization conditions. 
Additional sequences, including full length coding sequences, were subsequently
recovered by the rapid amplification of cDNA ends (RACE) procedure, using a
commercially available kit according to the manufacturer's instructions. Where
necessary, multiple rounds of RACE are performed to isolate 5' and 3' ends. The full
length cDNA was then recovered by a routine end-to-end polymerase chain reaction 
(PCR) using primers specific to the isolated 5' and 3' ends. Exemplary sequences are 
provided in the Sequence Listing. 

The polynucleotides of the invention can be or were ectopically expressed in 
overexpressor or knockout plants and the changes in the characteristic(s) or trait(s) of
the plants observed. Therefore, the polynucleotides and polypeptides can be
employed to improve the characteristics of plants. 



The polynucleotides of the invention can be or were ectopically expressed in 
overexpressor plant cells and the changes in the expression levels of a number of 
genes, polynucleotides, and/or proteins of the plant cells observed. Therefore, the 
polynucleotides and polypeptides can be employed to change expression levels of
genes, polynucleotides, and/or proteins of plants.

IV. Producing Polypeptides 

The polynucleotides of the invention include sequences that encode 
transcription factors and transcription factor homologue polypeptides and sequences 
complementary thereto, as well as unique fragments of coding sequence, or sequence 
complementary thereto. Such polynucleotides can be, e.g., DNA or RNA, e.g., 
mRNA, cRNA, synthetic RNA, genomic DNA, cDNA, synthetic DNA,
oligonucleotides, etc. The polynucleotides are either double-stranded or
single-stranded, and include either or both sense (i.e., coding) sequences and antisense (i.e.,
non-coding, complementary) sequences. The polynucleotides include the coding 
sequence of a transcription factor, or transcription factor homologue polypeptide, in 
isolation, in combination with additional coding sequences (e.g., a purification tag, a 
localization signal, as a fusion-protein, as a pre-protein, or the like), in combination 
with non-coding sequences (e.g., introns or inteins, regulatory elements such as 
promoters, enhancers, terminators, and the like), and/or in a vector or host 
environment in which the polynucleotide encoding a transcription factor or 
transcription factor homologue polypeptide is an endogenous or exogenous gene. 

A variety of methods exist for producing the polynucleotides of the invention. 
Procedures for identifying and isolating DNA clones are well known to those of skill 
in the art, and are described in, e.g., Berger and Kimmel, Guide to Molecular Cloning
Techniques, Methods in Enzymology volume 152 Academic Press, Inc., San Diego,
CA ("Berger"); Sambrook et al., Molecular Cloning - A Laboratory Manual (2nd
Ed.), Vol. 1-3, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, 1989
("Sambrook") and Current Protocols in Molecular Biology, F. M. Ausubel et al., eds.,
Current Protocols, a joint venture between Greene Publishing Associates, Inc. and
John Wiley & Sons, Inc., (supplemented through 2000) ("Ausubel").



Alternatively, polynucleotides of the invention can be produced by a variety
of in vitro amplification methods adapted to the present invention by appropriate 
selection of specific or degenerate primers. Examples of protocols sufficient to direct 
persons of skill through in vitro amplification methods, including the polymerase 
chain reaction (PCR), the ligase chain reaction (LCR), Qbeta-replicase amplification
and other RNA polymerase mediated techniques (e.g., NASBA), e.g., for the
production of the homologous nucleic acids of the invention, are found in Berger
(supra), Sambrook (supra), and Ausubel (supra), as well as Mullis et al., (1987) PCR 
Protocols A Guide to Methods and Applications (Innis et al. eds) Academic Press Inc. 
San Diego, CA (1990) (Innis). Improved methods for cloning in vitro amplified 
nucleic acids are described in Wallace et al., U.S. Pat. No. 5,426,039. Improved 
methods for amplifying large nucleic acids by PCR are summarized in Cheng et al. 
(1994) Nature 369: 684-685 and the references cited therein, in which PCR amplicons 
of up to 40kb are generated. One of skill will appreciate that essentially any RNA can 
be converted into a double stranded DNA suitable for restriction digestion, PCR 
expansion and sequencing using reverse transcriptase and a polymerase. See, e.g., 
Ausubel, Sambrook and Berger, all supra. 

Alternatively, polynucleotides and oligonucleotides of the invention can be 
assembled from fragments produced by solid-phase synthesis methods. Typically, 
fragments of up to approximately 100 bases are individually synthesized and then 
enzymatically or chemically ligated to produce a desired sequence, e.g., a 
polynucleotide encoding all or part of a transcription factor. For example, chemical 
synthesis using the phosphoramidite method is described, e.g., by Beaucage et al. 
(1981) Tetrahedron Letters 22:1859-1869; and Matthes et al. (1984) EMBO J.
3:801-805. According to such methods, oligonucleotides are synthesized, purified, annealed
to their complementary strand, ligated and then optionally cloned into suitable
vectors. And if so desired, the polynucleotides and polypeptides of the invention can
be custom ordered from any of a number of commercial suppliers. 
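
As a simple illustration of the fragment size limit mentioned above, the following Python sketch divides a target sequence into consecutive fragments of at most 100 bases. The function name split_into_fragments is hypothetical, and the sketch addresses only the length bound: it does not design the overlapping or complementary oligonucleotides that an actual annealing and ligation scheme would require.

```python
def split_into_fragments(sequence: str, max_len: int = 100) -> list:
    """Split a sequence into consecutive fragments no longer than max_len bases."""
    if max_len < 1:
        raise ValueError("max_len must be at least 1")
    return [sequence[i:i + max_len] for i in range(0, len(sequence), max_len)]


# Hypothetical 242-base target sequence, for illustration only.
target = "ACGT" * 60 + "AC"
fragments = split_into_fragments(target)
print([len(f) for f in fragments])  # [100, 100, 42]
```

Joining the fragments in order reproduces the target sequence, which confirms that no bases are lost or duplicated by the split.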

V. Homologous Sequences 

Sequences homologous, i.e., that share significant sequence identity or 
similarity, to those provided in the Sequence Listing, derived from Arabidopsis
thaliana or from other plants of choice, are also an aspect of the invention.
Homologous sequences can be derived from any plant including monocots and dicots 
and in particular agriculturally important plant species, including but not limited to, 
crops such as soybean, wheat, corn, potato, cotton, rice, rape, oilseed rape (including 
canola), sunflower, alfalfa, sugarcane and turf; or fruits and vegetables, such as 
banana, blackberry, blueberry, strawberry, and raspberry, cantaloupe, carrot, 
cauliflower, coffee, cucumber, eggplant, grapes, honeydew, lettuce, mango, melon, 
onion, papaya, peas, peppers, pineapple, pumpkin, spinach, squash, sweet corn, 
tobacco, tomato, watermelon, rosaceous fruits (such as apple, peach, pear, cherry and 
plum) and vegetable brassicas (such as broccoli, cabbage, cauliflower, Brussels 
sprouts, and kohlrabi). Other crops, fruits and vegetables whose phenotype can be 
changed include barley, rye, millet, sorghum, currant, avocado, citrus fruits such as 
oranges, lemons, grapefruit and tangerines, artichoke, cherries, nuts such as the 
walnut and peanut, endive, leek, roots, such as arrowroot, beet, cassava, turnip, radish, 
yam, and sweet potato, and beans. The homologous sequences may also be derived 
from woody species, such as pine, poplar and eucalyptus, or mint or other labiates.

Orthologs And Paralogs 

Several different methods are known by those of skill in the art for identifying 
and defining these functionally homologous sequences. Three general methods for 
defining paralogs and orthologs are described; a paralog or ortholog or homolog may 
be identified by one or more of the methods described below. 

Orthologs and paralogs are evolutionarily related genes that have similar 
sequence and similar functions. Orthologs are structurally related genes in different 
species that are derived from a speciation event. Paralogs are structurally related 
genes within a single species that are derived by a duplication event. 

Within a single plant species, gene duplication may cause two copies of a 
particular gene, giving rise to two or more genes with similar sequence and similar 
function known as paralogs. A paralog is therefore a similar gene with a similar 
function within the same species. Paralogs typically cluster together or in the same
clade (a group of similar genes) when a gene family phylogeny is analyzed using
programs such as CLUSTAL (Thompson et al. (1994) Nucleic Acids Res.
22:4673-4680; Higgins et al. (1996) Methods Enzymol. 266:383-402). Groups of similar
genes can also be identified with pair-wise BLAST analysis (Feng and Doolittle 
(1987) J. Mol. Evol. 25:351-360). For example, a clade of very similar MADS 
domain transcription factors from Arabidopsis all share a common function in 
flowering time (Ratcliffe et al. (2001) Plant Physiol. 126:122-132), and a group of 
very similar AP2 domain transcription factors from Arabidopsis are involved in 
tolerance of plants to freezing (Gilmour et al. (1998) Plant J. 16:433-442). Analysis 
of groups of similar genes with similar function that fall within one clade can yield 
sub-sequences that are particular to the clade. These sub-sequences, known as 
consensus sequences, can not only be used to define the sequences within each clade, 
but define the functions of these genes; genes within a clade may contain paralogous 
or orthologous sequences that share the same function. (See also, for example, Mount, 
D. W. (2001) Bioinformatics: Sequence and Genome Analysis Cold Spring Harbor
Laboratory Press, Cold Spring Harbor, New York page 543.) 

Speciation, the production of new species from a parental species, can also 
give rise to two or more genes with similar sequence and similar function. These 
genes, termed orthologs, often have an identical function within their host plants and 
are often interchangeable between species without losing function. Because plants 
have common ancestors, many genes in any plant species will have a corresponding 
orthologous gene in another plant species. Once a phylogenetic tree for a gene family
of one species has been constructed using a program such as CLUSTAL (Thompson
et al. (1994) Nucleic Acids Res. 22:4673-4680; Higgins et al. (1996) Methods
Enzymol. 266:383-402), potential orthologous sequences can be placed into the
phylogenetic tree and their relationship to genes from the species of interest can be
determined. Once the ortholog pair has been identified, the function of the test 
ortholog can be determined by determining the function of the reference ortholog. 

Transcription factors that are homologous to the listed sequences will typically 
share at least about 30% amino acid sequence identity, or at least about 30% amino 
acid sequence identity outside of a known consensus sequence or consensus
DNA-binding site. More closely related transcription factors can share at least about 50%,
about 60%, about 65%, about 70%, about 75% or about 80% or about 90% or about
95% or about 98% or more sequence identity with the listed sequences, or with the
listed sequences but excluding or outside a known consensus sequence or consensus
DNA-binding site, or with the listed sequences excluding one or all conserved
domains. Factors that are most closely related to the listed sequences share, e.g., at
least about 85%, about 90% or about 95% or more sequence identity to the listed
sequences, or to the listed sequences but excluding or outside a known consensus
sequence or consensus DNA-binding site or outside one or all conserved domains. At
the nucleotide level the sequences will typically share at least about 40% nucleotide 
sequence identity, preferably at least about 50%, about 60%, about 70% or about 80% 
sequence identity, and more preferably about 85%, about 90%, about 95% or about 
97% or more sequence identity to one or more of the listed sequences, or to a listed 
sequence but excluding or outside a known consensus sequence or consensus DNA- 
binding site, or outside one or all conserved domain. The degeneracy of the genetic 
code enables major variations in the nucleotide sequence of a polynucleotide while . 
maintaining the amino acid sequence of the encoded protein. Conserved domains 
within a transcription factor family may exhibit a higher degree of sequence 
homology, such as at least 65% sequence identity including conservative 
substitutions, and preferably at least 80% sequence identity, and more preferably at 
least 85%>, or at least about 86%, or at least about 87%, or at least about 88%, or at 
least about 90%, or at least about 95%, or at least about 98% sequence identity. 
Transcription factors that are homologous to the listed sequences should share at least 
30%, or at least about 60%, or at least about 75%, or at least about 80%, or at least 
about 90%, or at least about 95% amino acid sequence identity over the entire length 
of the polypeptide or the homolog. In addition, transcription factors that are 
homologous to the listed sequences should share at least 30%, or at least about 60%, 
or at least about 75%, or at least about 80%, or at least about 90%, or at least about 
95% amino acid sequence similarity over the entire length of the polypeptide or the 
homolog. 
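
The amino acid identity tiers described above can be sketched as a simple lookup. This is only an illustration: the function name and tier labels are hypothetical, and the cut-offs (30%, 50%, 85%) are assumed to be the lowest value quoted for each tier in the text.

```python
def relatedness_tier(percent_identity: float) -> str:
    """Map an amino acid percent identity to the tiers quoted in the text.

    The tier labels are illustrative; the cut-offs are the lowest values
    given for each tier (30%, 50% and 85%).
    """
    if not 0.0 <= percent_identity <= 100.0:
        raise ValueError("percent identity must be between 0 and 100")
    if percent_identity >= 85.0:
        return "most closely related"
    if percent_identity >= 50.0:
        return "more closely related"
    if percent_identity >= 30.0:
        return "homologous"
    return "below the stated homology threshold"


if __name__ == "__main__":
    for value in (92.0, 60.0, 30.0, 12.0):
        print(value, relatedness_tier(value))
```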

Percent identity can be determined electronically, e.g., by using the 
MEGALIGN program (DNASTAR, Inc. Madison, Wis.). The MEGALIGN program 
can create alignments between two or more sequences according to different methods, 
e.g., the clustal method. (See, e.g., Higgins, D. G. and P. M. Sharp (1988) Gene
73:237-244.) The clustal algorithm groups sequences into clusters by examining the
distances between all pairs. The clusters are aligned pairwise and then in groups.
Other alignment algorithms or programs may be used, including FASTA, BLAST, or
ENTREZ. FASTA and BLAST are available as a part of the GCG sequence
analysis package (University of Wisconsin, Madison, Wis.), and can be used with or 
without default settings. ENTREZ is available through the National Center for 
Biotechnology Information. In one embodiment, the percent identity of two 
sequences can be determined by the GCG program with a gap weight of 1, e.g., each 
amino acid gap is weighted as if it were a single amino acid or nucleotide mismatch 
between the two sequences (see USPN 6,262,333). 
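
One way to read the gap weight of 1 described above is that every gap position in a pairwise alignment is scored like a single mismatch. The sketch below follows that reading. It is not the GCG implementation; the function name and the use of "-" as the gap character are assumptions made for illustration.

```python
def percent_identity_gap_as_mismatch(aligned_a: str, aligned_b: str,
                                     gap: str = "-") -> float:
    """Percent identity where each gap column counts as one mismatch.

    Both inputs are rows of the same pairwise alignment, so they must
    have equal length. Identical residues are counted over all columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    identical = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * identical / len(aligned_a)


if __name__ == "__main__":
    # One gap in six columns: five identical residues out of six columns.
    print(percent_identity_gap_as_mismatch("ACD-FG", "ACDEFG"))
```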

Other techniques for alignment are described in Methods in Enzymology, vol. 
266: Computer Methods for Macromolecular Sequence Analysis (1996), ed. Doolittle, 
Academic Press, Inc., San Diego, Calif., USA. Preferably, an alignment program that 
permits gaps in the sequence is utilized to align the sequences. The Smith-Waterman
algorithm is one type of algorithm that permits gaps in sequence alignments. See
Methods Mol. Biol. 70:173-187 (1997). Also, the GAP program using the Needleman and
Wunsch alignment method can be utilized to align sequences. An alternative search
strategy uses MPSRCH software, which runs on a MASPAR computer. MPSRCH uses a
Smith-Waterman algorithm to score sequences on a massively parallel computer.
This approach improves the ability to pick up distantly related matches, and is especially
tolerant of small gaps and nucleotide sequence errors. Nucleic acid-encoded amino 
acid sequences can be used to search both protein and DNA databases. 
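
As an illustration of how a gap-permitting local alignment is scored, a minimal Smith-Waterman sketch follows. The scoring values (match +2, mismatch -1, linear gap penalty -2) are assumptions chosen for the example and are not taken from MPSRCH, GAP, or any other program named above.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Return the best local alignment score between sequences a and b.

    Uses the Smith-Waterman recurrence with a linear gap penalty and
    keeps only two rows of the scoring matrix in memory.
    """
    previous = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        current = [0]
        for j in range(1, len(b) + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score = max(
                0,                      # start a new local alignment
                previous[j - 1] + pair,  # align a[i-1] with b[j-1]
                previous[j] + gap,       # gap in b
                current[j - 1] + gap,    # gap in a
            )
            current.append(score)
            best = max(best, score)
        previous = current
    return best


if __name__ == "__main__":
    # The second sequence carries one extra residue; the best local
    # alignment spans it with a single gap.
    print(smith_waterman_score("ACGTACGT", "ACGTTACGT"))
```

The single inserted residue costs one gap penalty instead of ending the alignment, which is the behaviour the text refers to as being tolerant of small gaps.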

The percentage similarity between two polypeptide sequences, e.g., sequence 
A and sequence B, is calculated by dividing the length of sequence A, minus the 
number of gap residues in sequence A, minus the number of gap residues in sequence 
B, into the sum of the residue matches between sequence A and sequence B, times 
one hundred. Gaps of low or of no similarity between the two amino acid sequences 
are not included in determining percentage similarity. Percent identity between 
polynucleotide sequences can also be counted or calculated by other methods known 
in the art, e.g., the Jotun Hein method. (See, e.g., Hein, J. (1990) Methods Enzymol. 
183:626-645.) Identity between sequences can also be determined by other methods
known in the art, e.g., by varying hybridization conditions (see US Patent Application
No. 20010010913). 
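
The percentage similarity calculation described above can be sketched as follows. Two assumptions are made and should be read as such: "length of sequence A" is taken to be the aligned length of A including gap characters, and a "residue match" is taken to be an identical residue (a similarity matrix could be substituted). The function name and the "-" gap character are hypothetical.

```python
def percent_similarity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage similarity as described in the text.

    matches / (length of A - gaps in A - gaps in B) * 100, where both
    inputs are rows of one pairwise alignment.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    denominator = len(aligned_a) - aligned_a.count(gap) - aligned_b.count(gap)
    if denominator <= 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * matches / denominator


if __name__ == "__main__":
    # Seven columns, one gap in each row: five compared positions.
    print(percent_similarity("ACD-EFG", "ACDHE-G"))
```

Because gap positions are removed from the denominator, gapped regions do not lower the result, consistent with the statement that gaps are not included in determining percentage similarity.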

Thus, the invention provides methods for identifying a sequence similar or 
paralogous or orthologous or homologous to one or more polynucleotides as noted 
herein, or one or more target polypeptides encoded by the polynucleotides, or 
otherwise noted herein and may include linking or associating a given plant 
phenotype or gene function with a sequence. In the methods, a sequence database is 
provided (locally or across an internet or intranet) and a query is made against the
sequence database using the relevant sequences herein and associated plant 
phenotypes or gene functions. 

In addition, one or more polynucleotide sequences or one or more 
polypeptides encoded by the polynucleotide sequences may be used to search against 
a BLOCKS (Bairoch et al. (1997) Nucleic Acids Res. 25:217-221), PFAM, and other
databases which contain previously identified and annotated motifs, sequences and 
gene functions. Methods that search for primary sequence patterns with secondary 
structure gap penalties (Smith et al. (1992) Protein Engineering 5:35-51) as well as 
algorithms such as Basic Local Alignment Search Tool (BLAST; Altschul, S. F.
(1993) J. Mol. Evol. 36:290-300; Altschul et al. (1990) supra), BLOCKS (Henikoff,
S. and Henikoff, G. J. (1991) Nucleic Acids Research 19:6565-6572), Hidden Markov
Models (HMM; Eddy, S. R. (1996) Cur. Opin. Str. Biol. 6:361-365; Sonnhammer et
al. (1997) Proteins 28:405-420), and the like, can be used to manipulate and analyze 
polynucleotide and polypeptide sequences encoded by polynucleotides. These 
databases, algorithms and other methods are well known in the art and are described 
in Ausubel et al. (1997; Short Protocols in Molecular Biology, John Wiley & Sons, 
New York N.Y., unit 7.7) and in Meyers, R. A. (1995; Molecular Biology and 
Biotechnology, Wiley VCH, New York N.Y., p 856-853). 

Furthermore, methods using manual alignment of sequences similar or 
homologous to one or more polynucleotide sequences or one or more polypeptides 
encoded by the polynucleotide sequences may be used to identify regions of similarity
and conserved domains. Such manual methods are well known to those of skill in the
art and can include, for example, comparisons of tertiary structure between a
polypeptide sequence encoded by a polynucleotide which comprises a known function
with a polypeptide sequence encoded by a polynucleotide sequence which has a 
function not yet determined. Such examples of tertiary structure may comprise 
predicted alpha helices, beta-sheets, amphipathic helices, leucine zipper motifs, zinc 
finger motifs, proline-rich regions, cysteine repeat motifs, and the like. 

VI. Identifying Polynucleotides or Nucleic Acids by Hybridization

Polynucleotides homologous to the sequences illustrated in the Sequence 
Listing and tables can be identified, e.g., by hybridization to each other under 
stringent or under highly stringent conditions. Single stranded polynucleotides 
hybridize when they associate based on a variety of well characterized physical- 
chemical forces, such as hydrogen bonding, solvent exclusion, base stacking and the 
like. The stringency of a hybridization reflects the degree of sequence identity of the 
nucleic acids involved, such that the higher the stringency, the more similar are the 
two polynucleotide strands. Stringency is influenced by a variety of factors, including 
temperature, salt concentration and composition, organic and non-organic additives,
solvents, etc. present in both the hybridization and wash solutions and incubations
(and number thereof), as described in more detail in the references cited above. 
Encompassed by the invention are polynucleotide sequences that are capable of 
hybridizing to the claimed polynucleotide sequences, and, in particular, to those 
shown in SEQ ID NOs: 860; 802; 240; 274; 558; 24; 1120; 44; 460; 286; 120; 130; 
134; 698; 832; 580; 612; 48, and fragments thereof under various conditions of 
stringency. (See, e.g., Wahl, G. M. and S. L. Berger (1987) Methods Enzymol. 
152:399-407; Kimmel, A. R. (1987) Methods Enzymol. 152:507-511.) Estimates of
homology are provided by either DNA-DNA or DNA-RNA hybridization under
conditions of stringency as is well understood by those skilled in the art (Hames and
Higgins, Eds. (1985) Nucleic Acid Hybridisation, IRL Press, Oxford, U.K.).
Stringency conditions can be adjusted to screen for moderately similar fragments, 
such as homologous sequences from distantly related organisms, to highly similar 
fragments, such as genes that duplicate functional enzymes from closely related 
organisms. Post-hybridization washes determine stringency conditions. 

In addition to the nucleotide sequences listed in Tables 4 and 5, full length 
cDNA, orthologs, paralogs and homologs of the present nucleotide sequences may be
identified and isolated using well known methods. The cDNA libraries, orthologs,
paralogs and homologs of the present nucleotide sequences may be screened using
hybridization methods to determine their utility as hybridization targets or
amplification probes.

An example of stringent hybridization conditions for hybridization of 
complementary nucleic acids which have more than 100 complementary residues on a 
filter in a Southern or northern blot is about 5°C to 20°C lower than the thermal
melting point (Tm) for the specific sequence at a defined ionic strength and pH. The
Tm is the temperature (under defined ionic strength and pH) at which 50% of the
target sequence hybridizes to a perfectly matched probe. Nucleic acid molecules that
hybridize under stringent conditions will typically hybridize to a probe based on either
the entire cDNA or selected portions, e.g., to a unique subsequence, of the cDNA
under wash conditions of 0.2x SSC to 2.0x SSC, 0.1% SDS at 50-65° C. For
example, high stringency is about 0.2x SSC, 0.1% SDS at 65° C. Ultra-high
stringency will be the same conditions except the wash temperature is raised about 3 
to about 5° C, and ultra-ultra-high stringency will be the same conditions except the 
wash temperature is raised about 6 to about 9° C. For identification of less closely 
related homologues washes can be performed at a lower temperature, e.g., 50° C. In 
general, stringency is increased by raising the wash temperature and/or decreasing the 
concentration of SSC, as known in the art. 
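
The temperature arithmetic in the preceding paragraph can be written out as a short sketch. The function names and tier keys are hypothetical. The numbers are those given in the text: hybridization about 5 to 20° C below the Tm, a high stringency wash at 65° C, and the two higher tiers defined as increments above that wash temperature.

```python
HIGH_STRINGENCY_WASH_C = 65.0

# Wash temperature increments (deg C) above the high stringency wash.
WASH_INCREMENTS_C = {
    "high": (0.0, 0.0),
    "ultra-high": (3.0, 5.0),
    "ultra-ultra-high": (6.0, 9.0),
}


def wash_temperature_range(tier: str) -> tuple:
    """Return the (lowest, highest) wash temperature in deg C for a tier."""
    low, high = WASH_INCREMENTS_C[tier]
    return (HIGH_STRINGENCY_WASH_C + low, HIGH_STRINGENCY_WASH_C + high)


def hybridization_window(tm_c: float) -> tuple:
    """Return the stringent hybridization range, 20 to 5 deg C below the Tm."""
    return (tm_c - 20.0, tm_c - 5.0)


if __name__ == "__main__":
    for tier in WASH_INCREMENTS_C:
        print(tier, wash_temperature_range(tier))
    # The Tm of 85 deg C is an arbitrary illustrative value.
    print(hybridization_window(85.0))
```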

In another example, stringent salt concentration will ordinarily be less than 
about 750 mM NaCl and 75 mM trisodium citrate, preferably less than about 500 mM 
NaCl and 50 mM trisodium citrate, and most preferably less than about 250 mM NaCl 
and 25 mM trisodium citrate. Low stringency hybridization can be obtained in the 
absence of organic solvent, e.g., formamide, while high stringency hybridization can 
be obtained in the presence of at least about 35% formamide, and most preferably at 
least about 50% formamide. Stringent temperature conditions will ordinarily include 
temperatures of at least about 30° C, more preferably of at least about 37° C, and most 
preferably of at least about 42° C. Varying additional parameters, such as 
hybridization time, the concentration of detergent, e.g., sodium dodecyl sulfate (SDS), 
and the inclusion or exclusion of carrier DNA, are well known to those skilled in the
art. Various levels of stringency are accomplished by combining these various
conditions as needed. In a preferred embodiment, hybridization will occur at 30° C in
750 mM NaCl, 75 mM trisodium citrate, and 1% SDS. In a more preferred
embodiment, hybridization will occur at 37° C in 500 mM NaCl, 50 mM trisodium
citrate, 1% SDS, 35% formamide, and 100 µg/ml denatured salmon sperm DNA
(ssDNA). In a most preferred embodiment, hybridization will occur at 42° C in 250 
mM NaCl, 25 mM trisodium citrate, 1% SDS, 50% formamide, and 200 ng/ml 
ssDNA. Useful variations on these conditions will be readily apparent to those skilled 
in the art. 

The washing steps that follow hybridization can also vary in stringency. Wash 
stringency conditions can be defined by salt concentration and by temperature. As 
above, wash stringency can be increased by decreasing salt concentration or by 
increasing temperature. For example, stringent salt concentration for the wash steps 
will preferably be less than about 30 mM NaCl and 3 mM trisodium citrate, and most 
preferably less than about 15 mM NaCl and 1.5 mM trisodium citrate. Stringent 
temperature conditions for the wash steps will ordinarily include a temperature of at
least about 25° C, more preferably of at least about 42° C. Another preferred set of
highly stringent conditions uses two final washes in 0.1x SSC, 0.1% SDS at 65° C.
The most preferred high stringency washes are of at least about 68° C. For example, 
in a preferred embodiment, wash steps will occur at 25° C in 30 mM NaCl, 3 mM 
trisodium citrate, and 0.1% SDS. In a more preferred embodiment, wash steps will 
occur at 42° C in 15 mM NaCl, 1.5 mM trisodium citrate, and 0.1% SDS. In a most 
preferred embodiment, the wash steps will occur at 68° C in 15 mM NaCl, 1.5 mM 
trisodium citrate, and 0.1% SDS. Additional variations on these conditions will be 
readily apparent to those skilled in the art (see U.S. Patent Application No. 
20010010913). 
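
The salt concentrations quoted for hybridization and washing keep the 10:1 NaCl to trisodium citrate ratio of SSC buffer, so each pair can be expressed as a multiple of 1x SSC (150 mM NaCl, 15 mM trisodium citrate). The conversion below is a sketch only; the function name is hypothetical.

```python
import math

NACL_MM_PER_1X_SSC = 150.0
CITRATE_MM_PER_1X_SSC = 15.0


def ssc_strength(nacl_mm: float, citrate_mm: float) -> float:
    """Express a NaCl / trisodium citrate pair as a multiple of 1x SSC.

    Raises ValueError if the two salts do not imply the same multiple.
    """
    from_nacl = nacl_mm / NACL_MM_PER_1X_SSC
    from_citrate = citrate_mm / CITRATE_MM_PER_1X_SSC
    if not math.isclose(from_nacl, from_citrate, rel_tol=1e-6):
        raise ValueError("salt pair is not in the 10:1 ratio of SSC")
    return from_nacl


if __name__ == "__main__":
    # Hybridization and wash salt pairs taken from the text.
    for nacl, citrate in ((750, 75), (500, 50), (250, 25), (30, 3), (15, 1.5)):
        print(nacl, citrate, round(ssc_strength(nacl, citrate), 2))
```

On this reading, the 30 mM NaCl / 3 mM citrate wash corresponds to 0.2x SSC and the 15 mM NaCl / 1.5 mM citrate wash to 0.1x SSC, matching the SSC values used elsewhere in this section.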

As another example, stringent conditions can be selected such that an 
oligonucleotide that is perfectly complementary to the coding oligonucleotide 
hybridizes to the coding oligonucleotide with at least about a 5-10x higher signal to
noise ratio than the ratio for hybridization of the perfectly complementary
oligonucleotide to a nucleic acid encoding a transcription factor known as of the filing
date of the application. Conditions can be selected such that a higher signal to noise
ratio is observed in the particular assay which is used, e.g., about 15x, 25x, 35x, 50x
or more. Accordingly, the subject nucleic acid hybridizes to the unique coding
oligonucleotide with at least a 2x higher signal to noise ratio as compared to
hybridization of the coding oligonucleotide to a nucleic acid encoding a known
polypeptide. Again, higher signal to noise ratios can be selected, e.g., about 5x, 10x,
25x, 35x, 50x or more. The particular signal will depend on the label used in the 
relevant assay, e.g., a fluorescent label, a colorimetric label, a radioactive label, or the 
like. 

Alternatively, transcription factor homolog polypeptides can be obtained by 
screening an expression library using antibodies specific for one or more transcription 
factors. With the provision herein of the disclosed transcription factor, and 
transcription factor homologue nucleic acid sequences, the encoded polypeptide(s) 
can be expressed and purified in a heterologous expression system (e.g., E. coli) and 
used to raise antibodies (monoclonal or polyclonal) specific for the polypeptide(s) in 
question. Antibodies can also be raised against synthetic peptides derived from 
transcription factor, or transcription factor homologue, amino acid sequences. 
Methods of raising antibodies are well known in the art and are described in Harlow 
and Lane (1988) Antibodies: A Laboratory Manual Cold Spring Harbor Laboratory, 
New York. Such antibodies can then be used to screen an expression library 
produced from the plant from which it is desired to clone additional transcription 
factor homologues, using the methods described above. The selected cDNAs can be 
confirmed by sequencing and enzymatic activity. 

VII. Sequence Variations 

It will readily be appreciated by those of skill in the art, that any of a variety of 
polynucleotide sequences are capable of encoding the transcription factors and 
transcription factor homologue polypeptides of the invention. Due to the degeneracy 
of the genetic code, many different polynucleotides can encode identical and/or 
substantially similar polypeptides in addition to those sequences illustrated in the 
Sequence Listing. Nucleic acids having a sequence that differs from the sequences 
shown in the Sequence Listing, or complementary sequences, that encode functionally 
equivalent peptides (i.e., peptides having some degree of equivalent or similar
biological activity) but differ in sequence from the sequence shown in the sequence
listing due to degeneracy in the genetic code, are also within the scope of the 
invention. 

Altered polynucleotide sequences encoding polypeptides include those 
sequences with deletions, insertions, or substitutions of different nucleotides, resulting 
in a polynucleotide encoding a polypeptide with at least one functional characteristic 
of the instant polypeptides. Included within this definition are polymorphisms which 
may or may not be readily detectable using a particular oligonucleotide probe of the 
polynucleotide encoding the instant polypeptides, and improper or unexpected 
hybridization to allelic variants, with a locus other than the normal chromosomal 
locus for the polynucleotide sequence encoding the instant polypeptides. 

Allelic variant refers to any of two or more alternative forms of a gene 
occupying the same chromosomal locus. Allelic variation arises naturally through 
mutation, and may result in phenotypic polymorphism within populations. Gene 
mutations can be silent (i.e., no change in the encoded polypeptide) or may encode 
polypeptides having altered amino acid sequence. The term allelic variant is also used 
herein to denote a protein encoded by an allelic variant of a gene. Splice variant refers 
to alternative forms of RNA transcribed from a gene. Splice variation arises naturally 
through use of alternative splicing sites within a transcribed RNA molecule, or less 
commonly between separately transcribed RNA molecules, and may result in several 
mRNAs transcribed from the same gene. Splice variants may encode polypeptides 
having altered amino acid sequence. The term splice variant is also used herein to 
denote a protein encoded by a splice variant of an mRNA transcribed from a gene. 

Those skilled in the art would recognize that the polypeptide sequence G681, 
SEQ ID NO: 580, represents a single transcription factor; allelic variation and 
alternative splicing may be expected to occur. Allelic variants of the polypeptide 
sequence of SEQ ID NO: 579 can be cloned by probing cDNA or genomic libraries 
from different individual organisms according to standard procedures. Allelic 
variants of the DNA sequence shown in SEQ ID NO: 579, including those containing 
silent mutations and those in which mutations result in amino acid sequence changes, 
are within the scope of the present invention, as are proteins which are allelic variants
of SEQ ID NO: 580. cDNAs generated from alternatively spliced mRNAs, which
retain the properties of the transcription factor are included within the scope of the 
present invention, as are polypeptides encoded by such cDNAs and mRNAs. Allelic 
variants and splice variants of these sequences can be cloned by probing cDNA or 
genomic libraries from different individual organisms or tissues according to standard 
procedures known in the art (see USPN 6,388,064). 

For example, Table 1 illustrates that the codons AGC, AGT, TCA, TCC,
TCG, and TCT all encode the same amino acid: serine. Accordingly, at each position
in the sequence where there is a codon encoding serine, any of the above trinucleotide
sequences can be used without altering the encoded polypeptide.

Table 1

Amino acid                Possible Codons

Alanine        Ala   A    GCA  GCC  GCG  GCT
Cysteine       Cys   C    TGC  TGT
Aspartic acid  Asp   D    GAC  GAT
Glutamic acid  Glu   E    GAA  GAG
Phenylalanine  Phe   F    TTC  TTT
Glycine        Gly   G    GGA  GGC  GGG  GGT
Histidine      His   H    CAC  CAT
Isoleucine     Ile   I    ATA  ATC  ATT
Lysine         Lys   K    AAA  AAG
Leucine        Leu   L    TTA  TTG  CTA  CTC  CTG  CTT
Methionine     Met   M    ATG
Asparagine     Asn   N    AAC  AAT
Proline        Pro   P    CCA  CCC  CCG  CCT
Glutamine      Gln   Q    CAA  CAG
Arginine       Arg   R    AGA  AGG  CGA  CGC  CGG  CGT
Serine         Ser   S    AGC  AGT  TCA  TCC  TCG  TCT
Threonine      Thr   T    ACA  ACC  ACG  ACT
Valine         Val   V    GTA  GTC  GTG  GTT
Tryptophan     Trp   W    TGG
Tyrosine       Tyr   Y    TAC  TAT

Sequence alterations that do not change the amino acid sequence encoded by 
the polynucleotide are termed "silent" variations. With the exception of the codons 
ATG and TGG, encoding methionine and tryptophan, respectively, any of the possible 
codons for the same amino acid can be substituted by a variety of techniques, e.g., 
site-directed mutagenesis, available in the art. Accordingly, any and all such
variations of a sequence selected from the above table are a feature of the invention.

In addition to silent variations, other conservative variations that alter one, or a
few amino acids in the encoded polypeptide, can be made without altering the
function of the polypeptide; these conservative variants are, likewise, a feature of the
invention.
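
The notion of a silent variation described above can be illustrated with a short sketch that translates two coding sequences using the codons of Table 1 and checks that the encoded amino acids are unchanged. The function names are hypothetical, and stop codons are deliberately left out because Table 1 does not list them.

```python
# Codons per one-letter amino acid code, as listed in Table 1 (DNA alphabet).
CODONS_BY_AMINO_ACID = {
    "A": "GCA GCC GCG GCT", "C": "TGC TGT", "D": "GAC GAT", "E": "GAA GAG",
    "F": "TTC TTT", "G": "GGA GGC GGG GGT", "H": "CAC CAT",
    "I": "ATA ATC ATT", "K": "AAA AAG", "L": "TTA TTG CTA CTC CTG CTT",
    "M": "ATG", "N": "AAC AAT", "P": "CCA CCC CCG CCT", "Q": "CAA CAG",
    "R": "AGA AGG CGA CGC CGG CGT", "S": "AGC AGT TCA TCC TCG TCT",
    "T": "ACA ACC ACG ACT", "V": "GTA GTC GTG GTT", "W": "TGG",
    "Y": "TAC TAT",
}

# Invert the table: codon -> one-letter amino acid code.
CODON_TO_AA = {
    codon: aa
    for aa, codons in CODONS_BY_AMINO_ACID.items()
    for codon in codons.split()
}


def translate(dna: str) -> str:
    """Translate a coding sequence (no stop codon) into one-letter codes."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_silent_variation(original: str, variant: str) -> bool:
    """True if the variant differs in nucleotides but not in encoded protein."""
    return original.upper() != variant.upper() and translate(original) == translate(variant)


if __name__ == "__main__":
    # AGC and TCT both encode serine, so the change is silent.
    print(is_silent_variation("ATGAGCTGG", "ATGTCTTGG"))
```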

For example, substitutions, deletions and insertions introduced into the 
sequences provided in the Sequence Listing are also envisioned by the invention. 
Such sequence modifications can be engineered into a sequence by site-directed 
mutagenesis (Wu (ed.) Meth. Enzymol. (1993) vol. 217, Academic Press) or the other
methods noted below. Amino acid substitutions are typically of single residues; 
insertions usually will be on the order of about from 1 to 10 amino acid residues; and 
deletions will range about from 1 to 30 residues. In preferred embodiments, deletions 
or insertions are made in adjacent pairs, e.g., a deletion of two residues or insertion of 
two residues. Substitutions, deletions, insertions or any combination thereof can be 
combined to arrive at a sequence. The mutations that are made in the polynucleotide 
encoding the transcription factor should not place the sequence out of reading frame 
and should not create complementary regions that could produce secondary mRNA 
structure. Preferably, the polypeptide encoded by the DNA performs the desired 
function. 
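
The requirement that insertions and deletions not place the sequence out of reading frame amounts to the length change being a multiple of three nucleotides. A minimal sketch follows; the function names are hypothetical, and the size limits are the approximate ranges quoted above (insertions of about 1 to 10 residues, deletions of about 1 to 30 residues).

```python
def keeps_reading_frame(original_cds: str, mutated_cds: str) -> bool:
    """True if the net length change is a whole number of codons."""
    return (len(mutated_cds) - len(original_cds)) % 3 == 0


def indel_size_in_residues(original_cds: str, mutated_cds: str) -> int:
    """Net residues inserted (positive) or deleted (negative)."""
    if not keeps_reading_frame(original_cds, mutated_cds):
        raise ValueError("length change would shift the reading frame")
    return (len(mutated_cds) - len(original_cds)) // 3


def within_typical_range(original_cds: str, mutated_cds: str) -> bool:
    """Check the net change against the ranges quoted in the text."""
    residues = indel_size_in_residues(original_cds, mutated_cds)
    if residues > 0:
        return residues <= 10   # insertions: about 1 to 10 residues
    if residues < 0:
        return -residues <= 30  # deletions: about 1 to 30 residues
    return True                 # substitutions only, no length change


if __name__ == "__main__":
    print(keeps_reading_frame("ATGGCATGG", "ATGGCAGCATGG"))  # one codon added
    print(keeps_reading_frame("ATGGCATGG", "ATGGCATG"))      # one base lost
```

This check looks only at net length; it does not detect compensating frameshifts within the sequence or the secondary mRNA structure mentioned in the text.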

Conservative substitutions are those in which at least one residue in the amino
acid sequence has been removed and a different residue inserted in its place. Such
substitutions generally are made in accordance with Table 2 when it is desired to
maintain the activity of the protein. Table 2 shows amino acids which can be
substituted for an amino acid in a protein and which are typically regarded as
conservative substitutions.

Table 2

Residue    Conservative Substitutions

Ala        Ser
Arg        Lys
Asn        Gln; His
Asp        Glu
Gln        Asn
Cys        Ser
Glu        Asp
Gly        Pro
His        Asn; Gln
Ile        Leu; Val
Leu        Ile; Val
Lys        Arg; Gln
Met        Leu; Ile
Phe        Met; Leu; Tyr
Ser        Thr; Gly
Thr        Ser; Val
Trp        Tyr
Tyr        Trp; Phe
Val        Ile; Leu
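
Table 2 can be treated as a lookup from a residue to its listed conservative replacements. The sketch below encodes the table with standard three-letter codes; the function name is hypothetical, and the lookup is applied in the listed direction only, which is an assumption of this example.

```python
# Table 2: residue -> replacements typically regarded as conservative.
CONSERVATIVE_SUBSTITUTIONS = {
    "Ala": {"Ser"}, "Arg": {"Lys"}, "Asn": {"Gln", "His"}, "Asp": {"Glu"},
    "Gln": {"Asn"}, "Cys": {"Ser"}, "Glu": {"Asp"}, "Gly": {"Pro"},
    "His": {"Asn", "Gln"}, "Ile": {"Leu", "Val"}, "Leu": {"Ile", "Val"},
    "Lys": {"Arg", "Gln"}, "Met": {"Leu", "Ile"},
    "Phe": {"Met", "Leu", "Tyr"}, "Ser": {"Thr", "Gly"},
    "Thr": {"Ser", "Val"}, "Trp": {"Tyr"}, "Tyr": {"Trp", "Phe"},
    "Val": {"Ile", "Leu"},
}


def is_conservative(original: str, replacement: str) -> bool:
    """True if Table 2 lists the replacement for the original residue."""
    return replacement in CONSERVATIVE_SUBSTITUTIONS.get(original, set())


if __name__ == "__main__":
    print(is_conservative("Ala", "Ser"))  # listed in Table 2
    print(is_conservative("Ser", "Ala"))  # not listed for Ser
```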



Similar substitutions are those in which at least one residue in the amino acid
sequence has been removed and a different residue inserted in its place. Such
substitutions generally are made in accordance with Table 3 when it is desired to
maintain the activity of the protein. Table 3 shows amino acids which can be
substituted for an amino acid in a protein and which are typically regarded as
structural and functional substitutions. For example, a residue in column 1 of Table 3
may be substituted with a residue in column 2; in addition, a residue in column 2 of
Table 3 may be substituted with the residue of column 1.

Table 3

Residue    Similar Substitutions

Ala        Ser; Thr; Gly; Val; Leu; Ile
Arg        Lys; His; Gly
Asn        Gln; His; Gly; Ser; Thr
Asp        Glu; Ser; Thr
Gln        Asn; Ala
Cys        Ser; Gly
Glu        Asp
Gly        Pro; Arg
His        Asn; Gln; Tyr; Phe; Lys; Arg
Ile        Ala; Leu; Val; Gly; Met
Leu        Ala; Ile; Val; Gly; Met
Lys        Arg; His; Gln; Gly; Pro
Met        Leu; Ile; Phe
Phe        Met; Leu; Tyr; Trp; His; Val; Ala
Ser        Thr; Gly; Asp; Ala; Val; Ile; His
Thr        Ser; Val; Ala; Gly
Trp        Tyr; Phe; His
Tyr        Trp; Phe; His
Val        Ala; Ile; Leu; Gly; Thr; Ser; Glu

Substitutions that are less conservative than those in Table 2 can be selected 
by picking residues that differ more significantly in their effect on maintaining (a) the 
structure of the polypeptide backbone in the area of the substitution, for example, as a 
sheet or helical conformation, (b) the charge or hydrophobicity of the molecule at the 
target site, or (c) the bulk of the side chain. The substitutions which in general are
expected to produce the greatest changes in protein properties will be those in which
(a) a hydrophilic residue, e.g., seryl or threonyl, is substituted for (or by) a 
hydrophobic residue, e.g., leucyl, isoleucyl, phenylalanyl, valyl or alanyl; (b) a 
cysteine or proline is substituted for (or by) any other residue; (c) a residue having an 
electropositive side chain, e.g., lysyl, arginyl, or histidyl, is substituted for (or by) an 
electronegative residue, e.g., glutamyl or aspartyl; or (d) a residue having a bulky side 
chain, e.g., phenylalanine, is substituted for (or by) one not having a side chain, e.g., 
glycine. 
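
The four classes of substitution expected to produce the greatest changes, (a) to (d) above, can be sketched as a rule check. The residue sets contain only the examples named in the text (which introduces them with "e.g."), so they are illustrative rather than exhaustive; the function and variable names are hypothetical.

```python
# Example residues named in the text for each property (not exhaustive).
HYDROPHILIC = {"Ser", "Thr"}
HYDROPHOBIC = {"Leu", "Ile", "Phe", "Val", "Ala"}
CYS_OR_PRO = {"Cys", "Pro"}
ELECTROPOSITIVE = {"Lys", "Arg", "His"}
ELECTRONEGATIVE = {"Glu", "Asp"}
BULKY = {"Phe"}
NO_SIDE_CHAIN = {"Gly"}


def _crosses(a: str, b: str, group_1: set, group_2: set) -> bool:
    """True if one residue is in group_1 and the other in group_2."""
    return (a in group_1 and b in group_2) or (a in group_2 and b in group_1)


def disruptive_rules(original: str, replacement: str) -> list:
    """Return the letters of rules (a)-(d) that the substitution matches."""
    if original == replacement:
        return []
    rules = []
    if _crosses(original, replacement, HYDROPHILIC, HYDROPHOBIC):
        rules.append("a")
    if original in CYS_OR_PRO or replacement in CYS_OR_PRO:
        rules.append("b")
    if _crosses(original, replacement, ELECTROPOSITIVE, ELECTRONEGATIVE):
        rules.append("c")
    if _crosses(original, replacement, BULKY, NO_SIDE_CHAIN):
        rules.append("d")
    return rules


if __name__ == "__main__":
    for pair in (("Ser", "Leu"), ("Cys", "Ala"), ("Lys", "Glu"), ("Gly", "Phe")):
        print(pair, disruptive_rules(*pair))
```

Each rule is applied in both directions, reflecting the "substituted for (or by)" wording of the text.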

VIII. Further Modifying Sequences of the Invention - Mutation/Forced
Evolution 

In addition to generating silent or conservative substitutions as noted above,
the present invention optionally includes methods of modifying the sequences of the 
Sequence Listing. In the methods, nucleic acid or protein modification methods are 
used to alter the given sequences to produce new sequences and/or to chemically or 
enzymatically modify given sequences to change the properties of the nucleic acids or 
proteins. 

Thus, in one embodiment, given nucleic acid sequences are modified, e.g., 
according to standard mutagenesis or artificial evolution methods to produce modified 
sequences. The modified sequences may be created using purified natural 
polynucleotides isolated from any organism or may be synthesized from purified 
compositions and chemicals using chemical means well known to those of skill in the
art. For example, Ausubel, supra, provides additional details on mutagenesis
methods. Artificial forced evolution methods are described, for example, by Stemmer
(1994) Nature 370:389-391, Stemmer (1994) Proc. Natl. Acad. Sci. USA 91:10747-
10751, and U.S. Patents 5,811,238, 5,837,500, and 6,242,568. Methods for
engineering synthetic transcription factors and other polypeptides are described, for
example, by Zhang et al. (2000) J. Biol. Chem. 275:33850-33860, Liu et al. (2001) J.
Biol. Chem. 276:11323-11334, and Isalan et al. (2001) Nature Biotechnol. 19:656-
660. Many other mutation and evolution methods are also available and expected to
be within the skill of the practitioner.

Similarly, chemical or enzymatic alteration of expressed nucleic acids and
polypeptides can be performed by standard methods. For example, a sequence can be
modified by addition of lipids, sugars, peptides, organic or inorganic compounds, by 
the inclusion of modified nucleotides or amino acids, or the like. For example, 
protein modification techniques are illustrated in Ausubel, supra. Further details on 
chemical and enzymatic modifications can be found herein. These modification 
methods can be used to modify any given sequence, or to modify any sequence 
produced by the various mutation and artificial evolution modification methods noted 
herein. 

Accordingly, the invention provides for modification of any given nucleic acid 
by mutation, evolution, chemical or enzymatic modification, or other available 
methods, as well as for the products produced by practicing such methods, e.g., using 
the sequences herein as a starting substrate for the various modification approaches. 

For example, optimized coding sequence containing codons preferred by a 
particular prokaryotic or eukaryotic host can be used, e.g., to increase the rate of
translation or to produce recombinant RNA transcripts having desirable properties, 
such as a longer half-life, as compared with transcripts produced using a non- 
optimized sequence. Translation stop codons can also be modified to reflect host 
preference. For example, preferred stop codons for Saccharomyces cerevisiae and 
mammals are TAA and TGA, respectively. The preferred stop codon for 
monocotyledonous plants is TGA, whereas insects and E. coli prefer to use TAA as 
the stop codon. 
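
The host-specific stop codon preferences stated above can be applied with a small lookup. This is a sketch only: the function name and host labels are hypothetical, the preferences are those given in the text, and the input is assumed to be an in-frame coding sequence ending in one of the three standard stop codons.

```python
# Preferred stop codon per host, as stated in the text.
PREFERRED_STOP_CODON = {
    "Saccharomyces cerevisiae": "TAA",
    "mammals": "TGA",
    "monocotyledonous plants": "TGA",
    "insects": "TAA",
    "E. coli": "TAA",
}

STOP_CODONS = {"TAA", "TAG", "TGA"}


def with_preferred_stop(cds: str, host: str) -> str:
    """Replace the terminal stop codon with the one preferred by the host."""
    cds = cds.upper()
    if len(cds) < 3 or len(cds) % 3 != 0:
        raise ValueError("coding sequence must be a whole number of codons")
    if cds[-3:] not in STOP_CODONS:
        raise ValueError("coding sequence does not end in a stop codon")
    return cds[:-3] + PREFERRED_STOP_CODON[host]


if __name__ == "__main__":
    print(with_preferred_stop("ATGGCATAG", "E. coli"))
    print(with_preferred_stop("ATGGCATAG", "monocotyledonous plants"))
```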

The polynucleotide sequences of the present invention can also be engineered 
in order to alter a coding sequence for a variety of reasons, including but not limited 
to, alterations which modify the sequence to facilitate cloning, processing and/or 
expression of the gene product. For example, alterations are optionally introduced 
using techniques which are well known in the art, e.g., site-directed mutagenesis, to 
insert new restriction sites, to alter glycosylation patterns, to change codon preference, 
to introduce splice sites, etc.

Furthermore, a fragment or domain derived from any of the polypeptides of
the invention can be combined with domains derived from other transcription factors 
or synthetic domains to modify the biological activity of a transcription factor. For 
instance, a DNA-binding domain derived from a transcription factor of the invention 
can be combined with the activation domain of another transcription factor or with a 
synthetic activation domain. A transcription activation domain assists in initiating 
transcription from a DNA-binding site. Examples include the transcription activation 
region of VP16 or GAL4 (Moore et al. (1998) Proc. Natl Acad. Sci. USA 95: 376-381; 
and Aoyama et al. (1995) Plant Cell 7:1773-1785), peptides derived from 
bacterial sequences (Ma and Ptashne (1987) Cell 51: 113-119) and synthetic peptides 
(Giniger and Ptashne, (1987) Nature 330:670-672). 

IX. Expression and Modification of Polypeptides 

Typically, polynucleotide sequences of the invention are incorporated into 
recombinant DNA (or RNA) molecules that direct expression of polypeptides of the 
invention in appropriate host cells, transgenic plants, in vitro translation systems, or 
the like. Due to the inherent degeneracy of the genetic code, nucleic acid sequences 
which encode substantially the same or a functionally equivalent amino acid sequence 
can be substituted for any listed sequence to provide for cloning and expressing the 
relevant homologue. 
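The degeneracy referred to above can be illustrated with a short Python sketch in which two different nucleotide sequences encode the same amino acid sequence. The codon table is the standard genetic code; the two example sequences and the function name `translate` are hypothetical and serve only as an illustration.

```python
from itertools import product

# Standard genetic code, built in TCAG order ("*" marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence codon by codon (trailing partial codon ignored)."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Two hypothetical sequences that differ at the nucleotide level ...
seq_a = "ATGGCTCGTTAA"
seq_b = "ATGGCACGGTGA"
# ... but encode the same amino acid sequence.
print(translate(seq_a), translate(seq_b))
```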

X. Vectors, Promoters, and Expression Systems 

The present invention includes recombinant constructs comprising one or 
more of the nucleic acid sequences herein. The constructs typically comprise a 
vector, such as a plasmid, a cosmid, a phage, a virus (e.g., a plant virus), a bacterial 
artificial chromosome (BAC), a yeast artificial chromosome (YAC), or the like, into 
which a nucleic acid sequence of the invention has been inserted, in a forward or 
reverse orientation. In a preferred aspect of this embodiment, the construct further 
comprises regulatory sequences, including, for example, a promoter, operably linked 
to the sequence. Large numbers of suitable vectors and promoters are known to those 
of skill in the art, and are commercially available. 

General texts that describe molecular biological techniques useful herein, 
including the use and production of vectors, promoters and many other relevant 
topics, include Berger, Sambrook and Ausubel, supra. Any of the identified sequences 
can be incorporated into a cassette or vector, e.g., for expression in plants. A number of 
expression vectors suitable for stable transformation of plant cells or for the 
establishment of transgenic plants have been described including those described in 
Weissbach and Weissbach, (1989) Methods for Plant Molecular Biology , Academic 
Press, and Gelvin et al., (1990) Plant Molecular Biology Manual, Kluwer Academic 
Publishers. Specific examples include those derived from a Ti plasmid of 
Agrobacterium tumefaciens, as well as those disclosed by Herrera-Estrella et al. 
(1983) Nature 303: 209, Bevan (1984) Nucl Acid Res. 12: 8711-8721, Klee (1985) 
Bio/Technology 3: 637-642, for dicotyledonous plants. 

Alternatively, non-Ti vectors can be used to transfer the DNA into 
monocotyledonous plants and cells by using free DNA delivery techniques. Such 
methods can involve, for example, the use of liposomes, electroporation, 
microprojectile bombardment, silicon carbide whiskers, and viruses. By using these 
methods transgenic plants such as wheat, rice (Christou (1991) Bio/Technology 9: 
957-962) and corn (Gordon-Kamm (1990) Plant Cell 2: 603-618) can be produced. 
An immature embryo can also be a good target tissue for monocots for direct DNA 
delivery techniques by using the particle gun (Weeks et al. (1993) Plant Physiol 102: 
1077-1084; Vasil (1993) Bio/Technology 10: 667-674; Wan and Lemeaux (1994) 
Plant Physiol 104: 37-48), and for Agrobacterium-mediated DNA transfer (Ishida et al. 
(1996) Nature Biotech 14: 745-750). 

Typically, plant transformation vectors include one or more cloned plant 
coding sequences (genomic or cDNA) under the transcriptional control of 5' and 3' 
regulatory sequences and a dominant selectable marker. Such plant transformation 
vectors typically also contain a promoter (e.g., a regulatory region controlling 
inducible or constitutive, environmentally- or developmentally-regulated, or cell- or 
tissue-specific expression), a transcription initiation start site, an RNA processing 
signal (such as intron splice sites), a transcription termination site, and/or a 
polyadenylation signal. 

Examples of constitutive plant promoters which can be useful for expressing 
the TF sequence include: the cauliflower mosaic virus (CaMV) 35S promoter, which 
confers constitutive, high-level expression in most plant tissues (see, e.g., Odell et al. 
(1985) Nature 313:810-812); the nopaline synthase promoter (An et al. (1988) Plant 
Physiol 88:547-552); and the octopine synthase promoter (Fromm et al. (1989) Plant 
Cell 1:977-984). 

A variety of plant gene promoters that regulate gene expression in response to 
environmental, hormonal, chemical, developmental signals, and in a tissue-active 
manner can be used for expression of a TF sequence in plants. Choice of a promoter 
is based largely on the phenotype of interest and is determined by such factors as 
tissue (e.g., seed, fruit, root, pollen, vascular tissue, flower, carpel, etc.), inducibility 
(e.g., in response to wounding, heat, cold, drought, light, pathogens, etc.), timing, 
developmental stage, and the like. Numerous known promoters have been 
characterized and can favorably be employed to promote expression of a 
polynucleotide of the invention in a transgenic plant or cell of interest. For example, 
tissue specific promoters include: seed-specific promoters (such as the napin, 
phaseolin or DC3 promoter described in US Pat. No. 5,773,697), fruit-specific 
promoters that are active during fruit ripening (such as the dru1 promoter (US Pat. 
No. 5,783,393), or the 2A11 promoter (US Pat. No. 4,943,674) and the tomato 
polygalacturonase promoter (Bird et al. (1988) Plant Mol Biol 11:651)), root-specific 
promoters, such as those disclosed in US Patent Nos. 5,618,988, 5,837,848 and 
5,905,186, pollen-active promoters such as PTA29, PTA26 and PTA13 (US Pat. No. 
5,792,929), promoters active in vascular tissue (Ringli and Keller (1998) Plant Mol 
Biol 37:977-988), flower-specific (Kaiser et al. (1995) Plant Mol Biol 28:231-243), 
pollen (Baerson et al. (1994) Plant Mol Biol 26:1947-1959), carpels (Ohl et al. (1990) 
Plant Cell 2:837-848), pollen and ovules (Baerson et al. (1993) Plant Mol Biol 
22:255-267), auxin-inducible promoters (such as that described in van der Kop et al. 
(1999) Plant Mol Biol 39:979-990 or Baumann et al. (1999) Plant Cell 11:323-334), 
cytokinin-inducible promoter (Guevara-Garcia (1998) Plant Mol Biol 38:743-753), 
promoters responsive to gibberellin (Shi et al. (1998) Plant Mol Biol 38:1053-1060, 
Willmott et al. (1998) 38:817-825) and the like. Additional promoters are those that 
elicit expression in response to heat (Ainley et al. (1993) Plant Mol Biol 22: 13-23), 
light (e.g., the pea rbcS-3A promoter, Kuhlemeier et al. (1989) Plant Cell 1:471, and 
the maize rbcS promoter, Schaffner and Sheen (1991) Plant Cell 3: 997), wounding 
(e.g., wun1, Siebertz et al. (1989) Plant Cell 1: 961), pathogens (such as the PR-1 
promoter described in Buchel et al. (1999) Plant Mol. Biol. 40:387-396, and the 
PDF1.2 promoter described in Manners et al. (1998) Plant Mol. Biol. 38:1071-80), 
and chemicals such as methyl jasmonate or salicylic acid (Gatz et al. (1997) Plant Mol 
Biol 48: 89-108). In addition, the timing of the expression can be controlled by using 
promoters such as those acting at senescence (Gan and Amasino (1995) Science 270: 
1986-1988), or late seed development (Odell et al. (1994) Plant Physiol 106:447-458). 

Plant expression vectors can also include RNA processing signals that can be 
positioned within, upstream or downstream of the coding sequence. In addition, the 
expression vectors can include additional regulatory sequences from the 3'-
untranslated region of plant genes, e.g., a 3' terminator region to increase 
stability of the mRNA, such as the PI-II terminator region of potato or the octopine or 
nopaline synthase 3' terminator regions. 

Additional Expression Elements 

Specific initiation signals can aid in efficient translation of coding sequences. 
These signals can include, e.g., the ATG initiation codon and adjacent sequences. In 
cases where a coding sequence, its initiation codon and upstream sequences are 
inserted into the appropriate expression vector, no additional translational control 
signals may be needed. However, in cases where only coding sequence (e.g., a 
mature protein coding sequence), or a portion thereof, is inserted, exogenous 
translational control signals including the ATG initiation codon can be separately 
provided. The initiation codon is provided in the correct reading frame to facilitate 
translation. Exogenous translational elements and initiation codons can be of 
various origins, both natural and synthetic. The efficiency of expression can be 
enhanced by the inclusion of enhancers appropriate to the cell system in use. 
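The reading-frame requirement for a separately provided initiation codon can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name `with_initiation_codon` and the example sequences are hypothetical, and the sketch considers only the ATG codon itself, not adjacent sequences.

```python
def with_initiation_codon(coding: str) -> str:
    """Return the coding sequence with an ATG initiation codon placed
    immediately upstream if one is not already present.

    Placing ATG directly before the first codon keeps the initiation
    codon in the same reading frame as the coding sequence.
    """
    coding = coding.upper()
    if len(coding) % 3 != 0:
        raise ValueError("coding sequence must consist of whole codons")
    if coding.startswith("ATG"):
        return coding  # initiation codon already supplied with the insert
    return "ATG" + coding


# Hypothetical mature-protein coding sequence lacking its own initiation codon.
print(with_initiation_codon("GCTCGTTAA"))
```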

Expression Hosts 

The present invention also relates to host cells which are transduced with 
vectors of the invention, and the production of polypeptides of the invention 
(including fragments thereof) by recombinant techniques. Host cells are genetically 
engineered (i.e., nucleic acids are introduced, e.g., transduced, transformed or 
transfected) with the vectors of this invention, which may be, for example, a cloning 
vector or an expression vector comprising the relevant nucleic acids herein. The 
vector is optionally a plasmid, a viral particle, a phage, a naked nucleic acid, etc. The 
engineered host cells can be cultured in conventional nutrient media modified as 
appropriate for activating promoters, selecting transformants, or amplifying the 
relevant gene. The culture conditions, such as temperature, pH and the like, are those 
previously used with the host cell selected for expression, and will be apparent to 
those skilled in the art and in the references cited herein, including, Sambrook and 
Ausubel. 

The host cell can be a eukaryotic cell, such as a yeast cell, or a plant cell, or 
the host cell can be a prokaryotic cell, such as a bacterial cell. Plant protoplasts are 
also suitable for some applications. For example, the DNA fragments are introduced 
into plant tissues, cultured plant cells or plant protoplasts by standard methods 
including electroporation (Fromm et al., (1985) Proc. Natl Acad. Sci. USA 82, 5824), 
infection by viral vectors such as cauliflower mosaic virus (CaMV) (Hohn et al., 
(1982) Molecular Biology of Plant Tumors , (Academic Press, New York) pp. 549- 
560; US 4,407,956), high velocity ballistic penetration by small particles with the 
nucleic acid either within the matrix of small beads or particles, or on the surface 
(Klein et al., (1987) Nature 327, 70-73), use of pollen as vector (WO 85/01856), or 
use of Agrobacterium tumefaciens or A. rhizogenes carrying a T-DNA plasmid in 
which DNA fragments are cloned. The T-DNA plasmid is transmitted to plant cells 
upon infection by Agrobacterium tumefaciens, and a portion is stably integrated into 
the plant genome (Horsch et al. (1984) Science 233:496-498; Fraley et al. (1983) 
Proc. Natl Acad. Sci. USA 80, 4803). 

The cell can include a nucleic acid of the invention which encodes a 
polypeptide, wherein the cell expresses a polypeptide of the invention. The cell can 
also include vector sequences, or the like. Furthermore, cells and transgenic plants 
that include any polypeptide or nucleic acid above or throughout this specification, 
e.g., produced by transduction of a vector of the invention, are an additional feature of 
the invention. 

For long-term, high-yield production of recombinant proteins, stable 
expression can be used. Host cells transformed with a nucleotide sequence encoding 
a polypeptide of the invention are optionally cultured under conditions suitable for the 
expression and recovery of the encoded protein from cell culture. The protein or 
fragment thereof produced by a recombinant cell may be secreted, membrane-bound, 
or contained intracellularly, depending on the sequence and/or the vector used. As 
will be understood by those of skill in the art, expression vectors containing 
polynucleotides encoding mature proteins of the invention can be designed with 
signal sequences which direct secretion of the mature polypeptides through a 
prokaryotic or eukaryotic cell membrane. 

XI. Modified Amino Acid Residues 

Polypeptides of the invention may contain one or more modified amino acid 
residues. The presence of modified amino acids may be advantageous in, for 
example, increasing polypeptide half-life, reducing polypeptide antigenicity or 
toxicity, increasing polypeptide storage stability, or the like. Amino acid residue(s) 
are modified, for example, co-translationally or post-translationally during 
recombinant production or modified by synthetic or chemical means. 

Non-limiting examples of a modified amino acid residue include incorporation 
or other use of acetylated amino acids, glycosylated amino acids, sulfated amino 
acids, prenylated (e.g., farnesylated, geranylgeranylated) amino acids, PEG modified 
(e.g., "PEGylated") amino acids, biotinylated amino acids, carboxylated amino acids, 
phosphorylated amino acids, etc. References adequate to guide one of skill in the 
modification of amino acid residues are replete throughout the literature. 

The modified amino acid residues may prevent or increase affinity of the 
polypeptide for another molecule, including, but not limited to, polynucleotide, 
proteins, carbohydrates, lipids and lipid derivatives, and other organic or synthetic 
compounds. 

XII. Identification of Additional Factors 

A transcription factor provided by the present invention can also be used to 
identify additional endogenous or exogenous molecules that can affect a phenotype or 
trait of interest. On the one hand, such molecules include organic (small or large 
molecules) and/or inorganic compounds that affect expression of (i.e., regulate) a 
particular transcription factor. Alternatively, such molecules include endogenous 
molecules that are acted upon at a transcriptional level by a transcription factor 
of the invention to modify a phenotype as desired. For example, the transcription 
factors can be employed to identify one or more downstream genes that are 
subject to a regulatory effect of the transcription factor. In one approach, a 
transcription factor or transcription factor homologue of the invention is expressed in 
a host cell, e.g., a transgenic plant cell, tissue or explant, and expression products, 
either RNA or protein, of likely or random targets are monitored, e.g., by 
hybridization to a microarray of nucleic acid probes corresponding to genes expressed 
in a tissue or cell type of interest, by two-dimensional gel electrophoresis of protein 
products, or by any other method known in the art for assessing expression of gene 
products at the level of RNA or protein. Alternatively, a transcription factor of the 
invention can be used to identify promoter sequences (i.e., binding sites) involved in 
the regulation of a downstream target. After identifying a promoter sequence, 
interactions between the transcription factor and the promoter sequence can be 
modified by changing specific nucleotides in the promoter sequence or specific amino 
acids in the transcription factor that interact with the promoter sequence to alter a 
plant trait. Typically, transcription factor DNA-binding sites are identified by gel 
shift assays. After identifying the promoter regions, the promoter region sequences 
can be employed in double-stranded DNA arrays to identify molecules that affect the 
interactions of the transcription factors with their promoters (Bulyk et al. (1999) 
Nature Biotechnology 17:573-577). 

The identified transcription factors are also useful to identify proteins that 
modify the activity of the transcription factor. Such modification can occur by 
covalent modification, such as by phosphorylation, or by protein-protein (homo- or 
heteropolymer) interactions. Any method suitable for detecting protein-protein 
interactions can be employed. Among the methods that can be employed are 
co-immunoprecipitation, cross-linking and co-purification through gradients or 
chromatographic columns, and the two-hybrid yeast system. 

The two-hybrid system detects protein interactions in vivo and is described in 
Chien et al. ((1991), Proc. Natl. Acad. Sci. USA 88:9578-9582) and is commercially 
available from Clontech (Palo Alto, Calif.). In such a system, plasmids are 
constructed that encode two hybrid proteins: one consists of the DNA-binding domain 
of a transcription activator protein fused to the TF polypeptide and the other consists 
of the transcription activator protein's activation domain fused to an unknown protein 
that is encoded by a cDNA that has been recombined into the plasmid as part of a 
cDNA library. The DNA-binding domain fusion plasmid and the cDNA library are 
transformed into a strain of the yeast Saccharomyces cerevisiae that contains a 
reporter gene (e.g., lacZ) whose regulatory region contains the transcription activator's 
binding site. Either hybrid protein alone cannot activate transcription of the reporter 
gene. Interaction of the two hybrid proteins reconstitutes the functional activator 
protein and results in expression of the reporter gene, which is detected by an assay 
for the reporter gene product. Then, the library plasmids responsible for reporter gene 
expression are isolated and sequenced to identify the proteins encoded by the library 
plasmids. After identifying proteins that interact with the transcription factors, assays 
for compounds that interfere with the TF protein-protein interactions can be 
performed. 
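The reporter logic of the two-hybrid system described above can be sketched in Python. This is an illustrative model only: the function names and the example library are hypothetical, and real screens depend on experimental factors not modeled here.

```python
def reporter_expressed(bd_fusion_present: bool,
                       ad_fusion_present: bool,
                       proteins_interact: bool) -> bool:
    """The reporter gene is expressed only when both hybrid proteins are
    present and interact, reconstituting the functional activator.
    Either hybrid protein alone cannot activate transcription."""
    return bd_fusion_present and ad_fusion_present and proteins_interact


def screen_library(interactions: dict[str, bool]) -> list[str]:
    """Return the library clones whose encoded protein activates the
    reporter when paired with the DNA-binding-domain/TF fusion."""
    return [clone for clone, interacts in interactions.items()
            if reporter_expressed(True, True, interacts)]


# Hypothetical library: clone name -> whether its protein binds the TF polypeptide.
library = {"clone_1": False, "clone_2": True, "clone_3": False}
print(screen_library(library))
```

The clones returned correspond to the library plasmids that would be isolated and sequenced.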



XIII. Identification of Modulators 

In addition to the intracellular molecules described above, extracellular 
molecules that alter activity or expression of a transcription factor, either directly or 
indirectly, can be identified. For example, the methods can entail first placing a 
candidate molecule in contact with a plant or plant cell. The molecule can be 
introduced by topical administration, such as spraying or soaking of a plant, and then 
the molecule's effect on the expression or activity of the TF polypeptide or the 
expression of the polynucleotide is monitored. Changes in the expression of the TF 
polypeptide can be monitored by use of polyclonal or monoclonal antibodies, gel 
electrophoresis or the like. Changes in the expression of the corresponding 
polynucleotide sequence can be detected by use of microarrays, Northerns, 
quantitative PCR, or any other technique for monitoring changes in mRNA 
expression. These techniques are exemplified in Ausubel et al. (eds) Current 
Protocols in Molecular Biology , John Wiley & Sons (1998, and supplements through 
2001). Such changes in the expression levels can be correlated with modified plant 
traits and thus identified molecules can be useful for soaking or spraying on fruit, 
vegetable and grain crops to modify traits in plants. 
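A comparison of expression levels between treated and untreated samples can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the measurement values, the compound names, and the two-fold threshold are hypothetical and are not taken from this disclosure.

```python
import math


def log2_fold_change(treated: float, control: float) -> float:
    """Log2 ratio of expression in treated versus untreated samples."""
    return math.log2(treated / control)


def flag_candidates(measurements: dict[str, tuple[float, float]],
                    threshold: float = 1.0) -> list[str]:
    """Return compounds whose treatment changes expression by at least
    2**threshold-fold in either direction (threshold is an assumption)."""
    return [name for name, (treated, control) in measurements.items()
            if abs(log2_fold_change(treated, control)) >= threshold]


# Hypothetical expression signals: compound -> (treated, untreated control).
signals = {
    "compound_A": (400.0, 100.0),  # four-fold increase
    "compound_B": (110.0, 100.0),  # essentially unchanged
    "compound_C": (25.0, 100.0),   # four-fold decrease
}
print(flag_candidates(signals))
```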



Essentially any available composition can be tested for modulatory activity of 
expression or activity of any nucleic acid or polypeptide herein. Thus, available 
libraries of compounds such as chemicals, polypeptides, nucleic acids and the like can 
be tested for modulatory activity. Often, potential modulator compounds can be 
dissolved in aqueous or organic (e.g., DMSO-based) solutions for easy delivery to the 
cell or plant of interest in which the activity of the modulator is to be tested. 
Optionally, the assays are designed to screen large modulator composition libraries by 
automating the assay steps and providing compounds from any convenient source to 
assays, which are typically run in parallel (e.g., in microtiter formats on microtiter 
plates in robotic assays). 

In one embodiment, high throughput screening methods involve providing a 
combinatorial library containing a large number of potential compounds (potential 
modulator compounds). Such "combinatorial chemical libraries" are then screened in 
one or more assays, as described herein, to identify those library members (particular 
chemical species or subclasses) that display a desired characteristic activity. The 
compounds thus identified can serve as target compounds. 

A combinatorial chemical library can be, e.g., a collection of diverse chemical 
compounds generated by chemical synthesis or biological synthesis. For example, a 
combinatorial chemical library such as a polypeptide library is formed by combining a 
set of chemical building blocks (e.g., in one example, amino acids) in every possible 
way for a given compound length (i.e., the number of amino acids in a polypeptide 
compound of a set length). Exemplary libraries include peptide libraries, nucleic acid 
libraries, antibody libraries (see, e.g., Vaughn et al. (1996) Nature Biotechnology, 
14(3):309-314 and PCT/US96/10287), carbohydrate libraries (see, e.g., Liang et al. 
Science (1996) 274:1520-1522 and U.S. Patent 5,593,853), peptide nucleic acid 
libraries (see, e.g., U.S. Patent 5,539,083), and small organic molecule libraries (see, 
e.g., benzodiazepines, Baum C&EN Jan 18, page 33 (1993); isoprenoids, U.S. Patent 
5,569,588; thiazolidinones and metathiazanones, U.S. Patent 5,549,974; pyrrolidines, 
U.S. Patents 5,525,735 and 5,519,134; morpholino compounds, U.S. Patent 
5,506,337) and the like. 

Preparation and screening of combinatorial or other libraries is well known to 
those of skill in the art. Such combinatorial chemical libraries include, but are not 
limited to, peptide libraries (see, e.g., U.S. Patent 5,010,175; Furka, (1991) Int. J. 
Pept. Prot. Res. 37:487-493; and Houghton et al. (1991) Nature 354:84-88). Other 
chemistries for generating chemical diversity libraries can also be used. 
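The formation of a combinatorial library by combining a set of building blocks in every possible way for a given compound length can be illustrated with a short Python sketch. The function name `peptide_library` and the three-letter building-block set are hypothetical; the size of such a library is the number of building blocks raised to the power of the compound length.

```python
from itertools import product


def peptide_library(building_blocks: str, length: int) -> list[str]:
    """Enumerate every sequence of the given length that can be formed
    from the building blocks (one-letter amino acid codes here)."""
    return ["".join(p) for p in product(building_blocks, repeat=length)]


# Hypothetical small example: three amino acids, dipeptides only.
library = peptide_library("AGS", 2)
print(len(library), library)

# With all 20 standard amino acids, a tripeptide library has 20**3 members.
print(len(peptide_library("ACDEFGHIKLMNPQRSTVWY", 3)))
```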

In addition, as noted, compound screening equipment for high-throughput 
screening is generally available, e.g., using any of a number of well known robotic 
systems that have also been developed for solution phase chemistries useful in assay 
systems. These systems include automated workstations including an automated 
synthesis apparatus and robotic systems utilizing robotic arms. Any of the above 
devices are suitable for use with the present invention, e.g., for high-throughput 
screening of potential modulators. The nature and implementation of modifications to 
these devices (if any) so that they can operate as discussed herein will be apparent to 
persons skilled in the relevant art. 

Indeed, entire high throughput screening systems are commercially available. 
These systems typically automate entire procedures including all sample and reagent 
pipetting, liquid dispensing, timed incubations, and final readings of the microplate in 
detector(s) appropriate for the assay. These configurable systems provide high 
throughput and rapid start up as well as a high degree of flexibility and customization. 
Similarly, microfluidic implementations of screening are also commercially available. 

The manufacturers of such systems provide detailed protocols for the various high 
throughput systems. Thus, for example, Zymark Corp. provides technical bulletins describing 
screening systems for detecting the modulation of gene transcription, ligand binding, 
and the like. The integrated systems herein, in addition to providing for sequence 
alignment and, optionally, synthesis of relevant nucleic acids, can include such 
screening apparatus to identify modulators that have an effect on one or more 
polynucleotides or polypeptides according to the present invention. 

In some assays it is desirable to have positive controls to ensure that the 
components of the assays are working properly. At least two types of positive 
controls are appropriate. That is, known transcriptional activators or inhibitors can be 
incubated with cells, plants, etc. in one sample of the assay, and the resulting 
increase/decrease in transcription can be detected by measuring the resulting increase 
in RNA/protein expression, etc., according to the methods herein. It will be 
appreciated that modulators can also be combined with transcriptional activators or 
inhibitors to find modulators that inhibit transcriptional activation or transcriptional 
repression. Either expression of the nucleic acids and proteins herein or any 
additional nucleic acids or proteins activated by the nucleic acids or proteins herein, 
or both, can be monitored. 

In an embodiment, the invention provides a method for identifying 
compositions that modulate the activity or expression of a polynucleotide or 
polypeptide of the invention. For example, a test compound, whether a small or large 
molecule, is placed in contact with a cell, plant (or plant tissue or explant), or 
composition comprising the polynucleotide or polypeptide of interest and a resulting 
effect on the cell, plant, (or tissue or explant) or composition is evaluated by 
monitoring, either directly or indirectly, one or more of: expression level of the 
polynucleotide or polypeptide, activity (or modulation of the activity) of the 
polynucleotide or polypeptide. In some cases, an alteration in a plant phenotype can 
be detected following contact of a plant (or plant cell, or tissue or explant) with the 
putative modulator, e.g., by modulation of expression or activity of a polynucleotide 
or polypeptide of the invention. Modulation of expression or activity of a 
polynucleotide or polypeptide of the invention may also be caused by molecular 
elements in a signal transduction second messenger pathway and such modulation can 
affect similar elements in the same or another signal transduction second messenger 
pathway. 

XIV. Subsequences 

Also contemplated are uses of polynucleotides, also referred to herein as 
oligonucleotides, typically having at least 12 bases, preferably at least 15, more 
preferably at least 20, 30, or 50 bases, which hybridize under at least highly stringent 
(or ultra-high stringent or ultra-ultra-high stringent) conditions to a 
polynucleotide sequence described above. The polynucleotides may be used as 
probes, primers, sense and antisense agents, and the like, according to methods as 
noted supra. 

Subsequences of the polynucleotides of the invention, including 
polynucleotide fragments and oligonucleotides are useful as nucleic acid probes and 
primers. An oligonucleotide suitable for use as a probe or primer is at least about 15 
nucleotides in length, more often at least about 18 nucleotides, often at least about 21 
nucleotides, frequently at least about 30 nucleotides, or about 40 nucleotides, or more 
in length. A nucleic acid probe is useful in hybridization protocols, e.g., to identify 
additional polypeptide homologues of the invention, including protocols for 
microarray experiments. Primers can be annealed to a complementary target DNA 
strand by nucleic acid hybridization to form a hybrid between the primer and the 
target DNA strand, and then extended along the target DNA strand by a DNA 
polymerase enzyme. Primer pairs can be used for amplification of a nucleic acid 
sequence, e.g., by the polymerase chain reaction (PCR) or other nucleic-acid 
amplification methods. See Sambrook and Ausubel, supra. 
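The length criterion and the complementarity between a primer and its target strand can be sketched in Python. This is an illustration under stated assumptions: the function names and the target sequence are hypothetical, only perfect complementarity is modeled, and hybridization stringency is not considered.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_suitable_length(oligo: str, min_length: int = 15) -> bool:
    """Check the minimum length of about 15 nucleotides given in the text."""
    return len(oligo) >= min_length


def anneals_to(primer: str, target_strand: str) -> bool:
    """True if the primer is perfectly complementary to a stretch of the
    target strand (both sequences written 5'->3')."""
    return reverse_complement(primer) in target_strand.upper()


target = "ATGGCTCGTACGTTAGCTAGGCTA"        # hypothetical target strand
primer = reverse_complement(target[3:21])  # 18-nucleotide complementary primer
print(is_suitable_length(primer), anneals_to(primer, target))
```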

In addition, the invention includes an isolated or recombinant polypeptide 
including a subsequence of at least about 15 contiguous amino acids encoded by the 
recombinant or isolated polynucleotides of the invention. For example, such 
polypeptides, or domains or fragments thereof, can be used as immunogens, e.g., to 
produce antibodies specific for the polypeptide sequence, or as probes for detecting a 
sequence of interest. A subsequence can range in size from about 15 amino acids in 
length up to and including the full length of the polypeptide. 

To be encompassed by the present invention, an expressed polypeptide which 
comprises such a polypeptide subsequence performs at least one biological function 
of the intact polypeptide in substantially the same manner, or to a similar extent, as 
does the intact polypeptide. For example, a polypeptide fragment can comprise a 
recognizable structural motif or functional domain such as a DNA binding domain 
that binds to a specific DNA promoter region, an activation domain or a domain for 
protein-protein interactions. 

XV. Production of Transgenic Plants 

Modification of Traits 

The polynucleotides of the invention are favorably employed to produce 
transgenic plants with various traits, or characteristics, that have been modified in a 
desirable manner, e.g., to improve the seed characteristics of a plant. For example, 
alteration of expression levels or patterns (e.g., spatial or temporal expression 
patterns) of one or more of the transcription factors (or transcription factor 
homologues) of the invention, as compared with the levels of the same protein found 
in a wild type plant, can be used to modify a plant's traits. An illustrative example of 
trait modification, improved characteristics, by altering expression levels of a 
particular transcription factor is described further in the Examples and the Sequence 
Listing. 

Arabidopsis as a model system 

Arabidopsis thaliana is the object of rapidly growing attention as a model for 
genetics and metabolism in plants. Arabidopsis has a small genome, and well 
documented studies are available. It is easy to grow in large numbers and mutants 
defining important genetically controlled mechanisms are either available, or can 
readily be obtained. Various methods to introduce and express isolated homologous 
genes are available (see Koncz et al., eds., Methods in Arabidopsis Research 
(1992), World Scientific, New Jersey, New Jersey, in "Preface"). Because of its small 
size, short life cycle, obligate autogamy and high fertility, Arabidopsis is also a 
choice organism for the isolation of mutants and studies in morphogenetic and 
development pathways, and control of these pathways by transcription factors (Koncz, 
supra, p. 72). A number of studies introducing transcription factors into A. thaliana 
have demonstrated the utility of this plant for understanding the mechanisms of gene 
regulation and trait alteration in plants. See, for example, Koncz, supra, and U.S. 
Patent Number 6,417,428. 

Arabidopsis genes in transgenic plants. 

Expression of genes that encode transcription factors that modify expression of 
endogenous genes, polynucleotides, and proteins is well known in the art. In 
addition, transgenic plants comprising isolated polynucleotides encoding transcription 
factors may also modify expression of endogenous genes, polynucleotides, and 
proteins. Examples include Peng et al. (1997, Genes and Development 11:3194-3205) 
and Peng et al. (1999, Nature, 400:256-261). In addition, many others have 
demonstrated that an Arabidopsis transcription factor expressed in an exogenous plant 
species elicits the same or very similar phenotypic response. See, for example, Fu et 
al. (2001, Plant Cell 13:1791-1802); Nandi et al. (2000, Curr. Biol. 10:215-218); 
Coupland (1995, Nature 377:482-483); and Weigel and Nilsson (1995, Nature 
377:482-500). 

Homologous genes introduced into transgenic plants. 

Homologous genes that may be derived from any plant, or from any source 
whether natural, synthetic, semi-synthetic or recombinant, and that share significant 
sequence identity or similarity to those provided by the present invention, may be 
introduced into plants, for example, crop plants, to confer desirable or improved traits. 
Consequently, transgenic plants may be produced that comprise a recombinant 
expression vector or cassette with a promoter operably linked to one or more 
sequences homologous to presently disclosed sequences. The promoter may be, for 
example, a plant or viral promoter. 

The invention thus provides for methods for preparing transgenic plants, and 
for modifying plant traits. These methods include introducing into a plant a 
recombinant expression vector or cassette comprising a functional promoter operably 
linked to one or more sequences homologous to presently disclosed sequences. Plants 
and kits for producing these plants that result from the application of these methods 
are also encompassed by the present invention. 

The complete descriptions of the traits associated with each polynucleotide of 
the invention are fully disclosed in Table 4, Table 5, and Table 6. 

GRAB1 protein.
unnamed protein product.
NAC1.
NAC-domain protein.
ATAF1-like protein.
BOMMZ07TR BO 2 3 KB Brassica oleracea gen
EST402062 tomato root, plants pre-a
sm12e10.y1 Gm-c1027 Glycine max cDNA clone GENO
chromosome 8 clone P0461 F06, *** SEQUENCING IN
( ) chromosome 8 clo
AT002234 Flower bud cDNA Br
HV CEa0006N02f Hordeum vulgare seedling gre
EM1 41 E02.g1 A002 Embryo 1 (EM1) Sorghum b
MEST40-H05.T3 ISUM4-TN Zea mays cDNA clone MEST40-
RHIZ2 75 D09.g1 A003 Rhizome2 (RHIZ2) So
contains EST C74560(E31855)-unknown protein.
helix-loop-helix protein 1 A.

phaseolin G-box binding protein PG2.

transcriptional activator Rb homolog.
helix-loop-helix protein 1 A.
phaseolin G-box binding protein PG2.
symbiotic ammonium transporter; nodulin.
bHLH transcription factor GBOF-1.
anthocyanin 1.
transcription factor MYC7E.
DEL.
myc-like regulatory R gene product.
BOGRJ19TR BOGR Brassica oleracea genomic
sp05h10.y1 Gm-c1041 Glycine max cDNA clone GENO
genomic DNA, chromosome 1, PAC clone:P0002B05,
HV CEa0006N02f Hordeum vulgare seedling gre
AT002234 Flower bud cDNA Br
EM1 41 E02.g1 A002 Embryo 1 (EM1) Sorghum b
( ) chromosome 8 clo
EST402062 tomato root, plants pre-a
MEST40-H05.T3 ISUM4-TN Zea mays cDNA clone MEST40-

bHLH transcription factor GBOF-1.
myc-like regulatory R gene product.
myc-like regulatory R gene product.
transcriptional activator Rb homolog.
ER33 protein;

NF036F04RT1 F1 032 Developing root Medica
GHMYB25.
protein 1.
mixta.
OSMYB1.
myb-related transcription factor.
myb-related transcription factor.
CI protein.
transforming protein (myb) homolog (clone Zm38)
GmMYB29A2.

DNA-binding protein 2 (WRKY2) mRNA, compl
SPF1-like DNA-binding protein mRNA, complet
zinc finger transcription factor WRKY1 mRNA, c
Sequence 9 from Patent WO0149840.
Sweet potato mRNA for SPF1 protein, complet
A.fatua mRNA for DNA-binding protein (clone ABF
mRNA for hypothetical protein (ORF
Sequence 11 from Patent WO0149840.
zinc finger protein (ZFP1) mRNA, com
DNA-binding protein WRKY1 mRNA, comple

DNA-binding protein.
hypothetical protein.

Traits of interest 

Examples of some of the traits that may be desirable in plants, and that may be 
provided by transforming the plants with the presently disclosed sequences, are listed 
in Table 6. 



Table 6. Genes, traits and utilities that affect plant characteristics 

Columns: Trait Category; Traits; Transcription factor genes that impact traits; Utility (gene effect on) 


Trait Category: Resistance and tolerance 

Traits: Salt stress resistance 
Transcription factor genes: G22; G196; G226; G303; [illegible]; G545; G801; G867; G884; G922; G926; G1452; G1794; G1820; G1836; G1843; G1863; G2379; G2701; G2713; G2719; G2789 
Utility: Germination rate, survivability, yield; extended growth range 

Traits: Osmotic stress resistance 
Transcription factor genes: G47; G175; G188; G303; G325; G353; G489; G502; G526; G921; G922; G926; G1069; G1089; G1452; G1794; G1930; G2140; G2153; G2379; G2701; G2719; G2789 
Utility: Germination rate, survivability, yield 

Traits: Cold stress resistance; cold germination 
Transcription factor genes: G256; G394; G664; G864; G1322; G2130 
Utility: Germination, growth, earlier planting 

Traits: Tolerance to freezing 
Transcription factor genes: G303; G325; G353; G720; G912; G913; G1794; G2053; G2140; G2153; G2379; G2701; G2719; G2789 
Utility: Survivability, yield, appearance, extended range 

Traits: Heat stress resistance 
Transcription factor genes: G3; G464; G682; G864; G964; G1305; G1645; G2130; G2430 
Utility: Germination, growth, later planting 

Traits: Drought, low humidity resistance 
Transcription factor genes: G303; G325; G353; G720; G912; G926; G1452; G1794; G1820; G1843; G2053; G2140; G2153; G2379; G2583; G2701; G2719; G2789 
Utility: Survivability, yield, extended range 

Traits: Radiation resistance 
Transcription factor genes: G1052 
Utility: Survivability, vigor, appearance 

Traits: Decreased herbicide sensitivity 
Transcription factor genes: G343; G2133; G2517 
Utility: Resistant to increased herbicide use 

Traits: Increased herbicide sensitivity 
Transcription factor genes: G374; G877; G1519 
Utility: Use as a herbicide target 

Traits: Oxidative stress 
Transcription factor genes: G477; G789; G1807; G2133; G2517 
Utility: Improved yield, appearance, reduced senescence 

Traits: Light response 
Transcription factor genes: G183; G354; G375; G1062; G1322; G1331; G1488; G1494; G1521; G1786; G1794; G2144; G2555 
Utility: Germination, growth, development, flowering time 


Trait Category: Development, morphology 

Traits: Overall plant architecture 
Transcription factor genes: G24; G27; G31; G33; G47; G147; G156; G160; G182; G187; G195; G196; G211; G221; G237; G280; G342; G352; G357; G358; G360; G362; G364; G365; G367; G373; G377; G396; G431; G447; G479; G546; G546; G551; G578; G580; G596; G615; G617; G620; G625; G638; G658; G716; G725; G727; G730; G740; G770; G858; G865; G869; G872; G904; G910; G912; G920; G939; G963; G977; G979; G987; G988; G993; G1007; G1010; G1014; G1035; G1046; G1062; G1069; G1070; G1076; G1089; G1093; G1127; [illegible]; [illegible]; [illegible]; [illegible]; [illegible]; G1318; [illegible]; G1330; [illegible]; [illegible]; G1354; G1360; [illegible]; G1379; G1384; G1399; [illegible]; G1417; [illegible]; [illegible]; [illegible]; G1460; G1471; [illegible]; G1477; [illegible]; G1487; [illegible]; G1499; G1499; G1531; [illegible]; G1543; G1544; G1548; G1584; G1587; G1588; [illegible]; G1636; G1647; G1747; G1749; G1749; G1751; G1752; [illegible]; G1766; G1767; G1778; [illegible]; G1790; G1791; G1793; G1794; G1795; G1800; G1806; G1811; G1835; G1836; G1838; G1839; G1843; G1853; G1855; G1865; G1881; G1882; G1883; G1884; G1891; G1896; G1898; G1902; G1904; G1906; G1913; G1914; G1925; G1929; G1930; G1954; G1958; G1965; G1976; G2057; G2107; G2133; G2134; G2151; G2154; G2157; G2181; G2290; G2299; G2340; G2340; G2346; G2373; G2376; G2424; G2465; G2505; G2509; G2512; G2513; G2519; G2520; G2533; G2534; G2573; G2589; G2687; G2720; G2787; G2789; G2893 
Utility: Vascular tissues, lignin content; cell wall content; appearance 

Traits: Size: increased stature 
Transcription factor genes: G189; G1073; G1435; G2430 
Utility: 

Traits: Size: reduced stature or dwarfism 
Transcription factor genes: G3; G5; G21; G23; G39; G165; G184; G194; G258; G280; G340; G343; G353; G354; G362; G363; G370; G385; G396; G439; G440; G447; G450; G550; G557; G599; G636; G652; G670; G671; G674; G729; G760; G804; G831; G864; G884; G898; G900; G912; G913; G922; G932; G937; G939; G960; G962; G977; G991; G1000; G1008; G1020; G1023; G1053; G1067; G1075; G1137; G1181; G1198; G1228; G1266; G1267; G1275; G1277; G1309; G1311; G1314; G1317; G1322; G1323; G1326; G1332; G1334; G1367; G1381; G1382; G1386; G1421; G1488; G1494; G1537; G1545; G1560; G1586; G1641; G1652; G1655; G1671; G1750; G1756; G1757; G1782; G1786; G1794; G1839; G1845; G1879; G1886; G1888; G1933; G1939; G1943; G1944; G2011; G2094; G2115; G2130; G2132; G2144; G2145; G2147; G2156; G2294; G2313; G2344; G2431; G2510; G2517; G2521; G2893; G2893 
Utility: Ornamental; small stature provides wind resistance; creation of dwarf varieties 

Traits: Fruit size and number 
Transcription factor genes: G362 
Utility: Biomass, yield, cotton boll fiber density 

Traits: Flower structure, inflorescence 
Transcription factor genes: G47; G259; G353; G354; G671; G732; G988; G1000; G1063; G1140; G1326; G1449; G1543; G1560; G1587; G1645; G1947; G2108; G2143; G2893 
Utility: Ornamental horticulture; production of saffron or other edible flowers 

Traits: Number and development of trichomes 
Transcription factor genes: G225; G226; G247; G362; G585; G634; G676; G682; G1014; G1332; G1452; G1795; G2105 
Utility: Resistance to pests and desiccation; essential oil production 

Traits: Seed size, color, and number 
Transcription factor genes: G156; G450; G584; G652; G668; G858; G979; G1040; G1062; G1145; G1255; G1494; G1531; G1534; G1594; G2105; G2114 
Utility: Yield 

Traits: Root development, modifications 
Transcription factor genes: G9; G1482; G1534; G1794; G1852; G2053; G2136; G2140 
Utility: 

Traits: Modifications to root hairs 
Transcription factor genes: G225; G226 
Utility: Nutrient, water uptake, pathogen resistance 

Traits: Apical dominance 
Transcription factor genes: G559; G732; G1255; G1275; G1411; G1488; G1635; G2452; G2509 
Utility: Ornamental horticulture 

Traits: Branching patterns 
Transcription factor genes: G568; G988; G1548 
Utility: Ornamental horticulture, knot reduction, improved windscreen 

Traits: Leaf shape, color, modifications 
Transcription factor genes: G375; G377; G428; G438; G447; G464; G557; G577; G599; G635; G671; G674; G736; G804; G903; G977; G921; G922; G1038; G1063; G1067; G1073; G1075; G1146; G1152; G1198; G1267; G1269; G1452; G1484; G1586; G1594; G1767; G1786; G1792; G1886; G2059; G2094; G2105; G2113; G2117; G2143; G2144; G2431; G2452; G2465; G2587; G2583; G2724 
Utility: Appealing shape or shiny leaves for ornamental agriculture, increased biomass or photosynthesis 

Traits: Silique 
Transcription factor genes: G1134 
Utility: Ornamental 

Traits: Stem morphology 
Transcription factor genes: G47; G438; G671; G748; G988; G1000 
Utility: Ornamental; digestibility 

Traits: Shoot modifications 
Transcription factor genes: G390; G391 
Utility: Ornamental stem bifurcations 


Trait Category: Disease, Pathogen Resistance 

Traits: Bacterial 
Transcription factor genes: G211; G347; G367; G418; G525; G545; G578; G1049 
Utility: Yield, appearance, survivability, extended range 

Traits: Fungal 
Transcription factor genes: G19; G28; G28; G28; G147; G188; G207; G211; G237; G248; G278; G347; G367; G371; G378; G409; G477; G545; G545; G558; G569; G578; G591; G594; G616; G789; G805; G812; G865; G869; G872; G881; G896; G940; G1047; G1049; G1064; G1084; G1196; G1255; G1266; G1363; G1514; G1756; G1792; G1792; G1792; G1792; G1880; G1919; G1919; G1927; G1927; G1936; G1936; G1950; G2069; G2130; G2380; G2380; G2555 
Utility: Yield, appearance, survivability, extended range 


Trait Category: Nutrients 

Traits: Increased tolerance to nitrogen-limited soils 
Transcription factor genes: G225; G226; G1792 
Utility: 

Traits: Increased tolerance to phosphate-limited soils 
Transcription factor genes: G419; G545; G561; G1946 
Utility: 

Traits: Increased tolerance to potassium-limited soils 
Transcription factor genes: G561; G911 
Utility: 


Trait Category: Hormonal 

Traits: Hormone sensitivity 
Transcription factor genes: G12; G546; G926; G760; G913; G926; G1062; G1069; G1095; G1134; G1330; G1452; G1666; G1820; G2140; G2789 
Utility: Seed dormancy, drought tolerance; plant form, fruit ripening 


Trait Category: Seed biochemistry 

Traits: Production of seed prenyl lipids, including tocopherol 
Transcription factor genes: G214; G259; G490; G652; G748; G883; G1052; G1328; G1930; G2509; G2520 
Utility: Antioxidant activity, vitamin E 

Traits: Production of seed sterols 
Transcription factor genes: G20 
Utility: Precursors for human steroid hormones; cholesterol modulators 

Traits: Production of seed glucosinolates 
Transcription factor genes: G353; G484; G674; G1272; G1506; G1897; G1946; G2113; G2117; G2155; G2290; G2340 
Utility: Defense against insects; putative anticancer activity; undesirable in animal feeds 

Traits: Modified seed oil content 
Transcription factor genes: G162; G162; G180; G192; G241; G265; G286; G291; G427; G509; G519; G561; G567; G590; G818; G849; G892; G961; G974; G1063; G1143; G1190; G1198; G1226; G1229; G1323; G1451; G1471; G1478; G1496; G1526; G1543; G1640; G1644; G1646; G1672; G1677; G1750; G1765; G1777; G1793; G1838; G1902; G1946; G1948; G2059; G2123; G2138; G2139; G2343; G2792; G2830 
Utility: Vegetable oil production; increased caloric value for animal feeds; lutein content 

Traits: Modified seed oil composition 
Transcription factor genes: G217; G504; G622; G778; G791; G861; G869; G938; G965; G1417; G2192 
Utility: Heat stability, digestibility of seed oils 

Traits: Modified seed protein content 
Transcription factor genes: G162; G226; G241; G371; G427; G509; G567; G597; G732; G849; G865; G892; G963; G988; G1323; G1323; G1419; G1478; G1488; G1634; G1637; G1641; G1644; G1652; G1677; G1777; G1777; G1818; G1820; G1903; G1909; G1946; G1946; G1958; G2059; G2117; G2417; G2509 
Utility: Reduced caloric value for humans 


Trait Category: Leaf biochemistry 

Traits: Production of flavonoids 
Transcription factor genes: G1666* 
Utility: Ornamental pigment production; pathogen resistance; health benefits 

Traits: Production of leaf glucosinolates 
Transcription factor genes: G264; G353; G484; G652; G674; G681; G1069; G1198; G1322; G1421; G1657; G1794; G1897; G1946; G2115; G2117; G2144; G2155; G2155; G2340; G2512; G2520; G2552 
Utility: Defense against insects; putative anticancer activity; undesirable in animal feeds 

Traits: Production of diterpenes 
Transcription factor genes: G229 
Utility: Induction of enzymes involved in alkaloid biosynthesis 

Traits: Production of anthocyanin 
Transcription factor genes: G546 
Utility: Ornamental pigment 

Traits: Production of leaf phytosterols, inc. stigmastanol, campesterol 
Transcription factor genes: G561; G2131; G2424 
Utility: Precursors for human steroid hormones; cholesterol modulators 

Traits: Leaf fatty acid composition 
Transcription factor genes: G214; G377; G861; G962; G975; G987; G1266; G1337; G1399; G1465; G1512; G2136; G2147; G2192 
Utility: Nutritional value; increase in waxes for disease resistance 

Traits: Production of leaf prenyl lipids, including tocopherol 
Transcription factor genes: G214; G259; G280; G652; G987; G1543; G2509; G2520 
Utility: Antioxidant activity, vitamin E 


Trait Category: Biochemistry, general 

Traits: Production of miscellaneous secondary metabolites 
Transcription factor genes: G229; G663 
Utility: 

Traits: Sugar, starch, hemicellulose composition 
Transcription factor genes: G158; G211; G211; G237; G242; G274; G598; G1012; G1266; G1309; G1309; G1641; G1765; G1865; G2094; G2094; G2589; G2589 
Utility: Food digestibility, hemicellulose & pectin content; fiber content; plant tensile strength, wood quality, pathogen resistance, pulp production; tuber starch content 


Trait Category: Sugar sensing 

Traits: Plant response to sugars 
Transcription factor genes: G26; G38; G43; G207; G218; G241; G254; G263; G308; G536; G567; G567; G680; G867; G912; G956; G996; G1068; G1225; G1314; G1314; G1337; G1759; G1804; G2153; G2379 
Utility: Photosynthetic rate, carbohydrate accumulation, biomass production, source-sink relationships, senescence 


Trait Category: Growth, Reproduction 

Traits: Plant growth rate and development 
Transcription factor genes: G447; G617; G674; G730; G917; G937; G1035; G1046; G1131; G1425; G1452; G1459; G1492; G1589; G1652; G1879; G1943; G2430; G2431; G2465; G2521 
Utility: Faster growth, increased biomass or yield, improved appearance; delay in bolting 

Traits: Embryo development 
Transcription factor genes: G167 
Utility: 

Traits: Seed germination rate 
Transcription factor genes: G979; G1792; G2130 
Utility: Yield 

Traits: Plant, seedling vigor 
Transcription factor genes: G561; G2346 
Utility: Survivability, yield 

Traits: Senescence; cell death 
Transcription factor genes: G571; G636; G878; G1050; G1463; G1749; G1944; G2130; G2155; G2340; G2383 
Utility: Yield, appearance; response to pathogens 

Traits: Modified fertility 
Transcription factor genes: G39; G340; G439; G470; G559; G615; G652; G671; G779; G962; G977; G988; G1000; G1063; G1067; G1075; G1266; G1311; G1321; G1326; G1367; G1386; G1421; G1453; G1471; G1453; G1560; G1594; G1635; G1750; G1947; G2011; G2094; G2113; G2115; G2130; G2143; G2147; G2294; G2510; G2893 
Utility: Prevents or minimizes escape of the pollen of GMOs 

Traits: Early flowering 
Transcription factor genes: G147; G157; G180; G183; G183; G184; G185; G208; G227; G294; G390; G390; G390; G391; G391; G427; G427; G490; G565; G590; G592; G720; G789; G865; G898; G898; G989; G989; G1037; G1037; [illegible]; G1225; G1225; G1226; G1242; G1305; G1305; G1380; G1380; G1480; G1480; G1488; G1494; G1545; G1545; G1649; G1706; G1760; G1767; G1767; G1820; G1841; G1841; G1842; G1843; G1843; G1946; G1946; G2010; G2030; G2030; G2144; G2144; G2295; G2295; G2347; G2348; G2348; G2373; G2373; G2509; G2509; G2555; G2555 
Utility: Faster generation time; synchrony of flowering; potential for introducing new traits to single variety 

Traits: Delayed flowering 
Transcription factor genes: G8; G47; G192; G214; [illegible]; G361; G362; G562; G568; G571; G591; G680; G736; G748; G859; G878; G910; G912; G913; G971; G994; G1051; G1052; G1073; G1079; G1335; G1435; G1452; G1478; G1789; G1804; G1865; G1865; G1895; G1900; G2007; G2133; G2155; G2291; G2465 
Utility: Delayed time to pollen production of GMO plants; synchrony of flowering; increased yield 

Traits: Extended flowering phase 
Transcription factor genes: G1947 
Utility: 

Traits: Flower and leaf development 
Transcription factor genes: G259; G353; G377; G580; G638; G652; G858; G869; G917; G922; G932; G1063; G1075; G1140; G1425; G1452; G1499; G1548; G1645; G1865; G1897; G1933; G2094; G2124; G2140; G2143; G2535; G2557 
Utility: Ornamental applications; decreased fertility 

Traits: Flower abscission 
Transcription factor genes: G1897 
Utility: Ornamental: longer retention of flowers 

* When co-expressed with G669 and G663 



Significance of modified plant traits 

Currently, the existence of a series of maturity groups for different latitudes 
represents a major barrier to the introduction of new valuable traits. Any trait (e.g. 
disease resistance) has to be bred into each of the different maturity groups separately, 
a laborious and costly exercise. The availability of a single strain, which could be 
grown at any latitude, would therefore greatly increase the potential for introducing 
new traits to crop species such as soybean and cotton. 



For many of the traits, listed in Table 6 and below, that may be conferred to 
plants, a single transcription factor gene may be used to increase or decrease, advance 
or delay, or improve or prove deleterious to a given trait. For example, 
overexpression of a transcription factor gene that naturally occurs in a plant may 
cause early flowering relative to non-transformed or wild-type plants. By knocking 
out the gene, or suppressing the gene (with, for example, antisense suppression) the 
plant may experience delayed flowering. Similarly, overexpressing or suppressing 
one or more genes can impart significant differences in production of plant products, 
such as different fatty acid ratios. Thus, suppressing a gene that causes a plant to be 
more sensitive to cold may improve a plant's tolerance of cold. 

Salt stress resistance. Soil salinity is one of the more important variables that 
determine where a plant may thrive. Salinity is especially important for the 
successful cultivation of crop plants, particularly in many parts of the world that have 
naturally high soil salt concentrations, or where the soil has been over-utilized. Thus, 
presently disclosed transcription factor genes that provide increased salt tolerance 
during germination, the seedling stage, and throughout a plant's life cycle would find 
particular value for imparting survivability and yield in areas where a particular crop 
would not normally prosper. 

Osmotic stress resistance. Presently disclosed transcription factor genes that 
confer resistance to osmotic stress may increase germination rate under adverse 
conditions, which could impact survivability and yield of seeds and plants. 

Cold stress resistance. The potential utility of presently disclosed transcription 
factor genes that increase tolerance to cold is to confer better germination and growth 
in cold conditions. The germination of many crops is very sensitive to cold 
temperatures. Genes that would allow germination and seedling vigor in the cold 
would have highly significant utility in allowing seeds to be planted earlier in the 
season with a high rate of survivability. Transcription factor genes that confer better 
survivability in cooler climates allow a grower to move up planting time in the spring 
and extend the growing season further into autumn for higher crop yields. 

Tolerance to freezing. The presently disclosed transcription factor genes that 
impart tolerance to freezing conditions are useful for enhancing the survivability and 
appearance of plants under conditions that would otherwise cause extensive 
cellular damage. Thus, germination of seeds and survival may take place at 
temperatures significantly below the mean temperature required for 
germination of seeds and survival of non-transformed plants. As with salt tolerance, 
this has the added benefit of increasing the potential range of a crop plant into regions 
in which it would otherwise succumb. Cold tolerant transformed plants may also be 
planted earlier in the spring or later in autumn, with greater success than with 
non-transformed plants. 

Heat stress tolerance. The germination of many crops is also sensitive to high 
temperatures. Presently disclosed transcription factor genes that provide increased 
heat tolerance are generally useful in producing plants that germinate and grow in hot 
conditions; they may find particular use for crops that are planted late in the season, or 
may extend the range of a plant by allowing growth in relatively hot climates. 

Drought, low humidity tolerance. Strategies that allow plants to survive in 
low water conditions may include, for example, reduced surface area or surface oil or 
wax production. A number of presently disclosed transcription factor genes increase 
a plant's tolerance to low water conditions and provide the benefits of improved 
survivability, increased yield and an extended geographic and temporal planting 
range. 

Radiation resistance. Presently disclosed transcription factor genes have been 
shown to increase lutein production. Lutein, like other xanthophylls such as 
zeaxanthin and violaxanthin, is important in the protection of plants against the 
damaging effects of excessive light. Lutein contributes, directly or indirectly, to the 
rapid rise of non-photochemical quenching in plants exposed to high light. Increased 
tolerance of field plants to visible and ultraviolet light impacts survivability and vigor, 
particularly for recent transplants. Also affected are the yield and appearance of 
harvested plants or plant parts. Crop plants engineered with presently disclosed 
transcription factor genes that cause the plant to produce higher levels of lutein 
therefore would have improved photoprotection, leading to less oxidative damage and 
increased vigor, survivability and yield under high light and ultraviolet light 
conditions. 

Decreased herbicide sensitivity. Presently disclosed transcription factor genes 
that confer resistance or tolerance to herbicides (e.g., glyphosate) may find use in 
providing means to increase herbicide applications without detriment to desirable 
plants. This would allow for the increased use of a particular herbicide in a local 
environment, with the effect of increased detriment to undesirable species and less 
harm to transgenic, desirable cultivars. 

Increased herbicide sensitivity. Knockouts of a number of the presently 
disclosed transcription factor genes have been shown to be lethal to developing 
embryos. Thus, these genes are potentially useful as herbicide targets. 

Oxidative stress. In plants, as in all living things, abiotic and biotic stresses 
induce the formation of oxygen radicals, including superoxide and peroxide radicals. 
This has the effect of accelerating senescence, particularly in leaves, with the resulting 
loss of yield and adverse effect on appearance. Generally, plants that have the highest 
level of defense mechanisms, such as, for example, polyunsaturated moieties of 
membrane lipids, are most likely to thrive under conditions that introduce oxidative 
stress (e.g., high light, ozone, water deficit, particularly in combination). Introduction 
of the presently disclosed transcription factor genes that increase the level of oxidative 
stress defense mechanisms would provide beneficial effects on the yield and 
appearance of plants. One specific oxidizing agent, ozone, has been shown to cause 
significant foliar injury, which impacts yield and appearance of crop and ornamental 
plants. In addition to the reduced foliar injury that would be found in ozone-resistant 
plants created by transforming plants with some of the presently disclosed transcription 
factor genes, such plants have also been shown to have increased chlorophyll 
fluorescence (Yu-Sen Chang et al. Bot. Bull. Acad. Sin. (2001) 42: 265-272). 

Heavy metal tolerance. Heavy metals such as lead, mercury, arsenic, 
chromium and others may have a significant adverse impact on plant respiration. 
Plants that have been transformed with presently disclosed transcription factor genes 
that confer improved resistance to heavy metals through, for example, sequestering or 
reduced uptake of the metals, will show improved vigor and yield in soils with 
relatively high concentrations of these elements. Conversely, transgenic transcription 
factors may also be introduced into plants to confer an increase in heavy metal uptake, 
which may benefit efforts to clean up contaminated soils. 

Light response. Presently disclosed transcription factor genes that modify a 
plant's response to light may be useful for modifying a plant's growth or 
development, for example, photomorphogenesis in poor light, or accelerating 
flowering time in response to various light intensities, quality or duration to which a 
non-transformed plant would not similarly respond. Examples of such responses that 
have been demonstrated include leaf number and arrangement, and early flower bud 
appearance. 

Overall plant architecture. Several presently disclosed transcription factor 
genes have been introduced into plants to alter numerous aspects of the plant's 
morphology. For example, it has been demonstrated that a number of transcription 
factors may be used to manipulate branching, such as the means to modify lateral 
branching, a possible application in the forestry industry. Transgenic plants have also 
been produced that have altered cell wall content, lignin production, flower organ 
number, or overall shape of the plants. Presently disclosed transcription factor genes 
transformed into plants may be used to affect plant morphology by increasing or 
decreasing internode distance, both of which may be advantageous under different 
circumstances. For example, for fast growth of woody plants to provide more 
biomass, or fewer knots, increased internode distances are generally desirable. For 
improved wind screening of shrubs or trees, or harvesting characteristics of, for 
example, members of the Gramineae family, decreased internode distance may be 
advantageous. These modifications would also prove useful in the ornamental 
horticulture industry for the creation of unique phenotypic characteristics of 
ornamental plants. 

Increased stature. For some ornamental plants, the ability to provide larger 
varieties may be highly desirable. For many plants, including fruit-bearing trees or 
trees and shrubs that serve as view or wind screens, increased stature provides 
obvious benefits. Crop species may also produce higher yields on larger cultivars. 

Reduced stature or dwarfism. Presently disclosed transcription factor genes 
that decrease plant stature can be used to produce plants that are more resistant to 
damage by wind and rain, or more resistant to heat or low humidity or water deficit. 
Dwarf plants are also of significant interest to the ornamental horticulture industry, 
and particularly for home garden applications for which space availability may be 
limited. 

Fruit size and number. Introduction of presently disclosed transcription factor 
genes that affect fruit size will have desirable impacts on fruit size and number, which 
may comprise increases in yield for fruit crops, or reduced fruit yield, such as when 
vegetative growth is preferred (e.g., with bushy ornamentals, or where fruit is 
undesirable, as with ornamental olive trees). 

Flower structure, inflorescence, and development. Presently disclosed 
transgenic transcription factors have been used to create plants with larger flowers or 
arrangements of flowers that are distinct from wild-type or non-transformed cultivars. 
This would likely have the most value for the ornamental horticulture industry, where 
larger flowers or interesting presentations generally are preferred and command the 
highest prices. Flower structure may have advantageous effects on fertility, and could 
be used, for example, to decrease fertility by the absence, reduction or screening of 
reproductive components. One interesting application for manipulation of flower 
structure, for example, by introduced transcription factors, could be in the increased 
production of edible flowers or flower parts, including saffron, which is derived from 
the stigmas of Crocus sativus. 

Number and development of trichomes. Several presently disclosed 
transcription factor genes have been used to modify trichome number and amount of 
trichome products in plants. Trichome glands on the surface of many higher plants 
produce and secrete exudates that give protection from the elements and pests such as 
insects, microbes and herbivores. These exudates may physically immobilize insects 
and spores, may be insecticidal or antimicrobial, or they may act as allergens or 
irritants to protect against herbivores. Trichomes have also been suggested to decrease 
transpiration by decreasing leaf surface air flow, and by exuding chemicals that 
protect the leaf from the sun. 

Seed size, color and number. The introduction of presently disclosed 
transcription factor genes into plants that alter the size or number of seeds may have a 
significant impact on yield, both when the product is the seed itself, or when biomass 
of the vegetative portion of the plant is increased by reducing seed production. In the 
case of fruit products, it is often advantageous to modify a plant to have reduced size 
or number of seeds relative to non-transformed plants to provide seedless varieties or 
varieties with fewer or smaller seeds. Presently disclosed transcription factor genes 
have also been shown to affect seed size, including the development of larger seeds. 
Seed size, in addition to seed coat integrity, thickness and permeability, seed water 
content and a number of other components including antioxidants and 
oligosaccharides, may affect seed longevity in storage. This would be an important 
utility when the seed of a plant is the harvested crop, as with, for example, peas, 
beans, nuts, etc. Presently disclosed transcription factor genes have also been used to 
modify seed color, which could provide added appeal to a seed product. 

Root development, modifications. By modifying the structure or development
of roots by transforming into a plant one or more of the presently disclosed 
transcription factor genes, plants may be produced that have the capacity to thrive in 
otherwise unproductive soils. For example, grape roots that extend further into rocky 
soils, or that remain viable in waterlogged soils, would increase the effective planting 
range of the crop. It may be advantageous to manipulate a plant to produce short 
roots, as when a soil in which the plant will be growing is occasionally flooded, or 
when pathogenic fungi or disease-causing nematodes are prevalent. 

Modifications to root hairs. Presently disclosed transcription factor genes that
increase root hair length or number potentially could be used to increase root growth 
or vigor, which might in turn allow better plant growth under adverse conditions such 
as limited nutrient or water availability. 

Apical dominance. The modified expression of presently disclosed
transcription factors that control apical dominance could be used in ornamental 
horticulture, for example, to modify plant architecture. 

Branching patterns. Several presently disclosed transcription factor genes have 
been used to manipulate branching, which could provide benefits in the forestry 
industry. For example, reduction in the formation of lateral branches could reduce 
knot formation. Conversely, increasing the number of lateral branches could provide 
utility when a plant is used as a windscreen, or may also provide ornamental 
advantages.

Leaf shape, color and modifications. It has been demonstrated in laboratory
experiments that overexpression of some of the presently disclosed transcription 
factors produced marked effects on leaf development. At early stages of growth, these 
transgenic seedlings developed narrow, upward pointing leaves with long petioles, 
possibly indicating a disruption in circadian-clock controlled processes or nyctinastic 
movements. Other transcription factor genes can be used to increase plant biomass; 
large size would be useful in crops where the vegetative portion of the plant is the 
marketable portion. 

Siliques. Genes that alter silique conformation in brassicates may be used to
modify fruit ripening processes in brassicates and other plants, which may positively 
affect seed or fruit quality. 

Stem morphology and shoot modifications. Laboratory studies have
demonstrated that introducing several of the presently disclosed transcription factor 
genes into plants can cause stem bifurcations in shoots, in which the shoot meristems 
split to form two or three separate shoots. This unique appearance would be desirable 
in ornamental applications. 

Diseases, pathogens and pests. A number of the presently disclosed
transcription factor genes have been shown to or are likely to confer resistance to
various plant diseases, pathogens and pests. The offending organisms include fungal
pathogens Fusarium oxysporum, Botrytis cinerea, Sclerotinia sclerotiorum, and
Erysiphe orontii. Bacterial pathogens to which resistance may be conferred include
Pseudomonas syringae. Other problem organisms may potentially include
nematodes, mollicutes, parasites, or herbivorous arthropods. In each case, one or
more transformed transcription factor genes may provide some benefit to the plant to
help prevent or overcome infestation. The mechanisms by which the transcription
factors work could include increasing surface waxes or oils, surface thickness, local
senescence, or the activation of signal transduction pathways that regulate plant
defense in response to attacks by herbivorous pests (including, for example, protease
inhibitors).

Increased tolerance of plants to nutrient-limited soils. Presently disclosed
transcription factor genes introduced into plants may provide the means to improve
uptake of essential nutrients, including nitrogenous compounds, phosphates,
potassium, and trace minerals. The effect of these modifications is to increase the
seedling germination and range of ornamental and crop plants. The utilities of
presently disclosed transcription factor genes conferring tolerance to conditions of
low nutrients also include cost savings to the grower by reducing the amounts of
fertilizer needed, environmental benefits of reduced fertilizer runoff, and improved
yield and stress tolerance. In addition, these genes could be used to alter seed protein
amounts and/or composition, which could impact yield as well as the nutritional value
and production of various food products.

Hormone sensitivity. One or more of the presently disclosed transcription 
factor genes have been shown to affect plant abscisic acid (ABA) sensitivity. This 
plant hormone is likely the most important hormone in mediating the adaptation of a 
plant to stress. For example, ABA mediates conversion of apical meristems into 
dormant buds. In response to increasingly cold conditions, the newly developing 
leaves growing above the meristem become converted into stiff bud scales that closely 
wrap the meristem and protect it from mechanical damage during winter. ABA in the 
bud also enforces dormancy; during premature warm spells, the buds are inhibited 
from sprouting. Bud dormancy is eliminated after either a prolonged period of
cold or a significant number of lengthening days. Thus, by affecting ABA sensitivity,
introduced transcription factor genes may affect cold sensitivity and survivability.
ABA is also important in protecting plants from drought.

Several other of the present transcription factor genes have been used to 
manipulate ethylene signal transduction and response pathways. These genes can thus 
be used to manipulate the processes influenced by ethylene, such as seed germination 
or fruit ripening, and to improve seed or fruit quality. 

Production of seed and leaf prenyl lipids, including tocopherol. Prenyl lipids
play a role in anchoring proteins in membranes or membranous organelles. Thus
modifying the prenyl lipid content of seeds and leaves could affect membrane
integrity and function. A number of presently disclosed transcription factor genes
have been shown to modify the tocopherol composition of plants. Tocopherols have
both anti-oxidant and vitamin E activity.

Production of seed and leaf phytosterols. Presently disclosed transcription
factor genes that modify levels of phytosterols in plants may have at least two 
utilities. First, phytosterols are an important source of precursors for the manufacture 
of human steroid hormones. Thus, regulation of transcription factor expression or 
activity could lead to elevated levels of important human steroid precursors for steroid 
semi-synthesis. For example, transcription factors that cause elevated levels of 
campesterol in leaves, or sitosterols and stigmasterols in seed crops, would be useful 
for this purpose. Phytosterols and their hydrogenated derivatives phytostanols also 
have proven cholesterol-lowering properties, and transcription factor genes that 
modify the expression of these compounds in plants would thus provide health 
benefits.

Production of seed and leaf glucosinolates. Some glucosinolates have
anti-cancer activity; thus, increasing the levels or altering the composition of these
compounds by introducing several of the presently disclosed transcription factors might
be of interest from a nutraceutical standpoint. Glucosinolates also form part of a plant's
natural defense against insects. Modification of glucosinolate composition or quantity could
therefore afford increased protection from predators. Furthermore, in edible crops, 
tissue specific promoters might be used to ensure that these compounds accumulate 
specifically in tissues, such as the epidermis, which are not taken for consumption. 

Modified seed oil content. The composition of seeds, particularly with respect
to seed oil amounts and/or composition, is very important for the nutritional value and
production of various food and feed products. Several of the presently disclosed
transcription factor genes involved in seed lipid saturation that alter seed oil content
could be used to improve the heat stability of oils or to improve the nutritional quality
of seed oil, by, for example, reducing the number of calories in seed, increasing the
number of calories in animal feeds, or altering the ratio of saturated to unsaturated
lipids comprising the oils.



Seed and leaf fatty acid composition. A number of the presently disclosed
transcription factor genes have been shown to alter the fatty acid composition in
plants, and seeds in particular. This modification may find particular value for
improving the nutritional value of, for example, seeds or whole plants. Dietary fatty
acid ratios have been shown to have an effect on, for example, bone integrity and
remodeling (see, for example, Weiler, H.A., Pediatr Res (2000) 47:5 692-697). The
ratio of dietary fatty acids may alter the precursor pools of long-chain polyunsaturated 
fatty acids that serve as precursors for prostaglandin synthesis. In mammalian 
connective tissue, prostaglandins serve as important signals regulating the balance 
between resorption and formation in bone and cartilage. Thus dietary fatty acid ratios 
altered in seeds may affect the etiology and outcome of bone loss. 

Modified seed protein content. As with seed oils, the composition of seeds,
particularly with respect to protein amounts and/or composition, is very important for
the nutritional value and production of various food and feed products. A number of
the presently disclosed transcription factor genes that modify the protein concentrations
in seeds would provide nutritional benefits, and may be used to prolong storage, increase
seed pest or disease resistance, or modify germination rates.

Production of flavonoids in leaves and other plant parts. Expression of
presently disclosed transcription factor genes that increase flavonoid production in
plants, including anthocyanins and condensed tannins, may be used to alter pigment
production for horticultural purposes, and possibly to increase stress resistance.
Flavonoids have antimicrobial activity and could be used to engineer pathogen 
resistance. Several flavonoid compounds have health promoting effects such as the 
inhibition of tumor growth and cancer, prevention of bone loss and the prevention of 
the oxidation of lipids. Increasing levels of condensed tannins, whose biosynthetic 
pathway is shared with anthocyanin biosynthesis, in forage legumes is an important 
agronomic trait because they prevent pasture bloat by collapsing protein foams within 
the rumen. For a review on the utilities of flavonoids and their derivatives, refer to 
Dixon et al. (1999) Trends Plant Sci. 4:394-400. 

Production of diterpenes in leaves and other plant parts. Depending on the
plant species, varying amounts of diverse secondary biochemicals (often lipophilic
terpenes) are produced and exuded or volatilized by trichomes. These exotic
secondary biochemicals, which are relatively easy to extract because they are on the
surface of the leaf, have been widely used in such products as flavors and aromas,
drugs, pesticides and cosmetics. Thus, the overexpression of genes that are used to
produce diterpenes in plants may be accomplished by introducing transcription factor
genes that induce said overexpression. One class of secondary metabolites, the
diterpenes, can affect several biological systems such as tumor progression,
prostaglandin synthesis and tissue inflammation. In addition, diterpenes can act as
insect pheromones and termite allomones, and can exhibit neurotoxic, cytotoxic and
antimitotic activities. As a result of this functional diversity, diterpenes have been the
target of research in several pharmaceutical ventures. In most cases where the metabolic
pathways are impossible to engineer, increasing trichome density or size on leaves
may be the only way to increase plant productivity.

Production of anthocyanin in leaves and other plant parts. Several presently
disclosed transcription factor genes can be used to alter anthocyanin production in
numerous plant species. The potential utilities of these genes include alterations in
pigment production for horticultural purposes, and possibly increasing stress
resistance in combination with another transcription factor.

Production of miscellaneous secondary metabolites. Microarray data suggest
that flux through the aromatic amino acid biosynthetic pathways and primary and
secondary metabolite biosynthetic pathways is up-regulated. Presently disclosed
transcription factors have been shown to be involved in regulating alkaloid
biosynthesis, in part by up-regulating the enzymes indole-3-glycerol phosphatase and
strictosidine synthase. Phenylalanine ammonia lyase, chalcone synthase and
trans-cinnamate mono-oxygenase are also induced, and are involved in phenylpropanoid
biosynthesis.

Sugar, starch, hemicellulose composition. Overexpression of the presently
disclosed transcription factors that affect sugar content resulted in plants with altered
leaf insoluble sugar content. Transcription factors that alter plant cell wall
composition have several potential applications including altering food digestibility,
plant tensile strength, wood quality, pathogen resistance and pulp production. The
potential utilities of a gene involved in glucose-specific sugar sensing are to alter
energy balance, photosynthetic rate, carbohydrate accumulation, biomass production,
source-sink relationships, and senescence.

Hemicellulose is not desirable in paper pulps because of its lack of strength 
compared with cellulose. Thus modulating the amounts of cellulose vs. hemicellulose 
in the plant cell wall is desirable for the paper/lumber industry. Increasing the 
insoluble carbohydrate content in various fruits, vegetables, and other edible 
consumer products will result in enhanced fiber content. Increased fiber content 
would not only provide health benefits in food products, but might also increase 
digestibility of forage crops. In addition, the hemicellulose and pectin content of fruits 
and berries affects the quality of jam and catsup made from them. Changes in 
hemicellulose and pectin content could result in a superior consumer product. 

Plant response to sugars and sugar composition. In addition to their important
role as an energy source and structural component of the plant cell, sugars are central 
regulatory molecules that control several aspects of plant physiology, metabolism and 
development. It is thought that this control is achieved by regulating gene expression 
and, in higher plants, sugars have been shown to repress or activate plant genes 
involved in many essential processes such as photosynthesis, glyoxylate metabolism, 
respiration, starch and sucrose synthesis and degradation, pathogen response, 
wounding response, cell cycle regulation, pigmentation, flowering and senescence. 
The mechanisms by which sugars control gene expression are not understood. 

Because sugars are important signaling molecules, the ability to control either 
the concentration of a signaling sugar or how the plant perceives or responds to a 
signaling sugar could be used to control plant development, physiology or 
metabolism. For example, the flux of sucrose (a disaccharide sugar used for 
systemically transporting carbon and energy in most plants) has been shown to affect 
gene expression and alter storage compound accumulation in seeds. Manipulation of 
the sucrose signaling pathway in seeds may therefore cause seeds to have more 
protein, oil or carbohydrate, depending on the type of manipulation. Similarly, in 
tubers, sucrose is converted to starch which is used as an energy store. It is thought 
that sugar signaling pathways may partially determine the levels of starch synthesized
in the tubers. The manipulation of sugar signaling in tubers could lead to tubers with a
higher starch content.

Thus, the presently disclosed transcription factor genes that manipulate the 
sugar signal transduction pathway may lead to altered gene expression to produce 
plants with desirable traits. In particular, manipulation of sugar signal transduction 
pathways could be used to alter source-sink relationships in seeds, tubers, roots and 
other storage organs, leading to an increase in yield.

Plant growth rate and development. A number of the presently disclosed
transcription factor genes have been shown to have significant effects on plant growth 
rate and development. These observations have included, for example, more rapid or 
delayed growth and development of reproductive organs. This would provide utility 
for regions with short or long growing seasons, respectively. Accelerating plant 
growth would also improve early yield or increase biomass at an earlier stage, when 
such is desirable (for example, in producing forestry products). 

Embryo development. Presently disclosed transcription factor genes that alter
embryo development have been used to alter seed protein and oil amounts and/or
composition, which is very important for the nutritional value and production of
various food products. Seed shape and seed coat may also be altered by these genes,
which may provide for improved storage stability.

Seed germination rate. A number of the presently disclosed transcription
factor genes have been shown to modify seed germination rate, including when the 
seeds are in conditions normally unfavorable for germination (e.g., cold, heat or salt 
stress, or in the presence of ABA), and may thus be used to modify and improve 
germination rates under adverse conditions. 

Plant, seedling vigor. Seedlings transformed with presently disclosed
transcription factors have been shown to possess larger cotyledons and appeared
somewhat more advanced than control plants. This indicates that the seedlings
developed more rapidly than the control plants. Rapid seedling development is likely
to reduce loss due to diseases particularly prevalent at the seedling stage (e.g.,
damping off) and is thus important for survivability of plants germinating in the field
or in controlled environments.

Senescence, cell death. Presently disclosed transcription factor genes may be
used to alter senescence responses in plants. Although leaf senescence is thought to be 
an evolutionary adaptation to recycle nutrients, the ability to control senescence in an 
agricultural setting has significant value. For example, a delay in leaf senescence in 
some maize hybrids is associated with a significant increase in yields and a delay of a 
few days in the senescence of soybean plants can have a large impact on yield. 
Delayed flower senescence may also generate plants that retain their blossoms longer 
and this may be of potential interest to the ornamental horticulture industry. 

Modified fertility. Plants that overexpress a number of the presently disclosed
transcription factor genes have been shown to possess reduced fertility. This could 
be a desirable trait, as it could be exploited to prevent or minimize the escape of the 
pollen of genetically modified organisms (GMOs) into the environment. 

Early and delayed flowering. Presently disclosed transcription factor genes
that accelerate flowering could have valuable applications in breeding programs, since
they allow much faster generation times. In a number of species, for example,
broccoli and cauliflower, where the reproductive parts of the plants constitute the crop
and the vegetative tissues are discarded, it would be advantageous to accelerate time
to flowering. Accelerating flowering could shorten crop and tree breeding programs.
Additionally, in some instances, a faster generation time might allow additional
harvests of a crop to be made within a given growing season. A number of
Arabidopsis genes have already been shown to accelerate flowering when
constitutively expressed. These include LEAFY, APETALA1 and CONSTANS
(Mandel, M. et al., 1995, Nature 377, 522-524; Weigel, D. and Nilsson, O., 1995,
Nature 377, 495-500; Simon et al., 1996, Nature 384, 59-62).

By regulating the expression of potential flowering genes using inducible promoters,
flowering could be triggered by application of an inducer chemical. This would allow
flowering to be synchronized across a crop and facilitate more efficient harvesting.
Such inducible systems could also be used to time the flowering of crop varieties to
different latitudes. At present, species such as soybean and cotton are available as a
series of maturity groups that are suitable for different latitudes on the basis of their
flowering time (which is governed by day-length). A system in which flowering could
be chemically controlled would allow a single high-yielding northern maturity group
to be grown at any latitude. In southern regions such plants could be grown for longer,
thereby increasing yields, before flowering was induced. In more northern areas the
induction would be used to ensure that the crop flowers prior to the first winter frosts.

In a sizeable number of species, for example, root crops, where the vegetative 
parts of the plants constitute the crop and the reproductive tissues are discarded, it 
would be advantageous to delay or prevent flowering. Extending vegetative 
development with presently disclosed transcription factor genes could thus bring 
about large increases in yields. Prevention of flowering might help maximize
vegetative yields and prevent escape of genetically modified organism (GMO) pollen. 

Extended flowering phase. Presently disclosed transcription factors that extend
flowering time have utility in engineering plants with longer-lasting flowers for the 
horticulture industry, and for extending the time in which the plant is fertile. 

Flower and leaf development. Presently disclosed transcription factor genes
have been used to modify the development of flowers and leaves. This could be
advantageous in the development of new ornamental cultivars that present unique
configurations. In addition, some of these genes have been shown to reduce a plant's
fertility, which is also useful for helping to prevent development of pollen of GMOs.

Flower abscission. Presently disclosed transcription factor genes introduced
into plants have been used to retain flowers for longer periods. This would provide a
significant benefit to the ornamental industry, for both cut flowers and woody plant
varieties (of, for example, maize), as well as have the potential to lengthen the fertile
period of a plant, which could positively impact yield and breeding programs.

A listing of specific effects and utilities that the presently disclosed
transcription factor genes have on plants, as determined by direct observation and 
assay analysis, is provided in Table 4. 

XVI. Antisense and Co-suppression 

In addition to expression of the nucleic acids of the invention as gene 
replacement or plant phenotype modification nucleic acids, the nucleic acids are also 
useful for sense and anti-sense suppression of expression, e.g., to down-regulate 
expression of a nucleic acid of the invention, e.g., as a further mechanism for 
modulating plant phenotype. That is, the nucleic acids of the invention, or 
subsequences or anti-sense sequences thereof, can be used to block expression of 
naturally occurring homologous nucleic acids. A variety of sense and anti-sense 
technologies are known in the art, e.g., as set forth in Lichtenstein and Nellen (1997)
Antisense Technology: A Practical Approach IRL Press at Oxford University Press,
Oxford, U.K. In general, sense or anti-sense sequences are introduced into a cell,
where they are optionally amplified, e.g., by transcription. Such sequences include
both simple oligonucleotide sequences and catalytic sequences such as ribozymes. 

For example, a reduction or elimination of expression (i.e., a "knock-out") of a 
transcription factor or transcription factor homologue polypeptide in a transgenic 
plant, e.g., to modify a plant trait, can be obtained by introducing an antisense construct 
corresponding to the polypeptide of interest as a cDNA. For antisense suppression, the 
transcription factor or homologue cDNA is arranged in reverse orientation (with 
respect to the coding sequence) relative to the promoter sequence in the expression 
vector. The introduced sequence need not be the full length cDNA or gene, and need 
not be identical to the cDNA or gene found in the plant type to be transformed. 
Typically, the antisense sequence need only be capable of hybridizing to the target 
gene or RNA of interest. Thus, where the introduced sequence is of shorter length, a 
higher degree of homology to the endogenous transcription factor sequence will be 
needed for effective antisense suppression. While antisense sequences of various 
lengths can be utilized, preferably, the introduced antisense sequence in the vector 
will be at least 30 nucleotides in length, and improved antisense suppression will 
typically be observed as the length of the antisense sequence increases. Preferably, 
the length of the antisense sequence in the vector will be greater than 100 nucleotides.
Transcription of an antisense construct as described results in the production of RNA
molecules that are the reverse complement of mRNA molecules transcribed from the 
endogenous transcription factor gene in the plant cell. 
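
The orientation and length guidance above can be illustrated with a minimal Python sketch. It is not part of the disclosed methods: the function names and the default thresholds (30 nucleotides as a minimum, more than 100 nucleotides as preferred, both taken from the text) are assumptions chosen for illustration, and sequences are treated as plain strings.

```python
# Illustrative sketch only; names are hypothetical and not from the disclosure.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence."""
    return dna.translate(_COMPLEMENT)[::-1]


def antisense_insert(cdna_fragment, min_len=30, preferred_len=100):
    """Orient a cDNA fragment in reverse relative to the promoter.

    Returns (insert, is_preferred_length). The insert is the sequence as it
    would read 5'->3' downstream of the promoter in the vector.
    """
    if len(cdna_fragment) < min_len:
        raise ValueError("fragment is shorter than %d nucleotides" % min_len)
    insert = reverse_complement(cdna_fragment)
    return insert, len(insert) > preferred_len


def transcribe(insert):
    """RNA read from the promoter: same sequence as the insert, with U for T."""
    return insert.upper().replace("T", "U")
```

Under these assumptions, `transcribe(antisense_insert(fragment)[0])` gives an RNA that is the reverse complement of the mRNA region corresponding to `fragment`, which is the property the paragraph above describes.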

Suppression of endogenous transcription factor gene expression can also be 
achieved using a ribozyme. Ribozymes are RNA molecules that possess highly 
specific endoribonuclease activity. The production and use of ribozymes are 
disclosed in U.S. Patent No. 4,987,071 and U.S. Patent No. 5,543,508. Synthetic 
ribozyme sequences including antisense RNAs can be used to confer RNA cleaving 
activity on the antisense RNA, such that endogenous mRNA molecules that hybridize 
to the antisense RNA are cleaved, which in turn leads to an enhanced antisense 
inhibition of endogenous gene expression. 

Suppression of endogenous transcription factor gene expression can also be
achieved using RNA interference, or RNAi. RNAi is a post-transcriptional, targeted
gene-silencing technique that uses double-stranded RNA (dsRNA) to incite
degradation of messenger RNA (mRNA) containing the same sequence as the dsRNA
(Constans (2002) The Scientist 16:36). Silencing by small interfering RNAs, or siRNAs,
proceeds in at least two steps: first, an endogenous ribonuclease cleaves longer dsRNA
into shorter, 21-23 nucleotide-long RNAs; the siRNA segments then mediate the
degradation of the target mRNA (Zamore (2001) Nature Struct. Biol. 8:746-50).
RNAi has been used for gene function determination in a manner similar to antisense
oligonucleotides (Constans (2002) The Scientist 16:36). Expression vectors that
continually express siRNAs in transiently and stably transfected cells have been
engineered to express small hairpin RNAs (shRNAs), which are processed in vivo into
siRNA-like molecules capable of carrying out gene-specific silencing (Brummelkamp et al.
(2002) Science 296:550-553, and Paddison et al. (2002) Genes & Dev. 16:948-958).
Post-transcriptional gene silencing by double-stranded RNA is discussed in further
detail by Hammond et al. (2001) Nature Rev Gen 2:110-119, Fire et al. (1998) Nature
391:806-811 and Timmons and Fire (1998) Nature 395:854.
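
As a rough illustration of the two steps described above (cleavage of a longer dsRNA into 21-23 nucleotide pieces, followed by sequence-specific targeting of mRNA), the following Python sketch may help. It is a deliberate simplification under stated assumptions: only the sense strand of the dsRNA is modeled, fragments are cut at fixed intervals, overhangs and strand selection are ignored, and the function names are hypothetical.

```python
# Simplified, hypothetical model of the two RNAi steps; not a biological simulator.
from typing import List


def dice(sense_strand: str, size: int = 21) -> List[str]:
    """Cut the sense strand of a long dsRNA into consecutive fragments.

    `size` must fall in the 21-23 nucleotide range given in the text. A
    trailing remainder shorter than `size` is discarded.
    """
    if not 21 <= size <= 23:
        raise ValueError("siRNA size must be 21, 22 or 23 nucleotides")
    last_start = len(sense_strand) - size
    return [sense_strand[i:i + size] for i in range(0, last_start + 1, size)]


def is_targeted(mrna: str, sirnas: List[str]) -> bool:
    """Return True if any siRNA-derived sequence occurs in the mRNA."""
    return any(fragment in mrna for fragment in sirnas)
```

In this sketch an mRNA is treated as a silencing target only when it contains a stretch identical to one of the fragments, mirroring the statement that the degraded mRNA contains the same sequence as the dsRNA.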

Vectors in which RNA encoded by a transcription factor or transcription factor 
homologue cDNA is over-expressed can also be used to obtain co-suppression of a 
corresponding endogenous gene, e.g., in the manner described in U.S. Patent No.
5,231,020 to Jorgensen. Such co-suppression (also termed sense suppression) does
not require that the entire transcription factor cDNA be introduced into the plant cells, 
nor does it require that the introduced sequence be exactly identical to the endogenous 
transcription factor gene of interest. However, as with antisense suppression, the 
suppressive efficiency will be enhanced as specificity of hybridization is increased, 
e.g., as the introduced sequence is lengthened, and/or as the sequence similarity 
between the introduced sequence and the endogenous transcription factor gene is 
increased. 

Vectors expressing an untranslatable form of the transcription factor mRNA
(e.g., sequences comprising one or more stop codons or nonsense mutations) can also be
used to suppress expression of an endogenous transcription factor, thereby reducing or
eliminating its activity and modifying one or more traits. Methods for producing
such constructs are described in U.S. Patent No. 5,583,021. Preferably, such
constructs are made by introducing a premature stop codon into the transcription
factor gene. Alternatively, a plant trait can be modified by gene silencing using
double-strand RNA (Sharp (1999) Genes and Development 13:139-141). Another
method for abolishing the expression of a gene is by insertion mutagenesis using the
T-DNA of Agrobacterium tumefaciens. After generating the insertion mutants, the
mutants can be screened to identify those containing the insertion in a transcription
factor or transcription factor homologue gene. Plants containing a single transgene
insertion event at the desired gene can be crossed to generate homozygous plants for
the mutation. Such methods are well known to those of skill in the art. (See for
example Koncz et al. (1992) Methods in Arabidopsis Research, World Scientific.)
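
The premature stop codon approach mentioned above can be sketched in Python as follows. This is an illustrative assumption, not the method of the cited patent: the coding sequence is a plain in-frame DNA string, the function names are hypothetical, and the choice of TAA as the introduced stop codon is arbitrary.

```python
# Hypothetical helper functions for illustration only.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_stop_index(cds):
    """Return the index of the first in-frame stop codon, or None if absent."""
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3].upper() in STOP_CODONS:
            return i // 3
    return None


def introduce_premature_stop(cds, codon_index, stop="TAA"):
    """Replace the codon at `codon_index` (0-based) with a stop codon."""
    if stop not in STOP_CODONS:
        raise ValueError("not a stop codon: %s" % stop)
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    if not 0 < codon_index < len(cds) // 3:
        raise ValueError("codon_index must lie after the start codon and within the sequence")
    start = codon_index * 3
    return cds[:start] + stop + cds[start + 3:]
```

Comparing `first_stop_index` before and after the change shows how far translation of the modified transcript could proceed; an earlier index corresponds to a more severely truncated product.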

Alternatively, a plant phenotype can be altered by eliminating an endogenous 
gene, such as a transcription factor or transcription factor homologue, e.g., by 
homologous recombination (Kempin et al. (1997) Nature 389:802-803). 

A plant trait can also be modified by using the Cre-lox system (for example, as 
described in US Pat. No. 5,658,772). A plant genome can be modified to include 
first and second lox sites that are then contacted with a Cre recombinase. If the lox 
sites are in the same orientation, the intervening DNA sequence between the two sites
is excised. If the lox sites are in the opposite orientation, the intervening sequence is
inverted.
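
The two Cre-lox outcomes described above (excision for lox sites in the same orientation, inversion for sites in opposite orientation) can be modeled with a short Python sketch. The representation is an assumption made for illustration: the genome is a list of named segments, and the hypothetical tokens "lox>" and "lox<" stand for lox sites in forward and reverse orientation. Inversion is modeled only as reversal of segment order; a sequence-level model would also reverse-complement each segment.

```python
# Toy model of Cre-mediated recombination between two lox sites; names are hypothetical.
LOX_TOKENS = ("lox>", "lox<")


def cre_recombine(genome):
    """Return the segment list after recombination between exactly two lox sites."""
    sites = [i for i, segment in enumerate(genome) if segment in LOX_TOKENS]
    if len(sites) != 2:
        raise ValueError("expected exactly two lox sites")
    first, second = sites
    if genome[first] == genome[second]:
        # Same orientation: the intervening DNA is excised, leaving one lox site.
        return genome[:first + 1] + genome[second + 1:]
    # Opposite orientation: the intervening DNA is inverted in place.
    inverted = genome[first + 1:second][::-1]
    return genome[:first + 1] + inverted + genome[second:]
```

For example, under this model `["P", "lox>", "geneA", "geneB", "lox>", "T"]` becomes `["P", "lox>", "T"]`, while the same list with the second site written as `"lox<"` keeps both genes but in reversed order.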

The polynucleotides and polypeptides of this invention can also be expressed 
in a plant in the absence of an expression cassette by manipulating the activity or 
expression level of the endogenous gene by other means, for example, by 
ectopically expressing a gene by T-DNA activation tagging (Ichikawa et al. (1997) 
Nature 390: 698-701; Kakimoto et al. (1996) Science 274: 982-985). This method 
entails transforming a plant with a gene tag containing multiple transcriptional 
enhancers and once the tag has inserted into the genome, expression of a flanking 
gene coding sequence becomes deregulated. In another example, the transcriptional 
machinery in a plant can be modified so as to increase transcription levels of a 
polynucleotide of the invention (See, e.g., PCT Publications WO 96/06166 and WO 
98/53057 which describe the modification of the DNA-binding specificity of zinc 
finger proteins by changing particular amino acids in the DNA-binding motif). 

The transgenic plant can also include the machinery necessary for expressing 
or altering the activity of a polypeptide encoded by an endogenous gene, for example 
by altering the phosphorylation state of the polypeptide to maintain it in an activated 
state. 

Transgenic plants (or plant cells, or plant explants, or plant tissues) 
incorporating the polynucleotides of the invention and/or expressing the polypeptides 
of the invention can be produced by a variety of well established techniques as 
described above. Following construction of a vector, most typically an expression 
cassette, including a polynucleotide, e.g., encoding a transcription factor or 
transcription factor homologue, of the invention, standard techniques can be used to 
introduce the polynucleotide into a plant, a plant cell, a plant explant or a plant tissue 
of interest. Optionally, the plant cell, explant or tissue can be regenerated to produce 
a transgenic plant. 

The plant can be any higher plant, including gymnosperms, 
monocotyledonous and dicotyledonous plants. Suitable protocols are available for 
Leguminosae (alfalfa, soybean, clover, etc.), Umbelliferae (carrot, celery, parsnip), 
Cruciferae (cabbage, radish, rapeseed, broccoli, etc.), Cucurbitaceae (melons and 
cucumber), Gramineae (wheat, corn, rice, barley, millet, etc.), Solanaceae (potato, 
tomato, tobacco, peppers, etc.), and various other crops. See protocols described in 
Ammirato et al. (1984) Handbook of Plant Cell Culture - Crop Species, Macmillan 
Publ. Co.; Shimamoto et al. (1989) Nature 338:274-276; Fromm et al. (1990) 
Bio/Technology 8:833-839; and Vasil et al. (1990) Bio/Technology 8:429-434. 

Transformation and regeneration of both monocotyledonous and 
dicotyledonous plant cells is now routine, and the selection of the most appropriate 
transformation technique will be determined by the practitioner. The choice of 
method will vary with the type of plant to be transformed; those skilled in the art will 
recognize the suitability of particular methods for given plant types. Suitable methods 
can include, but are not limited to: electroporation of plant protoplasts; liposome- 
mediated transformation; polyethylene glycol (PEG) mediated transformation; 
transformation using viruses; micro-injection of plant cells; micro-projectile 
bombardment of plant cells; vacuum infiltration; and Agrobacterium tumefaciens 
mediated transformation. Transformation means introducing a nucleotide sequence 
into a plant in a manner to cause stable or transient expression of the sequence. 

Successful examples of the modification of plant characteristics by 
transformation with cloned sequences which serve to illustrate the current knowledge 
in this field of technology, and which are herein incorporated by reference, include: 
U.S. Patent Nos. 5,571,706; 5,677,175; 5,510,471; 5,750,386; 5,597,945; 5,589,615; 
5,750,871; 5,268,526; 5,780,708; 5,538,880; 5,773,269; 5,736,369 and 5,610,042. 

Following transformation, plants are preferably selected using a dominant 
selectable marker incorporated into the transformation vector. Typically, such a 
marker will confer antibiotic or herbicide resistance on the transformed plants, and 
selection of transformants can be accomplished by exposing the plants to appropriate 
concentrations of the antibiotic or herbicide. 

After transformed plants are selected and grown to maturity, those plants 
showing a modified trait are identified. The modified trait can be any of those traits 
described above. Additionally, to confirm that the modified trait is due to changes in 
expression levels or activity of the polypeptide or polynucleotide of the invention, those 
levels can be determined by analyzing mRNA expression using Northern blots, RT-PCR or 
microarrays, or protein expression using immunoblots or Western blots or gel shift 
assays. 

XVII. Integrated Systems - Sequence Identity 

Additionally, the present invention may be an integrated system, computer or 
computer readable medium that comprises an instruction set for determining the 
identity of one or more sequences in a database. In addition, the instruction set can be 
used to generate or identify sequences that meet any specified criteria. Furthermore, 
the instruction set may be used to associate or link certain functional benefits, such 
as improved characteristics, with one or more identified sequences. 

For example, the instruction set can include a sequence comparison or 
other alignment program, e.g., an available program such as the 
Wisconsin Package Version 10.0, including BLAST, FASTA, PILEUP, 
FINDPATTERNS or the like (GCG, Madison, WI). Public sequence databases such 
as GenBank, EMBL, Swiss-Prot and PIR or private sequence databases such as 
PHYTOSEQ sequence database (Incyte Genomics, Palo Alto, CA) can be searched. 

Alignment of sequences for comparison can be conducted by the local 
homology algorithm of Smith and Waterman (1981) Adv. Appl. Math. 2:482, by the 
homology alignment algorithm of Needleman and Wunsch (1970) J. Mol. Biol. 
48:443-453, by the search for similarity method of Pearson and Lipman (1988) Proc. 
Natl. Acad. Sci. U.S.A 85:2444-2448, or by computerized implementations of these 
algorithms. After alignment, sequence comparisons between two (or more) 
polynucleotides or polypeptides are typically performed by comparing 
the sequences over a comparison window to identify and compare local regions 
of sequence similarity. The comparison window can be a segment of at least about 20 
contiguous positions, usually about 50 to about 200, more usually about 100 to about 
150 contiguous positions. A description of the method is provided in Ausubel et al., 
supra. 
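
To make the local homology approach concrete, the following is a minimal Python sketch of a Smith-Waterman style local alignment score. The scoring values (match 2, mismatch -1, linear gap penalty -2) and the function name are assumptions chosen for illustration; they are not parameters taken from this disclosure or from the cited programs.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Best local alignment score between a and b (linear gap penalty)."""
    prev = [0] * (len(b) + 1)  # previous row of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, so scores never drop below 0.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best
```

Only the score is returned; recovering the aligned region itself would require keeping the full matrix and tracing back from the best cell.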

A variety of methods for determining sequence relationships can be used, 
including manual alignment and computer assisted sequence alignment and analysis. 
This latter approach is a preferred approach in the present invention, due to the 
increased throughput afforded by computer assisted methods. As noted above, a 
variety of computer programs for performing sequence alignment are available, or can 
be produced by one of skill 



One example algorithm that is suitable for determining percent sequence 
identity and sequence similarity is the BLAST algorithm, which is described in 
Altschul et al. J. Mol. Biol 215:403-410 (1990). Software for performing BLAST 
analyses is publicly available, e.g., through the National Center for Biotechnology 
Information (see internet website at ncbi.nlm.nih.gov). This algorithm involves first 
identifying high scoring sequence pairs (HSPs) by identifying short words of length 
W in the query sequence, which either match or satisfy some positive-valued 
threshold score T when aligned with a word of the same length in a database 
sequence. T is referred to as the neighborhood word score threshold (Altschul et al., 
supra). These initial neighborhood word hits act as seeds for initiating searches to 
find longer HSPs containing them. The word hits are then extended in both directions 
along each sequence for as far as the cumulative alignment score can be increased. 
Cumulative scores are calculated using, for nucleotide sequences, the parameters M 
(reward score for a pair of matching residues; always > 0) and N (penalty score for 
mismatching residues; always < 0). For amino acid sequences, a scoring matrix is 
used to calculate the cumulative score. Extension of the word hits in each direction 
is halted when: the cumulative alignment score falls off by the quantity X from its 
maximum achieved value; the cumulative score goes to zero or below, due to the 
accumulation of one or more negative-scoring residue alignments; or the end of either 
sequence is reached. The BLAST algorithm parameters W, T, and X determine the 
sensitivity and speed of the alignment. The BLASTN program (for nucleotide 
sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10, a cutoff 
of 100, M=5, N=-4, and a comparison of both strands. For amino acid sequences, the 
BLASTP program uses as defaults a wordlength (W) of 3, an expectation (E) of 10, 
and the BLOSUM62 scoring matrix (see Henikoff & Henikoff (1992) Proc. Natl. 
Acad. Sci. USA 89:10915). Unless otherwise indicated, "sequence identity" here 
refers to the % sequence identity generated from a tblastx using the NCBI version of 
the algorithm at the default settings using gapped alignments with the filter "off" (see, 
for example, internet website at ncbi.nlm.nih.gov). 
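
The extension step described above can be sketched in Python as follows. This is a simplified, ungapped, one-directional illustration, not the NCBI implementation: the function name, the default X value of 20, and the assumption that the seed word matches exactly are all hypothetical, while the reward and penalty defaults mirror the M=5 and N=-4 values quoted in the text.

```python
def extend_right(query: str, subject: str, q_start: int, s_start: int,
                 word_len: int, reward: int = 5, penalty: int = -4,
                 x_drop: int = 20) -> tuple[int, int]:
    """Extend a seed word hit to the right without gaps.

    Returns (best_score, best_length). Assumes the seed itself is an
    exact match of `word_len` residues starting at q_start / s_start.
    """
    score = reward * word_len
    best, best_len = score, word_len
    i = word_len
    # Stop condition 3: the end of either sequence is reached.
    while q_start + i < len(query) and s_start + i < len(subject):
        same = query[q_start + i] == subject[s_start + i]
        score += reward if same else penalty
        i += 1
        if score > best:
            best, best_len = score, i
        # Stop condition 1: score fell off by X from its maximum.
        # Stop condition 2: cumulative score went to zero or below.
        if best - score >= x_drop or score <= 0:
            break
    return best, best_len
```

A full implementation would extend in both directions from the seed and would use a scoring matrix instead of fixed reward and penalty values for amino acid sequences.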

In addition to calculating percent sequence identity, the BLAST algorithm also 
performs a statistical analysis of the similarity between two sequences (see, e.g., 
Karlin & Altschul (1993) Proc. Natl. Acad. Sci. USA 90:5873-5877). One measure 
of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), 
which provides an indication of the probability by which a match between two 
nucleotide or amino acid sequences would occur by chance. For example, a nucleic 
acid is considered similar to a reference sequence (and, therefore, in this context, 
homologous) if the smallest sum probability in a comparison of the test nucleic acid to 
the reference nucleic acid is less than about 0.1, or less than about 0.01, or even 
less than about 0.001. An additional example of a useful sequence alignment 
algorithm is PILEUP. PILEUP creates a multiple sequence alignment from a group of 
related sequences using progressive, pairwise alignments. The program can align, e.g., 
up to 300 sequences of a maximum length of 5,000 letters. 

The integrated system or computer typically includes a user input interface 
allowing a user to selectively view one or more sequence records corresponding to the 
one or more character strings, as well as an instruction set which aligns the one or 
more character strings with each other or with an additional character string to 
identify one or more regions of sequence similarity. The system may include a link of 
one or more character strings with a particular phenotype or gene function. Typically, 
the system includes a user readable output element that displays an alignment 
produced by the alignment instruction set. 

The methods of this invention can be implemented in a localized or distributed 
computing environment. In a distributed environment, the methods may be implemented 
on a single computer comprising multiple processors or on a multiplicity of 
computers. The computers can be linked, e.g. through a common bus, but more 
preferably the computer(s) are nodes on a network. The network can be a generalized 
or a dedicated local or wide-area network and, in certain preferred embodiments, the 
computers may be components of an intra-net or an internet. 

Thus, the invention provides methods for identifying a sequence similar or 
homologous to one or more polynucleotides as noted herein, or one or more target 
polypeptides encoded by the polynucleotides, or otherwise noted herein and may 
include linking or associating a given plant phenotype or gene function with a 
sequence. In the methods, a sequence database is provided (locally or across an internet 
or intranet) and a query is made against the sequence database using the relevant 
sequences herein and associated plant phenotypes or gene functions. 

Any sequence herein can be entered into the database, before or after querying 
the database. This provides for both expansion of the database and, if done before the 
querying step, for insertion of control sequences into the database. The control 
sequences can be detected by the query to ensure the general integrity of both the 
database and the query. As noted, the query can be performed using a web browser 
based interface. For example, the database can be a centralized public database such 
as those noted herein, and the querying can be done from a remote terminal or 
computer across an internet or intranet. 

XVIII. Examples 

The following examples are intended to illustrate but not limit the present 
invention. The complete descriptions of the traits associated with each polynucleotide 
of the invention are fully disclosed in Table 4 and Table 6. 

Example I: Full Length Gene Identification and Cloning 

Putative transcription factor sequences (genomic or ESTs) related to known 
transcription factors were identified in the Arabidopsis thaliana GenBank database 
using the tblastn sequence analysis program using default parameters and a P-value 
cutoff threshold of -4 or -5 or lower, depending on the length of the query sequence. 
Putative transcription factor sequence hits were then screened to identify those 
containing particular sequence strings. If the sequence hits contained such sequence 
strings, the sequences were confirmed as transcription factors. 

Alternatively, Arabidopsis thaliana cDNA libraries derived from different 
tissues or treatments, or genomic libraries were screened to identify novel members of 
a transcription family using a low stringency hybridization approach. Probes were 
synthesized using gene specific primers in a standard PCR reaction (annealing 
temperature 60° C) and labeled with 32P dCTP using the High Prime DNA Labeling 
Kit (Boehringer Mannheim). Purified radiolabeled probes were added to filters 
immersed in Church hybridization medium (0.5 M NaPO4 pH 7.0, 7% SDS, 1% w/v 
bovine serum albumin) and hybridized overnight at 60° C with shaking. Filters were 
washed two times for 45 to 60 minutes with 1x SSC, 1% SDS at 60° C. 

To identify additional sequence 5' or 3' of a partial cDNA sequence in a cDNA 
library, 5' and 3' rapid amplification of cDNA ends (RACE) was performed using the 
U.C. Marathon cDNA amplification kit (Clontech, Palo Alto, CA). Generally, the 
method entailed first isolating poly(A) mRNA, performing first and second strand 
cDNA synthesis to generate double stranded cDNA, blunting cDNA ends, followed 
by ligation of the U.C. Marathon Adaptor to the cDNA to form a library of adaptor- 
ligated ds cDNA. 

Gene-specific primers were designed to be used along with adaptor specific 
primers for both 5' and 3' RACE reactions. Nested primers, rather than single 
primers, were used to increase PCR specificity. Using 5' and 3' RACE reactions, 5' 
and 3' RACE fragments were obtained, sequenced and cloned. The process was 
repeated until the 5' and 3' ends of the full-length gene were identified. Then the 
full-length cDNA was generated by end-to-end PCR using primers specific to the 5' 
and 3' ends of the gene. 

Example II: Construction of Expression Vectors 

The sequence was amplified from a genomic or cDNA library using primers 
specific to sequences upstream and downstream of the coding region. The expression 
vector was pMEN20 or pMEN65, which are both derived from pMON316 (Sanders et 
al. (1987) Nucleic Acids Research 15:1543-1558) and contain the CaMV 35S 
promoter to express transgenes. To clone the sequence into the vector, both pMEN20 
and the amplified DNA fragment were digested separately with SalI and NotI 
restriction enzymes at 37° C for 2 hours. The digestion products were subject to 
electrophoresis in a 0.8% agarose gel and visualized by ethidium bromide staining. 
The DNA fragments containing the sequence and the linearized plasmid were excised 
and purified by using a Qiaquick gel extraction kit (Qiagen, Valencia CA). The 
fragments of interest were ligated at a ratio of 3:1 (vector to insert). Ligation 
reactions using T4 DNA ligase (New England Biolabs, Beverly MA) were carried out 
at 16° C for 16 hours. The ligated DNAs were transformed into competent cells of the 
E. coli strain DH5alpha by using the heat shock method. The transformations were 
plated on LB plates containing 50 mg/l kanamycin (Sigma, St. Louis, MO). 
Individual colonies were grown overnight in five milliliters of LB broth containing 50 
mg/l kanamycin at 37° C. Plasmid DNA was purified by using Qiaquick Mini Prep 
kits (Qiagen). 
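
As a worked illustration of setting up such a ligation, the Python sketch below converts a vector-to-insert ratio into an insert mass. It assumes the 3:1 ratio is a molar ratio, which the text does not state, and the function name and the fragment sizes in the usage note are hypothetical.

```python
def insert_mass_ng(vector_ng: float, vector_bp: int, insert_bp: int,
                   vector_parts: float = 3, insert_parts: float = 1) -> float:
    """Mass of insert (ng) for a given vector:insert molar ratio.

    Moles of double-stranded DNA scale as mass / length, so the insert
    mass is the vector mass scaled by the length ratio and the molar ratio.
    """
    return vector_ng * (insert_bp / vector_bp) * (insert_parts / vector_parts)
```

For instance, with 100 ng of a hypothetical 10,000 bp vector and a 1,000 bp insert, `insert_mass_ng(100, 10000, 1000)` gives roughly 3.3 ng of insert for a 3:1 vector-to-insert molar ratio.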

Example III: Transformation of Agrobacterium with the Expression Vector 

After the plasmid vector containing the gene was constructed, the vector was 
used to transform Agrobacterium tumefaciens cells expressing the gene products. The 
stock of Agrobacterium tumefaciens cells for transformation was made as described 
by Nagel et al. (1990) FEMS Microbiol Letts. 67: 325-328. Agrobacterium strain 
ABI was grown in 250 ml LB medium (Sigma) overnight at 28° C with shaking until 
an absorbance (A600) of 0.5 - 1.0 was reached. Cells were harvested by centrifugation 
at 4,000 x g for 15 min at 4° C. Cells were then resuspended in 250 µl chilled buffer 
(1 mM HEPES, pH adjusted to 7.0 with KOH). Cells were centrifuged again as 
described above and resuspended in 125 µl chilled buffer. Cells were then 
centrifuged and resuspended two more times in the same HEPES buffer as described 
above at a volume of 100 µl and 750 µl, respectively. Resuspended cells were then 
distributed into 40 µl aliquots, quickly frozen in liquid nitrogen, and stored at -80° C. 

Agrobacterium cells were transformed with plasmids prepared as described 
above following the protocol described by Nagel et al. For each DNA construct to be 
transformed, 50-100 ng DNA (generally resuspended in 10 mM Tris-HCl, 1 mM 
EDTA, pH 8.0) was mixed with 40 µl of Agrobacterium cells. The DNA/cell mixture 
was then transferred to a chilled cuvette with a 2 mm electrode gap and subject to a 2.5 
kV charge dissipated at 25 µF and 200 µF using a Gene Pulser II apparatus (Bio-Rad, 
Hercules, CA). After electroporation, cells were immediately resuspended in 1.0 ml 
LB and allowed to recover without antibiotic selection for 2 - 4 hours at 28° C in a 
shaking incubator. After recovery, cells were plated onto selective medium of LB 
broth containing 100 µg/ml spectinomycin (Sigma) and incubated for 24-48 hours at 
28° C. Single colonies were then picked and inoculated in fresh medium. The 
presence of the plasmid construct was verified by PCR amplification and sequence 
analysis. 

Example IV: Transformation of Arabidopsis Plants with Agrobacterium 
tumefaciens with Expression Vector 

After transformation of Agrobacterium tumefaciens with plasmid vectors 
containing the gene, single Agrobacterium colonies were identified, propagated, and 
used to transform Arabidopsis plants. Briefly, 500 ml cultures of LB medium 
containing 50 mg/l kanamycin were inoculated with the colonies and grown at 28° C 
with shaking for 2 days until an optical absorbance at 600 nm wavelength over 1 cm 
(A600) of > 2.0 was reached. Cells were then harvested by centrifugation at 4,000 x g for 
10 min, and resuspended in infiltration medium (1/2 X Murashige and Skoog salts 
(Sigma), 1 X Gamborg's B-5 vitamins (Sigma), 5.0% (w/v) sucrose (Sigma), 0.044 
µM benzylamino purine (Sigma), 200 µl/l Silwet L-77 (Lehle Seeds)) until an A600 of 
0.8 was reached. 

Prior to transformation, Arabidopsis thaliana seeds (ecotype Columbia) were 
sown at a density of ~10 plants per 4" pot onto Pro-Mix BX potting medium 
(Hummert International) covered with fiberglass mesh (18 mm X 16 mm). Plants 
were grown under continuous illumination (50-75 µE/m2/sec) at 22-23° C with 65- 
70% relative humidity. After about 4 weeks, primary inflorescence stems (bolts) were 
cut off to encourage growth of multiple secondary bolts. After flowering of the 
mature secondary bolts, plants were prepared for transformation by removal of all 
siliques and opened flowers. 

The pots were then immersed upside down in the mixture of Agrobacterium 
infiltration medium as described above for 30 sec, and placed on their sides to allow 
draining into a 1' x 2' flat surface covered with plastic wrap. After 24 h, the plastic 
wrap was removed and pots were turned upright. The immersion procedure was 
repeated one week later, for a total of two immersions per pot. Seeds were then 
collected from each transformation pot and analyzed following the protocol described 
below. 

Example V: Identification of Arabidopsis Primary Transformants 

Seeds collected from the transformation pots were sterilized essentially as 
follows. Seeds were dispersed into a solution containing 0.1% (v/v) Triton X-100 
(Sigma) and sterile H2O and washed by shaking the suspension for 20 min. The wash 
solution was then drained and replaced with fresh wash solution to wash the seeds for 
20 min with shaking. After removal of the second wash solution, a solution 
containing 0.1% (v/v) Triton X-100 and 70% ethanol (Equistar) was added to the 
seeds and the suspension was shaken for 5 min. After removal of the 
ethanol/detergent solution, a solution containing 0.1% (v/v) Triton X-100 and 30% 
(v/v) bleach (Clorox) was added to the seeds, and the suspension was shaken for 10 
min. After removal of the bleach/detergent solution, seeds were then washed five 
times in sterile distilled H2O. The seeds were stored in the last wash water at 4° C for 
2 days in the dark before being plated onto antibiotic selection medium (1 X 
Murashige and Skoog salts (pH adjusted to 5.7 with 1M KOH), 1 X Gamborg's B-5 
vitamins, 0.9% phytagar (Life Technologies), and 50 mg/l kanamycin). Seeds were 
germinated under continuous illumination (50-75 µE/m2/sec) at 22-23° C. After 7-10 
days of growth under these conditions, kanamycin resistant primary transformants (T1 
generation) were visible and obtained. These seedlings were transferred first to fresh 
selection plates where the seedlings continued to grow for 3-5 more days, and then to 
soil (Pro-Mix BX potting medium). 

Primary transformants were crossed and progeny seeds (T2) collected; 
kanamycin resistant seedlings were selected and analyzed. The expression levels of 
the recombinant polynucleotides in the transformants vary from about a 5% 
expression level increase to at least a 100% expression level increase. Similar 
observations are made with respect to polypeptide level expression. 

Example VI: Identification of Arabidopsis Plants with Transcription Factor Gene 
Knockouts 

The screening of insertion mutagenized Arabidopsis collections for null 
mutants in a known target gene was essentially as described in Krysan et al. (1999) 
Plant Cell 11:2283-2290. Briefly, gene-specific primers, nested by 5-250 base pairs 
to each other, were designed from the 5' and 3' regions of a known target gene. 
Similarly, nested sets of primers were also created specific to each of the T-DNA or 
transposon ends (the "right" and "left" borders). All possible combinations of gene 
specific and T-DNA/transposon primers were used to detect by PCR an insertion 
event within or close to the target gene. The amplified DNA fragments were then 
sequenced, which allows the precise determination of the T-DNA/transposon insertion 
point relative to the target gene. Insertion events within the coding or intervening 
sequence of the genes were deconvolved from a pool comprising a plurality of 
insertion events to a single unique mutant plant for functional characterization. The 
method is described in more detail in Yu and Adam, US Application Serial No. 
09/177,733 filed October 23, 1998. 
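
The pairing of gene-specific primers with T-DNA or transposon border primers can be enumerated mechanically. The following Python sketch shows one way to list every combination for a single round of PCR; the primer labels are hypothetical placeholders and do not correspond to actual primer sequences from this disclosure.

```python
from itertools import product


def primer_pairs(gene_primers: list[str],
                 insertion_primers: list[str]) -> list[tuple[str, str]]:
    """All (gene-specific, insertion-specific) primer combinations."""
    return list(product(gene_primers, insertion_primers))


# Hypothetical labels: one primer from each end of the target gene and
# one from each border of the inserted element.
outer_round = primer_pairs(["gene_5p_outer", "gene_3p_outer"],
                           ["left_border_outer", "right_border_outer"])
```

A second, nested round would call the same function with the inner primer sets, giving four reactions per round in this example.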

Example VII: Identification of Modified Phenotypes in Overexpression or Gene 
Knockout Plants 

Experiments were performed to identify those transformants or knockouts that 
exhibited modified biochemical characteristics. Among the biochemicals that were 
assayed were insoluble sugars, such as arabinose, fucose, galactose, mannose, 
rhamnose or xylose or the like; prenyl lipids, such as lutein, beta-carotene, 
xanthophyll-1, xanthophyll-2, chlorophylls A or B, or alpha-, delta- or gamma- 
tocopherol or the like; fatty acids, such as 16:0 (palmitic acid), 16:1 (palmitoleic 
acid), 18:0 (stearic acid), 18:1 (oleic acid), 18:2 (linoleic acid), 20:0, 18:3 (linolenic 
acid), 20:1 (eicosenoic acid), 20:2, 22:1 (erucic acid) or the like; waxes, such as by 
altering the levels of C29, C31, or C33 alkanes; sterols, such as brassicasterol, 
campesterol, stigmasterol, sitosterol or stigmastanol or the like; glucosinolates; and 
protein or oil levels. 

Fatty acids were measured using two methods depending on whether the tissue 
was from leaves or seeds. For leaves, lipids were extracted and esterified with hot 
methanolic H2SO4 and partitioned into hexane from methanolic brine. For seed fatty 
acids, seeds were pulverized and extracted in methanol:heptane:toluene:2,2- 
dimethoxypropane:H2SO4 (39:34:20:5:2) for 90 minutes at 80°C. After cooling to 
room temperature the upper phase, containing the seed fatty acid esters, was subjected 
to GC analysis. Fatty acid esters from both seed and leaf tissues were analyzed with a 
Supelco SP-2330 column. 

Glucosinolates were purified from seeds or leaves by first heating the tissue at 
95°C for 10 minutes. Preheated ethanol:water (50:50) was added and, after heating at 
95°C for a further 10 minutes, the extraction solvent was applied to a DEAE Sephadex 
column which had been previously equilibrated with 0.5 M pyridine acetate. 
Desulfoglucosinolates were eluted with 300 µl water and analyzed by reverse phase 
HPLC monitoring at 226 nm. 

For wax alkanes, samples were extracted using an identical method as fatty 
acids and extracts were analyzed on a HP 5890 GC coupled with a 5973 MSD. 
Samples were chromatographically isolated on a J&W DB35 mass spectrometer 
(J&W Scientific). 

To measure prenyl lipids levels, seeds or leaves were pulverized with 1 to 2% 
pyrogallol as an antioxidant. For seeds, extracted samples were filtered and a portion 
removed for tocopherol and carotenoid/chlorophyll analysis by HPLC. The 
remaining material was saponified for sterol determination. For leaves, an aliquot 
was removed and diluted with methanol and chlorophyll A, chlorophyll B, and total 
carotenoids measured by spectrophotometry by determining optical absorbance at 
665.2 nm, 652.5 nm, and 470 nm. An aliquot was removed for tocopherol and 
carotenoid/chlorophyll composition by HPLC using a Waters µBondapak C18 column 
(4.6 mm x 150 mm). The remaining methanolic solution was saponified with 10% 
KOH at 80°C for one hour. The samples were cooled and diluted with a mixture of 
methanol and water. A solution of 2% methylene chloride in hexane was mixed in 
and the samples were centrifuged. The aqueous methanol phase was again 
re-extracted with 2% methylene chloride in hexane and, after centrifugation, the two upper 
phases were combined and evaporated. 2% methylene chloride in hexane was added 
to the tubes and the samples were then extracted with one ml of water. The upper 
phase was removed, dried, and resuspended in 400 µl of 2% methylene chloride in 
hexane and analyzed by gas chromatography using a 50 m DB-5ms (0.25 mm ID, 
0.25 µm phase, J&W Scientific). 

Insoluble sugar levels were measured by the method essentially described by 
Reiter et al., (1997) Plant Journal 12:335-345. This method analyzes the neutral sugar 
composition of cell wall polymers found in Arabidopsis leaves. Soluble sugars were 
separated from sugar polymers by extracting leaves with hot 70% ethanol. The 
remaining residue containing the insoluble polysaccharides was then acid hydrolyzed 
with allose added as an internal standard. Sugar monomers generated by the 
hydrolysis were then reduced to the corresponding alditols by treatment with NaBH4, 
then were acetylated to generate the volatile alditol acetates which were then analyzed 
by GC-FID. Identity of the peaks was determined by comparing the retention times 
of known sugars converted to the corresponding alditol acetates with the retention 
times of peaks from wild-type plant extracts. Alditol acetates were analyzed on a 
Supelco SP-2330 capillary column (30 m x 250 µm x 0.2 µm) using a temperature 
program beginning at 180° C for 2 minutes followed by an increase to 220° C in 4 
minutes. After holding at 220° C for 10 minutes, the oven temperature was increased to 
240° C in 2 minutes and held at this temperature for 10 minutes and brought back to 
room temperature. 
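
The oven program described above can be written out as a small table of segments, which makes the total run time easy to check. The Python sketch below assumes linear ramps between the stated temperatures (the text gives only start and end points) and omits the final return to room temperature; the names are illustrative.

```python
# (duration in minutes, start temperature in deg C, end temperature in deg C)
PROGRAM = [
    (2, 180, 180),    # initial hold
    (4, 180, 220),    # ramp, assumed linear
    (10, 220, 220),   # hold
    (2, 220, 240),    # ramp, assumed linear
    (10, 240, 240),   # final hold
]

TOTAL_MIN = sum(duration for duration, _, _ in PROGRAM)


def oven_temperature(t_min: float) -> float:
    """Oven temperature (deg C) at t_min minutes into the program."""
    if t_min < 0:
        raise ValueError("time must be non-negative")
    elapsed = 0.0
    for duration, start, end in PROGRAM:
        if t_min <= elapsed + duration:
            fraction = (t_min - elapsed) / duration
            return start + fraction * (end - start)
        elapsed += duration
    raise ValueError("time is past the end of the program")
```

Under these assumptions the program runs for 28 minutes before cooling begins.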

To identify plants with alterations in total seed oil or protein content, 150 mg 
of seeds from T2 progeny plants were subjected to analysis by Near Infrared 
Reflectance Spectroscopy (NIRS) using a Foss NirSystems Model 6500 with a 
spinning cup transport system. NIRS is a non-destructive analytical method used to 
determine seed oil and protein composition. Infrared is the region of the 
electromagnetic spectrum located after the visible region in the direction of longer 
wavelengths. 'Near infrared' owes its name to being the infrared region near to the 
visible region of the electromagnetic spectrum. For practical purposes, near infrared 
comprises wavelengths between 800 and 2500 nm. NIRS is applied to organic 
compounds rich in O-H bonds (such as moisture, carbohydrates, and fats), C-H bonds 
(such as organic compounds and petroleum derivatives), and N-H bonds (such as 
proteins and amino acids). The NIRS analytical instruments operate by statistically 
correlating NIRS signals at several wavelengths with the characteristic or property 
intended to be measured. All biological substances contain thousands of C-H, O-H, 
and N-H bonds. Therefore, the exposure to near infrared radiation of a biological 
sample, such as a seed, results in a complex spectrum which contains qualitative and 
quantitative information about the physical and chemical composition of that sample. 

The numerical value of a specific analyte in the sample, such as protein 
content or oil content, is mediated by a calibration approach known as chemometrics. 
Chemometrics applies statistical methods such as multiple linear regression (MLR), 
partial least squares (PLS), and principal component analysis (PCA) to the spectral 
data and correlates them with a physical property or other factor; that property or 
factor is directly determined rather than the analyte concentration itself. The method 
first provides "wet chemistry" data of the samples required to develop the calibration. 

Calibration for Arabidopsis seed oil composition was performed using 
accelerated solvent extraction using 1 g seed sample size and was validated against 
certified canola seed. A similar wet chemistry approach was performed for seed 
protein composition calibration. 
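
To illustrate the calibration idea, the Python sketch below fits a multiple linear regression of wet chemistry reference values on absorbances at a few wavelengths and then predicts a new sample. It is a minimal example under stated assumptions: the function names are hypothetical, the normal equations are solved directly in plain Python, and no real spectra or calibration data from this disclosure are used.

```python
def fit_mlr(spectra, reference):
    """Least-squares fit of reference = b0 + sum(b_k * absorbance_k).

    spectra: list of per-sample absorbance tuples (same length each).
    reference: wet chemistry values, one per sample.
    Returns [b0, b1, ..., bk]. Fails on singular (collinear) data.
    """
    rows = [[1.0] + list(s) for s in spectra]  # prepend intercept term
    n = len(rows[0])
    # Augmented normal equations: (X^T X | X^T y)
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
           + [sum(r[i] * y for r, y in zip(rows, reference))]
           for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]


def predict(coeffs, spectrum):
    """Predicted analyte value for one sample's absorbances."""
    return coeffs[0] + sum(b * x for b, x in zip(coeffs[1:], spectrum))
```

In practice a calibration would use many more samples and wavelengths, and methods such as PLS are generally preferred when the absorbances are strongly correlated.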

Data obtained from NIRS analysis was analyzed statistically using a nearest- 
neighbor (N-N) analysis. The N-N analysis allows removal of within-block spatial 
variability in a fairly flexible fashion which does not require prior knowledge of the 
pattern of variability in the chamber. Ideally, all hybrids are grown under identical 
experimental conditions within a block (rep). In reality, even in many block designs, 
significant within-block variability exists. Nearest-neighbor procedures are based on
the assumption that the environmental effect of a plot is closely related to that of its
neighbors. Nearest-neighbor methods use information from adjacent plots to adjust
for within-block heterogeneity and so provide more precise estimates of treatment
means and differences. If there is within-block heterogeneity on a spatial scale that is
larger than a single plot and smaller than the entire block, then yields from adjacent 
plots will be positively correlated. Information from neighboring plots can be used to 
reduce or remove the unwanted effect of the spatial heterogeneity, and hence improve 
the estimate of the treatment effect. Data from neighboring plots can also be used to 
reduce the influence of competition between adjacent plots. The Papadakis N-N 
analysis can be used with designs to remove within-block variability that would not 
be removed with the standard split plot analysis (Papadakis, 1973, Inst. d'Amelior. 
Plantes Thessaloniki (Greece) Bull. Scientif., No. 23; Papadakis, 1984, Proc. Acad. 
Athens, 59, 326-342). 
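
A minimal sketch of a Papadakis-style nearest-neighbor adjustment follows, under stated assumptions: plots lie in a single row, each plot's covariate is the mean residual of its immediate left and right neighbors, and a single pass is made (iterated variants exist). The function name and the example layout are hypothetical and are not taken from the applicants' analysis. Residuals are plot values minus their treatment means; the neighbor covariate enters a within-treatment regression, and treatment means are adjusted to a common covariate value.

```python
"""Illustrative one-pass Papadakis nearest-neighbor adjustment for a single row of plots."""
from collections import defaultdict
from statistics import mean
from typing import Dict, Hashable, List, Sequence, Tuple


def papadakis_adjusted_means(
    treatments: Sequence[Hashable], values: Sequence[float]
) -> Tuple[Dict[Hashable, float], float]:
    """Return (adjusted treatment means, covariate slope).

    treatments and values are given in field order, one entry per plot.
    """
    n = len(values)
    if n < 2 or n != len(treatments):
        raise ValueError("need at least two plots and matching lengths")

    # Step 1: residual of each plot from its own treatment mean.
    by_trt: Dict[Hashable, List[float]] = defaultdict(list)
    for t, y in zip(treatments, values):
        by_trt[t].append(y)
    trt_mean = {t: mean(ys) for t, ys in by_trt.items()}
    residuals = [y - trt_mean[t] for t, y in zip(treatments, values)]

    # Step 2: covariate = mean residual of adjacent plots (edge plots have one neighbor).
    covariate = []
    for i in range(n):
        neighbors = [residuals[j] for j in (i - 1, i + 1) if 0 <= j < n]
        covariate.append(mean(neighbors))

    # Step 3: within-treatment regression slope of plot value on covariate.
    cov_by_trt: Dict[Hashable, List[float]] = defaultdict(list)
    for t, x in zip(treatments, covariate):
        cov_by_trt[t].append(x)
    cov_mean = {t: mean(xs) for t, xs in cov_by_trt.items()}
    sxy = sum((x - cov_mean[t]) * r for t, x, r in zip(treatments, covariate, residuals))
    sxx = sum((x - cov_mean[t]) ** 2 for t, x in zip(treatments, covariate))
    slope = sxy / sxx if sxx > 0 else 0.0

    # Step 4: adjust each treatment mean to the overall mean covariate value.
    grand_cov = mean(covariate)
    adjusted = {t: trt_mean[t] - slope * (cov_mean[t] - grand_cov) for t in trt_mean}
    return adjusted, slope


if __name__ == "__main__":
    # Hypothetical row of six plots with two lines, A and B (made-up values).
    layout = ["A", "B", "B", "A", "B", "A"]
    observed = [10.0, 12.5, 13.0, 11.0, 12.0, 9.5]
    means, b = papadakis_adjusted_means(layout, observed)
    print("slope:", b)
    print("adjusted means:", means)
```

The adjustment only redistributes variation between treatments: the plot-weighted average of the adjusted means equals the overall mean of the observations. A two-dimensional chamber layout would need the neighbor definition extended to plots in adjacent rows as well.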

Experiments were performed to identify those transformants or knockouts that
exhibited an improved pathogen tolerance. For such studies, the transformants were
exposed to biotrophic fungal pathogens, such as Erysiphe orontii, and necrotrophic
fungal pathogens, such as Fusarium oxysporum. Fusarium oxysporum isolates cause
vascular wilts and damping off of various annual vegetables, perennials and weeds
(Mauch-Mani and Slusarenko (1994) Molecular Plant-Microbe Interactions 7: 378-
383). For Fusarium oxysporum experiments, plants grown on Petri dishes were
sprayed with a fresh spore suspension of F. oxysporum. The spore suspension was
prepared as follows: A plug of fungal hyphae from a plate culture was placed on a 
fresh potato dextrose agar plate and allowed to spread for one week. 5 ml sterile 
water was then added to the plate, swirled, and pipetted into 50 ml Armstrong 
Fusarium medium. Spores were grown overnight in Fusarium medium and then 
sprayed onto plants using a Preval paint sprayer. Plant tissue was harvested and 
frozen in liquid nitrogen 48 hours post infection. 

Erysiphe orontii is a causal agent of powdery mildew. For Erysiphe orontii
experiments, plants were grown approximately 4 weeks in a greenhouse under 12
hour light (20°C, ~30% relative humidity (rh)). Individual leaves were infected with
E. orontii spores from infected plants using a camel's hair brush, and the plants were
transferred to a Percival growth chamber (20°C, 80% rh). Plant tissue was harvested
and frozen in liquid nitrogen 7 days post infection. 

Botrytis cinerea is a necrotrophic pathogen. Botrytis cinerea was grown on
potato dextrose agar in the light. A spore culture was made by spreading 10 ml of
sterile water on the fungus plate, swirling and transferring spores to 10 ml of sterile
water. The spore inoculum (approx. 10^5 spores/ml) was used to spray 10 day-old
seedlings grown under sterile conditions on MS (minus sucrose) media. Symptoms
were evaluated every day up to approximately 1 week.

Infection with bacterial pathogens Pseudomonas syringae pv maculicola (Psm) 
strain 4326 and pv maculicola strain 4326 was performed by hand inoculation at two 
doses. Two inoculation doses allow the differentiation between plants with enhanced
susceptibility and plants with enhanced resistance to the pathogen. Plants were grown 
for 3 weeks in the greenhouse, then transferred to the growth chamber for the 
remainder of their growth. Psm ES4326 was hand inoculated with a 1 ml syringe on 3
fully-expanded leaves per plant (4 1/2 wk old), using at least 9 plants per
overexpressing line at two inoculation doses, OD=0.005 and OD=0.0005. Disease
scoring occurred at day 3 post-inoculation with pictures of the plants and leaves taken
in parallel.

In some instances, expression patterns of the pathogen-induced genes (such as
defense genes) were monitored by microarray experiments. cDNAs were generated by
PCR and resuspended at a final concentration of ~100 ng/ul in 3X SSC or 150mM
Na-phosphate (Eisen and Brown (1999) Methods Enzymol 303:179-205). The 
cDNAs were spotted on microscope glass slides coated with polylysine. The prepared 
cDNAs were aliquoted into 384 well plates and spotted on the slides using an x-y-z 
gantry (OmniGrid) purchased from GeneMachines (Menlo Park, CA) outfitted with 
quill type pins purchased from Telechem International (Sunnyvale, CA). After 
spotting, the arrays were cured for a minimum of one week at room temperature, 
rehydrated and blocked following the protocol recommended by Eisen and Brown 
(1999; supra). 

Total RNA samples (10 ug) were labeled using fluorescent Cy3 and
Cy5 dyes. Labeled samples were resuspended in 4X SSC/0.03% SDS/4 ug salmon
sperm DNA/2 ug tRNA/50mM Na-pyrophosphate, heated at 95°C for 2.5 minutes,
spun down and placed on the array. The array was then covered with a glass 
coverslip and placed in a sealed chamber. The chamber was then kept in a water bath 
at 62°C overnight. The arrays were washed as described in Eisen and Brown (1999) 
and scanned on a General Scanning 3000 laser scanner. The resulting files were
subsequently quantified using Imagene, software purchased from BioDiscovery
(Los Angeles, CA).

Experiments were performed to identify those transformants or knockouts that 
exhibited an improved environmental stress tolerance. For such studies, the 
transformants were exposed to a variety of environmental stresses. Plants were 
exposed to chilling stress (6 hour exposure to 4-8°C), heat stress (6 hour exposure to
32-37°C), high salt stress (6 hour exposure to 200 mM NaCl), drought stress (168
hours after removing water from trays), osmotic stress (6 hour exposure to 3 M
mannitol), or nutrient limitation (nitrogen, phosphate, and potassium) (Nitrogen: all
components of MS medium remained constant except N was reduced to 20 mg/l of
NH4NO3; Phosphate: all components of MS medium except KH2PO4, which was
replaced by K2SO4; Potassium: all components of MS medium except removal of
KNO3 and KH2PO4, which were replaced by NaH4PO4).

Experiments were performed to identify those transformants or knockouts that
exhibited modified structure and development characteristics. For such studies, the
transformants were observed by eye to identify novel structural or developmental 
characteristics associated with the ectopic expression of the polynucleotides or 
polypeptides of the invention. 

Experiments were performed to identify those transformants or knockouts that 
exhibited modified sugar-sensing. For such studies, seeds from transformants were 
germinated on media containing 5% glucose or 9.4% sucrose which normally partially 
restrict hypocotyl elongation. Plants with altered sugar sensing may have either 
longer or shorter hypocotyls than normal plants when grown on this media. 
Additionally, other plant traits may be varied such as root mass. 

Flowering time was measured by the number of rosette leaves present when a
visible inflorescence of approximately 3 cm is apparent. Rosette and total leaf number
on the progeny stem are tightly correlated with the timing of flowering (Koornneef et
al. (1991) Mol. Gen. Genet. 229:57-66). The vernalization response was measured. For
vernalization treatments, seeds were sown to MS agar plates, sealed with micropore 
tape, and placed in a 4°C cold room with low light levels for 6-8 weeks. The plates 
were then transferred to the growth rooms alongside plates containing freshly sown 
non-vernalized controls. Rosette leaves were counted when a visible inflorescence of 
approximately 3 cm was apparent. 

Modified phenotypes observed for particular overexpressor or knockout plants 
are provided in Table 4. For a particular overexpressor that shows a less beneficial 
characteristic, it may be more useful to select a plant with a decreased expression of 
the particular transcription factor. For a particular knockout that shows a less 
beneficial characteristic, it may be more useful to select a plant with an increased 
expression of the particular transcription factor.

The sequences of the Sequence Listing or those in Tables 4, 5 or those
disclosed here can be used to prepare transgenic plants and plants with altered traits.
The specific transgenic plants listed below are produced from the sequences of the
Sequence Listing, as noted. Table 4 provides exemplary polynucleotide and
polypeptide sequences of the invention. Table 4 includes, from left to right for each
sequence: the first column shows the polynucleotide SEQ ID NO; the second column
shows the Mendel Gene ID No., GID; the third column shows the trait(s) resulting
from the knock out or overexpression of the polynucleotide in the transgenic plant;
the fourth column shows the category of the trait; the fifth column shows the
transcription factor family to which the polynucleotide belongs; the sixth column
("Comment"), includes specific effects and utilities conferred by the polynucleotide
of the first column; the seventh column shows the SEQ ID NO of the polypeptide
encoded by the polynucleotide; and the eighth column shows the amino acid residue
positions of the conserved domain in amino acid (AA) co-ordinates.

Seed of plants overexpressing sequences G265 (SEQ ID NOs:871 and 872), 
G715 (SEQ ID NOs:925 and 926), G1471 (SEQ ID NOs:311 and 312), G1793 (SEQ
ID NOs:365 and 366), G1838 (SEQ ID NOs:381 and 382), G1902 (SEQ ID NOs:405 
and 406), G286 (SEQ ID NOs:877 and 878), G2138 (SEQ ID NOs:865 and 866) and 
G2830 (SEQ ID NOs:875 and 876) was subjected to NIR analysis and a significant 
increase in seed oil content compared with seed from control plants was identified. 

G192: G192 (SEQ ID NO: 859) was expressed in all plant tissues and under 
all conditions examined. Its expression was slightly induced upon infection by 
Fusarium. G192 was analyzed using transgenic plants in which this gene was 
expressed under the control of the 35S promoter. G192 overexpressors were late 
flowering under 12 hour light and had more leaves than control plants. This 
phenotype was manifested in the three T2 lines analyzed. Results of one experiment
suggest that G192 overexpressors were more susceptible to infection with a moderate
dose of the fungal pathogen Erysiphe orontii. The decrease in seed oil observed for
one line was replicated in an independent experiment. G192 overexpression delayed 
flowering. A wide variety of applications exist for systems that either lengthen or 
shorten the time to flowering, or for systems of inducible flowering time control. In 
particular, in species where the vegetative parts of the plants constitute the crop and
the reproductive tissues are discarded, it will be advantageous to delay or prevent
flowering. Extending vegetative development can bring about large increases in 
yields. G192 can be used to manipulate the defense response in order to generate 
pathogen-resistant plants. G192 can be used to manipulate seed oil content, which 
can be of nutritional value. 

Closely Related Genes from Other Species 

G192 had some similarity within the conserved WRKY domain to non- 
Arabidopsis plant proteins. 

G1946: G1946 (SEQ ID NO: 801) was studied using transgenic plants in 
which the gene was expressed under the control of the 35S promoter. 
Overexpression of G1946 resulted in accelerated flowering, with 35S::G1946 
transformants producing flower buds up to a week earlier than wild-type controls (24- 
hour light conditions). These effects were seen in 12/20 primary transformants and in 
two independent plantings of each of the three T2 lines. Unlike many early flowering 
Arabidopsis transgenic lines, which are dwarfed, 35S::G1946 transformants often 
reached full-size at maturity, and produced large quantities of seeds, although the 
plants were slightly pale in coloration and had slightly flat leaves compared to wild- 
type. In addition, 35S::G1946 plants showed an altered response to phosphate 
deprivation. Seedlings of G1946 overexpressor plants showed more secondary root 
growth on phosphate-free media, when compared to wild-type control. In a repeat 
experiment, all three lines showed the phenotype. Overexpression of G1946 in 
Arabidopsis also resulted in an increase in seed glucosinolate M39501 in T2 lines
1 and 3. An increase in seed oil and a decrease in seed protein were also observed in
these two lines. G1946 was ubiquitously expressed, and does not appear to be
significantly induced or repressed by any of the biotic and abiotic stress conditions
tested at this time, with the exception of cold, which repressed G1946 expression.
G1946 can be used to modify flowering time, as well as to improve the plant's
performance in conditions of limited phosphate, and to alter seed oil, protein, and
glucosinolate composition.

Closely Related Genes from Other Species

A comparison of the amino acid sequence of G1946 with sequences available 
from GenBank showed strong similarity with plant HSFs of several species 
(Lycopersicon peruvianum, Medicago truncatula, Lycopersicon esculentum, Glycine 
max, Solanum tuberosum, Oryza sativa and Hordeum vulgare subsp. vulgare). 

G375: The sequence of G375 (SEQ ID NO:239) was experimentally 
determined and G375 was analyzed using transgenic plants in which G375 was 
expressed under the control of the 35S promoter. Overexpression of G375 produced 
marked effects on leaf development. At early stages of growth, 35S::G375 seedlings 
developed narrow, upward pointing leaves with long petioles (possibly indicating a 
disruption in circadian-clock controlled processes or nyctinastic movements). 
Additionally, some seedlings were noted to have elongated hypocotyls, and some 
were rather small compared to wild-type controls. Comparable phenotypes were 
obtained by overexpression of an AP2 family gene, G2113 (SEQ ID NO: 85).
Following the switch to flowering, 35S::G375 plants showed reduced fertility, which 
possibly arose from a failure of stamens to fully elongate. One of the three T2 lines
(#41) was later flowering than wild-type controls, and also developed large numbers
of small secondary rosette leaves in the axils of the primary rosette. Although these 
effects were not noted in the other two lines, the phenotypes obtained in line 41 were 
somewhat similar to those produced by overexpression of another Z-dof gene, G736 
(SEQ ID NO: 211). G375 was expressed in all tissues, although at different levels. It 
was expressed at low levels in the root and germinating seed, and expressed at high 
levels in the embryo. The effects of G375 on leaf architecture are of potential interest 
to the ornamental horticulture industry. 

Closely Related Genes from Other Species 

G375 showed some homology to non-Arabidopsis plant proteins within the 
conserved Dof domain. 

G1255: The sequence of G1255 (SEQ ID NO: 273) was experimentally 
determined and G1255 was analyzed using transgenic plants in which G1255 was 
expressed under the control of the 35S promoter. Plants overexpressing G1255 had
alterations in leaf architecture, a reduction in apical dominance, an increase in seed
size, and showed more disease symptoms following inoculation with a low dose of the 
fungal pathogen Botrytis cinerea. G1255 was constitutively expressed and not 
significantly induced by any conditions tested. On the basis of the phenotypes 
produced by overexpression of G1255, G1255 can be used to manipulate the plant's 
defense response to produce pathogen resistance, alter plant architecture, or alter seed 
size. 

Closely Related Genes from Other Species 

G1255 showed strong homology to a putative rice zinc finger protein
represented by sequence AC0871S1_3. Sequence identity between these two proteins
extended beyond the conserved domain, and therefore, these genes can be orthologs.

G865: The complete cDNA sequence of G865 (SEQ ID NO: 557) was 
determined. G865 was ubiquitously expressed in Arabidopsis tissues. G865 was 
analyzed using transgenic plants in which G865 was expressed under the control of 
the 35S promoter. Plants overexpressing G865 were early flowering, with numerous 
secondary inflorescence meristems giving them a bushy appearance. G865 
overexpressors were more susceptible to infection with a moderate dose of the fungal 
pathogens Erysiphe orontii and Botrytis cinerea. In addition, seeds from G865 
overexpressing plants showed a trend of increased protein and reduced oil content, 
although the observed changes were not beyond the criteria used for judging
significance except in one line. G865 can be used to control flowering time. G865 
can be used to manipulate the defense response in order to generate pathogen-resistant 
plants. G865 can be used to alter seed oil and protein content of a plant. 

Closely Related Genes from Other Species 

G865 and other non-Arabidopsis AP2/EREBP proteins were similar within the 
conserved AP2 domain. 

G2509: G2509 (SEQ ID NO: 23) was studied using transgenic plants in 
which the gene was expressed under the control of the 35S promoter. Overexpression 
of G2509 caused multiple alterations in plant growth and development, most notably, 
altered branching patterns, and a reduction in apical dominance, giving the plants a
shorter, more bushy stature than wild type. Twenty 35S::G2509 primary 
transformants were examined; at early stages of rosette development, these plants 
displayed a wild-type phenotype. However, at the switch to flowering, almost all T1
lines showed a marked loss of apical dominance and large numbers of secondary
shoots developed from axils of primary rosette leaves. In the most extreme cases, the
shoots had very short internodes, giving the inflorescence a very bushy appearance.
Such shoots were often very thin and flowers were relatively small and poorly fertile. 
At later stages, many plants appeared very small and had a low seed yield compared 
to wild type. In addition to the effects on branching, a substantial number of 
35S::G2509 primary transformants also flowered early and had buds visible several 
days prior to wild type. Similar effects on inflorescence development were noted in 
each of three T2 populations examined. The branching and plant architecture 
phenotypes observed in 35S::G2509 lines resemble phenotypes observed for three 
other AP2/EREBP genes: G865 (SEQ ID NO: 557), G1411 (SEQ ID NO: 3), and
G1794 (SEQ ID NO: 13). G2509, G865, and G1411 form a small clade within the 
large AP2/EREBP family, and G1794, although not belonging to the clade, is one of 
the AP2/EREBP genes closest to it in the phylogenetic tree. It is thus likely that all 
these genes share a related function, such as affecting hormone balance. 
Overexpression of G2509 in Arabidopsis resulted in an increase in alpha-tocopherol 
in seeds in T2 lines 5 and 11. G2509 was ubiquitously expressed in Arabidopsis plant
tissue. G2509 expression levels were altered by a variety of environmental or 
physiological conditions. G2509 can be used to manipulate plant architecture and 
development. G2509 can be used to alter tocopherol composition. Tocopherols have 
anti-oxidant and vitamin E activity. G2509 can be useful in altering flowering time. 
A wide variety of applications exist for systems that either lengthen or shorten the 
time to flowering. 

Closely Related Genes from Other Species

G2509 showed some sequence similarity with known genes from other plant 
species within the conserved AP2/EREBP domain. 

G2347: G2347 (SEQ ID NO: 1119) was analyzed using transgenic plants in 
which G2347 was expressed under the control of the 35S promoter. Overexpression 
of G2347 markedly reduced the time to flowering in Arabidopsis. This phenotype
was apparent in the majority of primary transformants and in all plants from two out
of the three T2 lines examined. Under continuous light conditions, 35S::G2347 plants
formed flower buds up to a week earlier than wild type. Many of the plants were rather
small and spindly compared to controls. To demonstrate that overexpression of 
G2347 could induce flowering under less inductive photoperiods, two T2 lines were 
re-grown in 12 hour conditions; again, all plants from both lines bolted early, with 
some initiating flower buds up to two weeks sooner than wild-type. As determined by
RT-PCR, G2347 was highly expressed in rosette leaves and flowers, and at much
lower levels in embryos and siliques. No expression of G2347 was detected in the
other tissues tested. G2347 expression was repressed by cold, by auxin
treatments, and by infection by Erysiphe. G2347 is also highly similar to the
Arabidopsis protein G2010 (SEQ ID NO: 1121). The level of homology between
these two proteins suggested they could have similar, overlapping, or redundant
functions in Arabidopsis. In support of this hypothesis, overexpression of both G2010
and G2347 resulted in early flowering phenotypes in transgenic plants.

Closely Related Genes from Other Species 

The closest relative to G2347 is the Antirrhinum protein, SBP2 (CAA63061). 
The similarity between these two proteins is extensive enough to suggest they might 
have similar functions in a plant. 

G988: G988 (SEQ ID NO: 43) was analyzed using transgenic plants in which 
G988 was expressed under the control of the 35S promoter. Plants overexpressing 
G988 had multiple morphological phenotypes. The transgenic plants were generally 
smaller than wild-type plants, had altered leaf, inflorescence and flower development, 
altered plant architecture, and altered vasculature. In one transgenic line 
overexpressing G988 (line 23), an increase in the seed glucosinolate M39489 was 
observed. The phenotype of plants overexpressing G988 was wild-type in all other 
assays performed. In wild-type plants, G988 was expressed primarily in flower and 
silique tissue, but was also present at detectable levels in all other tissues tested. 
Expression of G988 was induced in response to heat treatment, and repressed in 
response to infection with Erysiphe. Based on the observed morphological 
phenotypes of the transgenic plants, G988 can be used to create plants with larger 
flowers. This can have value in the ornamental horticulture industry. The reduction
in the formation of lateral branches suggests that G988 can have utility in the forestry
industry. The Arabidopsis plants overexpressing G988 also had reduced fertility. 
This can be a desirable trait in some instances, as it can be exploited to prevent or 
minimize the escape of GMO (genetically modified organism) pollen into the 
environment. 

Closely Related Genes from Other Species 

The amino acid sequence for the Capsella rubella hypothetical protein 
represented by GenBank accession number CRU303349 was significantly identical to 
G988 outside of the SCR conserved domains. The Capsella rubella hypothetical 
protein is 90% identical to G988 over a stretch of roughly 450 amino acids. 
Therefore, it is likely that the Capsella rubella gene is an ortholog of G988. 

G2346: G2346 (SEQ ID NO: 459) was analyzed using transgenic plants in 
which the gene was expressed under the control of the 35S promoter. 35S::G2346 
seedlings from all three T2 populations had slightly larger cotyledons and appeared 
somewhat more advanced than controls. This indicated that the seedlings developed
more rapidly than the control plants. At later stages, however, G2346 overexpressing
plants showed no consistent differences from control plants. The phenotype of these 
transgenic plants was wild-type in all other assays performed. According to RT-PCR 
analysis, G2346 is expressed ubiquitously. 

Closely Related Genes from Other Species 

G2346 shows some sequence similarity with known genes from other plant 
species within the conserved SBP domain. 

G1354: The complete sequence of G1354 (SEQ ID NO: 285) was determined. 
G1354 was analyzed using transgenic plants in which G1354 was expressed under the 
control of the 35S promoter. Overexpression of G1354 produced highly deleterious 
effects on growth and development. Only three 35S::G1354 T1 plants were obtained;
all were extremely tiny and slow developing. After three weeks of growth, each of 
the plants comprised a completely disorganized mass of leaves and root that had no 
clear axis of growth. Since these individuals would not have survived transplantation 
to soil, they were harvested for RT-PCR analysis; all three plants showed moderate
levels of G1354 overexpression compared to whole wild-type seedlings of an
equivalent size. Only a very small number of transformants were obtained from two
selection attempts on separate batches of T0 seed. Usually between 15 and 120
transformants are obtained from each aliquot of 300 mg T0 seed from wild-type
plants. The low transformation frequency obtained in this experiment suggests that 
high levels of G1354 overexpression might have completely lethal effects and prevent 
transformed seeds from germinating. As determined by RT-PCR, G1354 was
uniformly expressed in all tissues and under all conditions tested.
However, the gene was repressed in leaf tissue in response to Erysiphe infection.

Closely Related Genes from Other Species 

G1354 is closely related to a NAM protein encoded by a polynucleotide from
rice (AC005310). Similarity between G1354 and this rice protein extends beyond the 
signature motif of the family to a level that would suggest the genes are orthologs. 

G1063: G1063 (SEQ ID NO: 119) is a member of a clade of highly related
HLH/MYC proteins that also includes G779 (SEQ ID NO: 113), G1499 (SEQ ID NO: 
7), G2143 (SEQ ID NO: 129), and G2557 (SEQ ID NO: 133). All of these genes 
caused similar pleiotropic phenotypic effects when overexpressed, the most striking 
of which was the production of ectopic carpelloid tissue. These genes can be 
considered key regulators of carpel development. A spectrum of developmental 
alterations was observed amongst 35S::G1063 primary transformants and the majority 
were markedly small, dark green, and had narrow curled leaves. The most severely 
affected individuals were completely sterile and formed highly abnormal 
inflorescences; shoots often terminated in pin-like structures, and flowers were 
replaced by filamentous carpelloid structures. In other cases, flowers showed 
internode elongation between floral whorls, with a central carpel protruding on a 
pedicel-like organ. Additionally, lateral branches sometimes failed to develop and 
tiny patches of carpelloid tissue formed at axillary nodes of the inflorescence. In lines 
with an intermediate phenotype, flowers contained defined whorls of organs, but 
sepals were converted to carpelloid structures or displayed patches of carpelloid 
tissue. In contrast, lines with a weak phenotype developed relatively normal flowers 
and produced a reasonable quantity of seed. Such plants were still distinctly smaller 
than wild-type controls. Since the strongest 35S::G1063 lines were sterile, three lines
with a relatively weak phenotype, that had produced sufficient seed for biochemical
and physiological analysis, were selected for further study. Two of the T2
populations (T2-28, 37) were clearly small, darker green and possessed narrow leaves
compared to wild type. Plants from one of these populations (T2-28) also produced
occasional branches with abnormal flowers like those seen in the T1. The final T2
population (T2-30) displayed a very mild phenotype. Overexpression of G1063 in
Arabidopsis resulted in a decrease in seed oil content in T2 lines 28 and 37. No 
altered phenotypes were detected in any of the physiological assays, except that the 
plants were noted to be somewhat small and produce anthocyanin when grown in 
Petri plates. G1063 was expressed at low to moderate levels in roots, flowers, rosette 
leaves, embryos, and germinating seeds, but was not detected in shoots or siliques. It 
was induced by auxin. G1063 can be used to manipulate flower form and structure or 
plant fertility. One application for manipulation of flower structure can be in the 
production of saffron, which is derived from the stigmas of Crocus sativus. G1063 
has utility in manipulating seed oil and protein content. 

Closely Related Genes from Other Species 

G1063 protein shared extensive homology in the basic helix loop helix region 
with a protein sequence encoded by Glycine max cDNA clone (AW832545) as well 
as a tomato root, plants pre-anthesis Lycopersicon esculentum cDNA (BE451174).

G2143: G2143 (SEQ ID NO: 129) is a member of a clade of highly related 
HLH/MYC proteins that also includes G779 (SEQ ID NO: 113), G1063 (SEQ ID NO:
119), G1499 (SEQ ID NO: 7), and G2557 (SEQ ID NO: 133). All of these genes 
caused similar pleiotropic phenotypic effects when overexpressed, the most striking 
of which was the production of ectopic carpelloid tissue. These genes can be 
considered key regulators of carpel development. Twelve out of twenty 35S::G2143
T1 lines showed a very severe phenotype; these plants were markedly small and had
narrow, curled, dark-green leaves. Such individuals were completely sterile and 
formed highly abnormal inflorescences; shoots often terminated in pin-like structures, 
and flowers were replaced by filamentous carpelloid structures, or a fused mass of 
carpelloid tissue. Furthermore, lateral branches usually failed to develop, and tiny 
patches of stigmatic tissue often formed at axillary nodes of the inflorescence. 
Strongly affected plants displayed the highest levels of transgene expression
(determined by RT-PCR). The remaining T1 lines showed lower levels of G2143
overexpression; these plants were still distinctly smaller than wild type, but had
relatively normal inflorescences and produced seed. Since the strongest 35S::G2143
lines were sterile, three lines with a relatively weak phenotype, that had produced
sufficient seed for biochemical analysis, were selected for further study. T2-11 plants
displayed a very mild phenotype and had somewhat small, narrow, dark green leaves. 
The other two T2 populations, however, appeared wild-type, suggesting that 
transgene activity might have been reduced between the generations. Reduced 
seedling vigor was noted in the physiological assays. G2143 expression was detected 
at low levels in flowers and siliques, and at higher levels in germinating seed. G2143 
can be used to manipulate flower form and structure or plant fertility. One application 
for manipulation of flower structure can be in the production of saffron, which is 
derived from the stigmas of Crocus sativus. 

Closely Related Genes from Other Species 

G2143 protein shared extensive homology in the basic helix loop helix region 
with a protein encoded by Glycine max cDNA clones (AW832545, BG726819 and
BG154493) and a Lycopersicon esculentum cDNA clone (BE451174). There was
lower homology outside of the region. 

G2557: G2557 (SEQ ID NO: 133) is a member of a clade of highly related 
HLH/MYC proteins that also includes G779 (SEQ ID NO: 113), G1063 (SEQ ID NO: 
119), G1499 (SEQ ID NO: 7), and G2143 (SEQ ID NO: 129). All of these genes
caused similar pleiotropic phenotypic effects when overexpressed, the most striking 
of which was the production of ectopic carpelloid tissue. These genes can be 
considered key regulators of carpel development. The flowers of 35S::G2557 primary 
transformants displayed patches of stigmatic papillae on the sepals, and often had 
rather narrow petals and poorly developed stamens. Additionally, carpels were
occasionally held outside of the flower at the end of an elongated pedicel-like
structure. As a result of such defects, 35S::G2557 plants often showed very poor
fertility and formed small wrinkled siliques. In addition to such floral abnormalities, 
the majority of primary transformants were also small and darker green in coloration 
than wild type. Approximately one third of the T1 plants were extremely tiny and
completely sterile. Three T1 lines (#7, 9, 12), that had produced some seeds, and
showed a relatively weak phenotype, were chosen for further study. All three of the
T2 populations from these lines contained plants that were distinctly small, had 
abnormal flowers, and were poorly fertile compared to controls. Stigmatic tissue was 
not noted on the sepals of plants from these three T2 lines. Another line (#4) that had 
shown a moderately strong phenotype in the T1 was sown for only morphological
analysis in the T2 generation. T2-4 plants were small, dark green, and produced 
abnormal flowers with ectopic stigmatic tissue on the sepals, as had been seen in the 
parental plant. G2557 expression was detected at low to moderate levels in all tissues 
tested except shoots. It was induced by cold, heat, and salt, and repressed by 
pathogen infection. G2557 can be used to manipulate flower form and structure or
plant fertility. One application for manipulation of flower structure can be in the 
production of saffron, which is derived from the stigmas of Crocus sativus. 

Closely Related Genes from Other Species 

G2557 protein shows extensive sequence similarity in the region of basic helix 
loop helix with a protein encoded by Glycine max cDNA clone (BE347811).

G2430: The complete sequence of G2430 (SEQ ID NO: 697) was 
determined. G2430 is a member of the response regulator class of GARP proteins 
(ARR genes), although one of the two conserved aspartate residues characteristic of
response regulators is not present. The second aspartate, the putative phosphorylated 
site, is retained so G2430 can have response regulator function. G2430 is specifically 
expressed in embryo and silique tissue. In morphological analyses, plants 
overexpressing G2430 showed more rapid growth than control plants at early stages, 
and in two of three lines examined produced large, flat leaves. Early flowering was 
observed for some lines, but this effect was inconsistent between plantings. G2430 
can regulate plant growth. Overexpression of G2430 in Arabidopsis also resulted in 
seedlings that are slightly more tolerant to heat in a germination assay. Seedlings 
from G2430 overexpressing transgenic plants were slightly greener than the control 
seedlings under high temperature conditions. In a repeat experiment on individual 
lines, G2430 line 15 showed the strongest heat tolerant phenotype. G2430 can be 
useful to promote faster development and reproduction in plants.

Closely Related Genes from Other Species

G2430 had some similarity within the conserved GARP and response-
regulator domains to non-Arabidopsis proteins.

G1478: The sequence of G1478 (SEQ ID NO: 831) was determined and 
G1478 was analyzed using transgenic plants in which G1478 was expressed under the 
control of the 35S promoter. Plants overexpressing G1478 had a general delay in
progression through the life cycle, in particular a delay in flowering time. G1478 is 
expressed at higher levels in flowers, rosettes and embryos but otherwise expression 
is constitutive. Based on the phenotypes produced through G1478 overexpression, 
G1478 can be used to manipulate the rate at which plants grow, and flowering time. 

Closely Related Genes from Other Species 

G1478 shows some homology to non-Arabidopsis proteins within the 
conserved domain. 

G681: G681 (SEQ ID NO: 579) was analyzed using transgenic plants in
which the gene was expressed under the control of the 35S promoter. Approximately 
half of the 35S::G681 primary transformants were markedly small and formed narrow 
leaves compared to controls. These plants often produced thin inflorescence stems, 
had rather poorly formed flowers with low pollen production, and set few seeds. 
Three T1 lines with relatively weak phenotypes, which had produced reasonable
quantities of seed, were selected for further study. Plants from one of the T2 
populations were noted to be slightly small, but otherwise the T2 lines displayed no 
consistent differences in morphology from controls. In leaves of two of the T2 lines, 
overexpression of G681 resulted in an increase in the percentage of the glucosinolate 
M39480. According to RT-PCR analysis, G681 expression was detected at very low 
levels in flower and rosette leaf tissues. G681 was induced by drought stress. G681
can be used to alter glucosinolate composition in plants. Increases or decreases in 
specific glucosinolates or total glucosinolate content are desirable depending upon the 
particular application. For example: (1) Glucosinolates are undesirable components 
of the oilseeds used in animal feed, since they produce toxic effects. Low- 
glucosinolate varieties of canola have been developed to combat this problem. (2) 
Some glucosinolates have anti-cancer activity; thus, increasing the levels or
composition of these compounds might be of interest from a nutraceutical standpoint.
(3) Glucosinolates form part of a plant's natural defense against insects. Modification
of glucosinolate composition or quantity could therefore afford increased protection 
from predators. Furthermore, in edible crops, tissue specific promoters can be used to 
ensure that these compounds accumulate specifically in tissues, such as the epidermis, 
which are not taken for consumption. 

Closely Related Genes from Other Species 

G681 shows some sequence similarity with known genes from other plant
species within the conserved Myb domain. 

G878: G878 (SEQ ID NO: 611) was studied using transgenic plants in which
the gene was expressed under the control of the 35S promoter. Analysis of primary
transformants revealed that overexpression of G878 delays the onset of flowering in
Arabidopsis. 11/20 of the 35S::G878 T1 plants flowered approximately one week
later than wild type under continuous light conditions. These plants were also darker 
green, had shorter stems, and senesced later than controls. G878 was ubiquitously 
expressed. G878 can be used to modify flowering time and senescence, and a wide 
variety of applications exist for systems that either lengthen or shorten the time to 
flowering. 

Closely Related Genes from Other Species 

G878 was highly related to other WRKY proteins from a variety of plant 
species, such as the Nicotiana tabacum DNA-binding protein 2 (WRKY2) 
(AF096299), and a Cucumis sativus SPF1-like DNA-binding protein (L44134).

G374: G374 (SEQ ID NO: 47) was expressed at low levels throughout the 
plant and was induced by salicylic acid. G374 was investigated using lines carrying a 
T-DNA insertion in this gene. The T-DNA insertion was approximately three 
quarters of the way into the protein coding sequence and should result in a null 
mutation. Homozygosity for a T-DNA insertion within G374 caused lethality at early 
stages of embryo development. In an initial screen for G374 knockouts, heterozygous 
plants were identified. Seed from those individuals was sown to soil and eleven 
plants were PCR-screened to identify homozygotes. No homozygotes were obtained;
6 of the progeny were heterozygous whilst the other 5 were wild type. This raised the 
prospect that homozygosity for the G374 insertion was lethal. To examine this 
possibility further, heterozygous KO.G374 plants were re-grown. These individuals 
looked wild type, but their siliques were examined for seed abnormalities. When 
green siliques were dissected, around 25% of developing seeds were white or aborted. 
Embryos from these siliques were cleared using Hoyer's solution, and examined under
the microscope. It was apparent that embryos from the white seeds had arrested at 
early (globular or heart) stages of development, whilst embryos from the normal seeds 
were fully developed. Such arrested or aborted seeds most likely represented 
homozygotes for the G374 insertion. To support this conclusion, seed was collected 
from heterozygous plants and sown to kanamycin plates (the T-DNA insertion carried 
the NPT marker gene). Of the seedlings that germinated, 160 were kanamycin 
resistant and 107 were kanamycin sensitive. These data more closely fitted a 2:1
(chi-sq., 1 df, = 5.5, 0.05>P>0.01) than a 3:1 (chi-sq., 1 df, = 32, P<0.001) ratio. Such a
segregation ratio suggested that a homozygous class of kanamycin resistant seedlings 
was absent from the progeny of KO.G374 plants. G374 can be a herbicide target.
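
The chi-square values quoted above can be re-derived from the reported seedling counts (160 kanamycin resistant, 107 kanamycin sensitive). The following is a minimal sketch in Python and is not part of the original analysis; the function name chi_square_gof is hypothetical, and the P-value uses the closed form that applies only to one degree of freedom.

```python
import math

def chi_square_gof(observed, ratio):
    """Chi-square goodness-of-fit statistic and P-value for two classes (1 df)."""
    total = sum(observed)
    # Expected counts under the hypothesized segregation ratio.
    expected = [total * r / sum(ratio) for r in ratio]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

observed = (160, 107)  # resistant, sensitive (counts reported in the text)
for ratio in ((2, 1), (3, 1)):
    chi2, p = chi_square_gof(observed, ratio)
    print(f"{ratio[0]}:{ratio[1]}  chi-sq = {chi2:.1f}  P = {p:.2g}")
```

A 2:1 ratio is what would be expected if the homozygous insertion class is missing from the germinated seedlings, whereas 3:1 is the expectation for a dominant marker with all three genotypic classes viable.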

Closely Related Genes from Other Species 

Similar sequences to G374 are present in tomato and Medicago truncatula, and 
these sequences can be orthologs. 

Example VIII: Identification of Homologous Sequences 

Homologous sequences from Arabidopsis and plant species other than 
Arabidopsis were identified using database sequence search tools, such as the Basic 
Local Alignment Search Tool (BLAST) (Altschul et al. (1990) J. Mol. Biol. 215:403- 
410; and Altschul et al. (1997) Nucl. Acid Res. 25: 3389-3402). The tblastx sequence 
analysis programs were employed using the BLOSUM-62 scoring matrix (Henikoff, 
S. and Henikoff, J. G. (1992) Proc. Natl. Acad. Sci. USA 89: 10915-10919). 

Identified non-Arabidopsis sequences homologous to the Arabidopsis
sequences are provided in Table 5. The percent sequence identity among these 
sequences can be as low as 47%, or even lower. The entire NCBI
GenBank database was filtered for sequences from all plants except Arabidopsis 
thaliana by selecting all entries in the NCBI GenBank database associated with NCBI
taxonomic ID 33090 (Viridiplantae; all plants) and excluding entries associated with
taxonomic ID 3701 (Arabidopsis thaliana). These sequences were compared to
sequences representing genes of SEQ ID NOs: 2 - 2N, where N = 2-561, using the
Washington University TBLASTX algorithm (version 2.0a19MP) at the default
settings using gapped alignments with the filter "off". For each gene of SEQ ID
NOs: 2 - 2N, where N = 2-561, individual comparisons were ordered by probability
score (P-value), where the score reflects the probability that a particular alignment
occurred by chance. For example, a score of 3.6e-40 is 3.6 x 10^-40. In addition to
P-values, comparisons were also scored by percentage identity. Percentage identity
reflects the degree to which two segments of DNA or protein are identical over a 
particular length. Examples of sequences so identified are presented in Table 5. 
Homologous or orthologous sequences are readily identified and available in 
GenBank by Accession number (Table 5; Test sequence ID). The identified 
homologous polynucleotide and polypeptide sequences and homologues of the 
Arabidopsis polynucleotides and polypeptides may be orthologs of the Arabidopsis 
polynucleotides and polypeptides (TBD: to be determined). 
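
The filtering and ranking steps described in this example can be illustrated with a short Python sketch. It is an illustration under stated assumptions, not the pipeline actually used: the Hit record, the accession labels, the lineages, and the aligned segments are all hypothetical, and percentage identity is computed over ungapped, equal-length segments for simplicity. Only the two taxonomic IDs (33090 and 3701) come from the text.

```python
from dataclasses import dataclass

VIRIDIPLANTAE = 33090         # NCBI taxonomic ID cited in the text
ARABIDOPSIS_THALIANA = 3701   # NCBI taxonomic ID cited in the text

@dataclass
class Hit:
    accession: str        # hypothetical label, for illustration only
    lineage: tuple        # taxonomic IDs associated with the entry (illustrative)
    p_value: float        # probability that the alignment occurred by chance
    query_segment: str    # aligned residues, same length as subject_segment
    subject_segment: str

def percent_identity(a, b):
    """Percentage of aligned positions at which two equal-length segments match."""
    if len(a) != len(b) or not a:
        raise ValueError("segments must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)

def rank_hits(hits):
    """Keep plant hits other than Arabidopsis thaliana, ordered by ascending P-value."""
    kept = [h for h in hits
            if VIRIDIPLANTAE in h.lineage and ARABIDOPSIS_THALIANA not in h.lineage]
    return sorted(kept, key=lambda h: h.p_value)

hits = [
    Hit("EXAMPLE_A", (33090, 3847), 3.6e-40, "MKRWG", "MKRWG"),
    Hit("EXAMPLE_B", (33090, 3701), 1.0e-60, "MKRWG", "MKRWG"),  # Arabidopsis: excluded
    Hit("EXAMPLE_C", (33090, 4081), 2.0e-12, "MKRWG", "MKQWA"),
    Hit("EXAMPLE_D", (2, 562), 5.0e-20, "MKRWG", "MKRWG"),       # not a plant: excluded
]
for h in rank_hits(hits):
    identity = percent_identity(h.query_segment, h.subject_segment)
    print(h.accession, h.p_value, identity)
```

Ordering by P-value first and reporting percentage identity alongside it mirrors the two scores described above; a real TBLASTX report would additionally require handling of gaps and reading frames.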

Example IX: Introduction of Polynucleotides into Dicotyledonous Plants

SEQ ID NOs: 1-(2N - 1), wherein N = 2-561, paralogous, orthologous, and
homologous sequences recombined into pMEN20 or pMEN65 expression vectors are 
transformed into a plant for the purpose of modifying plant traits. The cloning vector 
may be introduced into a variety of cereal plants by means well-known in the art such 
as, for example, direct DNA transfer or Agrobacterium tumefaciens-mediated 
transformation. It is now routine to produce transgenic plants using most dicot plants 
(see Weissbach and Weissbach, (1989) supra; Gelvin et al., (1990) supra; Herrera-
Estrella et al. (1983) supra; Bevan (1984) supra; and Klee (1985) supra). Methods 
for analysis of traits are routine in the art and examples are disclosed above. 

Example X: Transformation of Cereal Plants with an Expression Vector

Cereal plants such as corn, wheat, rice, sorghum or barley, may also be 
transformed with the present polynucleotide sequences in pMEN20 or pMEN65 
expression vectors for the purpose of modifying plant traits. For example, pMEN20
may be modified to replace the NptII coding region with the BAR gene of
Streptomyces hygroscopicus that confers resistance to phosphinothricin. The KpnI
and BglII sites of the Bar gene are removed by site-directed mutagenesis with silent
codon changes.
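
The idea of removing restriction sites with silent codon changes can be sketched in Python as follows. This is an illustrative sketch only, not the mutagenesis design actually used: the function names are hypothetical, the toy coding sequence is invented for the example and is not the Bar gene, and the search simply takes the first synonymous codon that lowers the number of sites. The recognition sequences (KpnI, GGTACC; BglII, AGATCT) and the standard genetic code are the only external facts assumed.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, with codons enumerated in TCAG order.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

SITES = {"KpnI": "GGTACC", "BglII": "AGATCT"}

def translate(cds):
    """Translate an in-frame coding sequence into one-letter amino acids."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))

def count_sites(cds, sites):
    return sum(cds.count(s) for s in sites)

def remove_sites(cds, sites=tuple(SITES.values())):
    """Remove all listed recognition sites using only synonymous codon swaps."""
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    while count_sites(cds, sites):
        site = next(s for s in sites if s in cds)
        pos = cds.find(site)
        before = count_sites(cds, sites)
        # Try each codon that overlaps the site, in order.
        for idx in range(pos // 3, (pos + len(site) - 1) // 3 + 1):
            codon = cds[3 * idx:3 * idx + 3]
            for alt, aa in CODON_TABLE.items():
                candidate = cds[:3 * idx] + alt + cds[3 * idx + 3:]
                # Accept a synonymous codon only if it lowers the total site count.
                if aa == CODON_TABLE[codon] and count_sites(candidate, sites) < before:
                    cds = candidate
                    break
            else:
                continue
            break
        else:
            raise ValueError(f"no silent change removes {site} at position {pos}")
    return cds

toy_cds = "ATGGGTACCAGATCTTAA"  # hypothetical sequence containing both sites
fixed = remove_sites(toy_cds)
print(fixed, translate(fixed) == translate(toy_cds))
```

In practice the choice among synonymous codons would also take the codon usage of the target plant into account; the sketch ignores that consideration.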

The cloning vector may be introduced into a variety of cereal plants by means 
well-known in the art such as, for example, direct DNA transfer or Agrobacterium
tumefaciens-mediated transformation. It is now routine to produce transgenic plants
of most cereal crops (Vasil, I., Plant Molec. Biol. 25: 925-937 (1994)) such as corn,
wheat, rice, sorghum (Cassas, A. et al., Proc. Natl. Acad. Sci. USA 90: 11212-11216
(1993)) and barley (Wan, Y. and Lemeaux, P., Plant Physiol. 104:37-48 (1994)). DNA
transfer methods such as microprojectile bombardment can be used for corn (Fromm et al.,
Bio/Technology 8: 833-839 (1990); Gordon-Kamm et al., Plant Cell 2: 603-618
(1990); Ishida, Y., Nature Biotechnology 14:745-750 (1996)), wheat (Vasil et al.,
Bio/Technology 10:667-674 (1992); Vasil et al., Bio/Technology 11:1553-1558
(1993); Weeks et al., Plant Physiol. 102:1077-1084 (1993)), and rice (Christou,
Bio/Technology 9:957-962 (1991); Hiei et al., Plant J. 6:271-282 (1994); Aldemita
and Hodges, Planta 199:612-617 (1996); Hiei et al., Plant Mol. Biol. 35:205-18 (1997)). For
most cereal plants, embryogenic cells derived from immature scutellum tissues are the 
preferred cellular targets for transformation (Hiei et al., Plant Mol Biol. 35:205-18 
(1997); Vasil, Plant Molec. Biol. 25: 925-937 (1994)). 

Vectors according to the present invention may be transformed into corn 
embryogenic cells derived from immature scutellar tissue by using microprojectile 
bombardment, with the A188XB73 genotype as the preferred genotype (Fromm, et 
al., Bio/Technology 8: 833-839 (1990); Gordon-Kamm et al., Plant Cell 2: 603-618 
(1990)). After microprojectile bombardment the tissues are selected on 
phosphinothricin to identify the transgenic embryogenic cells (Gordon-Kamm et al., 
Plant Cell 2: 603-618 (1990)). Transgenic plants are regenerated by standard corn 
regeneration techniques (Fromm, et al., Bio/Technology 8: 833-839 (1990); Gordon- 
Kamm et al., Plant Cell 2: 603-618 (1990)). 

The plasmids prepared as described above can also be used to produce 
transgenic wheat and rice plants (Christou, Bio/Technology 9:957-962 (1991); Hiei et 
al, Plant J. 6:271-282 (1994); Aldemita and Hodges, Planta 199:612-617 (1996); 
Hiei et al., Plant Mol. Biol. 35:205-18 (1997)) that coordinately express genes of
interest by following standard transformation protocols known to those skilled in the
art for rice and wheat (Vasil et al., Bio/Technology 10:667-674 (1992); Vasil et al.,
Bio/Technology 11:1553-1558 (1993); Weeks et al., Plant Physiol. 102:1077-1084 
(1993)), where the bar gene is used as the selectable marker. 

All references, publications, patent documents, web pages, and other 
documents cited or mentioned herein are hereby incorporated by reference in their 
entirety for all purposes. Although the invention has been described with reference to 
specific embodiments and examples, it should be understood that one of ordinary skill 
can make various modifications without departing from the spirit of the invention. 
The scope of the invention is not limited to the specific embodiments and examples 
provided.

We claim:

1. A transgenic plant comprising a recombinant polynucleotide having a 
nucleotide sequence selected from the group consisting of: 

(a) a nucleotide sequence encoding a polypeptide comprising a sequence selected 
from those of SEQ ID NOs: 860, 802, 240, 274, 558, 24, 1120, 44, 460, 286, 120, 
130, 134, 698, 832, 580, 612, and 48, or a complementary nucleotide sequence 
thereof; 

(b) a nucleotide sequence of SEQ ID NOs: 859, 801, 239, 273, 557, 23, 1119, 43, 459, 
285, 119, 129, 133, 697, 831, 579, 611, 47, or a complementary nucleotide sequence
thereof; and

(c) a nucleotide sequence that hybridizes under stringent conditions to a nucleotide 
sequence of one or more polynucleotides of: (a) or (b). 

2. The transgenic plant of claim 1 wherein the transgenic plant possesses an 
altered trait as compared to another plant, or the transgenic plant exhibits an altered 
phenotype as compared to another plant, or the transgenic plant expresses an altered 
level of one or more genes associated with a plant trait as compared to another plant, 
wherein the other plant does not comprise the recombinant polynucleotide. 

3. The transgenic plant of claim 1 wherein the plant possesses an altered trait as 
compared to another plant wherein the trait is an alteration in one or more physical 
characteristics selected from the group consisting of: the number of trichomes, fruit 
and seed size and number, yield of stems, leaves, inflorescences, or roots, stability of 
seeds during storage, susceptibility of the seed to shattering, root hair length and 
quantity, internode distances, or the quality of seed coat. 

4. The transgenic plant of claim 1 wherein the plant possesses an altered trait as 
compared to another plant wherein the trait is an alteration in a plant growth 
characteristic selected from the group consisting of: growth rate, germination rate of 
seeds, vigor of plants and seedlings, leaf and flower senescence, male sterility, 
apomixis, flowering time, flower abscission, rate of nitrogen uptake, osmotic 
sensitivity to soluble sugar concentrations, biomass or transpiration characteristics,
apical dominance, branching patterns, number of organs, organ identity, and organ
shape or size. 

5. The transgenic plant of claim 1 wherein the plant possesses an altered trait as 
compared to another plant wherein the trait is an alteration in one or more 
characteristics selected from the group consisting of protein or oil production, seed 
protein or oil production, insoluble sugar level, soluble sugar level, and starch 
composition. 

6. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:860.

7. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:802. 

8. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:240.

9. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:274.

10. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:558.

11. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:24.

12. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:1120.

13. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:44.

14. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:460.

15. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:286.

16. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:120.

17. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:130.

18. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:134.

19. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:698.

20. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:832.

21. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:580. 

22. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:612. 

23. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:48. 

24. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence of SEQ ID NO:859.

25. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence of SEQ ID NO:801.

26. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence of SEQ ID NO:239.

27. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence of SEQ ID NO:273.

28. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence of SEQ ID NO:557.

29. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence of SEQ ID NO:23.

30. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence of SEQ ID NO:1119.

31. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence of SEQ ID NO:43.

32. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence of SEQ ID NO:459.

33. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence of SEQ ID NO:285.

34. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence of SEQ ID NO:119.

35. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence of SEQ ID NO:129.

36. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence of SEQ ID NO:133.

37. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence of SEQ ID NO:697.

38. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence of SEQ ID NO:831.

39. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence of SEQ ID NO:579.

40. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence of SEQ ID NO:611.

41. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence of SEQ ID NO:47.

42. The transgenic plant of claim 1, further comprising a constitutive, inducible, or
tissue-specific promoter operably linked to said nucleotide sequence.

43. The transgenic plant of claim 1, wherein the plant is selected from the group
consisting of: soybean, wheat, corn, potato, cotton, rice, oilseed rape, sunflower, 
alfalfa, sugarcane, turf, banana, blackberry, blueberry, strawberry, raspberry, 
cantaloupe, carrot, cauliflower, coffee, cucumber, eggplant, grapes, honeydew, 
lettuce, mango, melon, onion, papaya, peas, peppers, pineapple, pumpkin, spinach, 
squash, sweet corn, tobacco, tomato, watermelon, mint and other labiates, rosaceous 
fruits, and vegetable brassicas. 

44. The transgenic plant of claim 1 wherein the encoded polypeptide is expressed 
and regulates transcription of a gene.

45. A method of using the transgenic plant of claim 1 to grow a progeny plant
from a parent plant, the method comprising crossing the transgenic plant with another 
plant, selecting seed, and growing the progeny plant from the seed. 

46. An isolated or recombinant polynucleotide comprising a nucleotide sequence 
selected from the group consisting of: 

(a) a nucleotide sequence encoding a polypeptide comprising a sequence selected 
from SEQ ID NOs: 240, 274, 558, 286, 698, and 832, or a complementary nucleotide 
sequence thereof; 

(b) a nucleotide sequence of SEQ ID NOs: 239, 273, 557, 285, 697, 831, or a
complementary nucleotide sequence thereof; and 

(c) a nucleotide sequence that hybridizes under stringent conditions to a nucleotide 
sequence of one or more of: (a) or (b). 

47. The isolated or recombinant polynucleotide of claim 46, wherein the 
recombinant polynucleotide comprises a nucleotide sequence encoding a polypeptide 
of SEQ ID NO:240.

48. The isolated or recombinant polynucleotide of claim 46, wherein the
recombinant polynucleotide comprises a nucleotide sequence encoding a polypeptide
of SEQ ID NO:274.

49. The isolated or recombinant polynucleotide of claim 46, wherein the
recombinant polynucleotide comprises a nucleotide sequence encoding a polypeptide
of SEQ ID NO:558.

50. The isolated or recombinant polynucleotide of claim 46, wherein the
recombinant polynucleotide comprises a nucleotide sequence encoding a polypeptide
of SEQ ID NO:286.

51. The isolated or recombinant polynucleotide of claim 46, wherein the
recombinant polynucleotide comprises a nucleotide sequence encoding a polypeptide
of SEQ ID NO:698.

52. The isolated or recombinant polynucleotide of claim 46, wherein the
recombinant polynucleotide comprises a nucleotide sequence encoding a polypeptide
of SEQ ID NO:832.

53. The isolated or recombinant polynucleotide of claim 46, wherein the 
recombinant polynucleotide comprises a nucleotide sequence of SEQ ID NO:239. 

54. The isolated or recombinant polynucleotide of claim 46, wherein the 
recombinant polynucleotide comprises a nucleotide sequence of SEQ ID NO:273. 

55. The isolated or recombinant polynucleotide of claim 46, wherein the 
recombinant polynucleotide comprises a nucleotide sequence of SEQ ID NO:557. 

56. The isolated or recombinant polynucleotide of claim 46, wherein the 
recombinant polynucleotide comprises a nucleotide sequence of SEQ ID NO:285. 

57. The isolated or recombinant polynucleotide of claim 46, wherein the 
recombinant polynucleotide comprises a nucleotide sequence of SEQ ID NO:697. 

58. The isolated or recombinant polynucleotide of claim 46, wherein the 
recombinant polynucleotide comprises a nucleotide sequence of SEQ ID NO:831. 

59. The isolated or recombinant polynucleotide of claim 46, further comprising a 
constitutive, inducible, or tissue-specific promoter operably linked to the nucleotide 
sequence. 

60. The isolated or recombinant polynucleotide of claim 46 wherein the encoded 
polypeptide is expressed and regulates transcription of a gene. 

61. A vector comprising the isolated or recombinant polynucleotide of claim 46.

62. A host cell comprising the vector of claim 61.

63. A method of using the isolated or recombinant polynucleotide of claim 46 for
producing a plant having a modified trait, the method comprising selecting a 
polynucleotide that encodes a polypeptide, inserting the polynucleotide into an 
expression vector, introducing the vector into a plant or a cell of a plant to 
overexpress the polypeptide, thereby producing a modified plant, and selecting a 
modified plant for a modified trait. 

64. The method of claim 63 wherein the plant possesses a modified trait as 
compared to another plant wherein the trait is an alteration in one or more physical 
characteristics selected from the group consisting of: the number of trichomes, fruit 
and seed size and number, yield of stems, leaves, inflorescences, or roots, stability of 
seeds during storage, susceptibility of the seed to shattering, root hair length and 
quantity, internode distances, or the quality of seed coat. 

65. The method of claim 63 wherein the plant possesses a modified trait as compared
to another plant wherein the trait is an alteration in a plant growth characteristic 
selected from the group consisting of: growth rate, germination rate of seeds, vigor of 
plants and seedlings, leaf and flower senescence, male sterility, apomixis, flowering 
time, flower abscission, rate of nitrogen uptake, osmotic sensitivity to soluble sugar 
concentrations, biomass or transpiration characteristics, apical dominance, branching 
patterns, number of organs, organ identity, and organ shape or size. 

66. The method of claim 63 wherein the plant possesses a modified trait as 
compared to another plant wherein the trait is an alteration in one or more 
characteristics selected from the group consisting of protein or oil production, seed 
protein or oil production, insoluble sugar level, soluble sugar level, and starch 
composition. 

67. A modified plant produced by the method of claim 63. 

68. A method of using the plant of claim 67 to grow a progeny plant from a parent 
plant, the method comprising crossing the transgenic plant with another plant, 
selecting seed, and growing the progeny plant from the seed.

69. The plant produced by the method of claim 68.

SEQUENCE LISTING

<110> Mendel Biotechnology, Inc. 
Ratcliffe, Oliver 
Riechmann, Jose Luis 
Adam, Luc J. 
Dubell, Arnold T. 
Heard, Jacqueline E. 
Pilgrim, Marsha L.
Jiang, Cai-Zhong 
Reuber, T. Lynne 
Creelman, Robert A. 
Pineda, Omaira 
Yu, Guo-Liang
Broun, Pierre E. 

<120> YIELD-RELATED POLYNUCLEOTIDES AND
POLYPEPTIDES IN PLANTS 

<130> 514442002041 

<150> 60/310,847 
<151> 2001-08-09 

<150> 60/336,049 
<151> 2001-11-19 

<150> 60/338,692 
<151> 2001-12-11 

<150> 10/171,468 
<151> 2002-06-14 

>G1275 (58..579)

CCAAGAAAAGGGAAGATCACGCATTCTTATAGGCGTAATTCGTAAATAGTGGTGAGTATG 

AATGATGCAGACACAAACTTGGGGAGTAGTTTCAGCGATGATACTCACTCTGTGTTCGAG 

TTTCCGGAGCTAGACTTGTCAGATGAATGGATGGATGATGATCTTGTGTCTGCGGTTTCC 

GGGATGAATCAGTCTTATGGTTATCAGACTAGTGATGTTGCTGGTGCTTTATTCTCAGGT 

TCTTCTAGCTGTTTCAGTCATCCTGAATCTCCAAGTACCAAAACTTATGTTGCTGCTACA 

GCCACTGCTTCTGCCGACAACCAAAACAAGAAAGAAAAGAAAAAAATTAAA 

GCGTTCAAGACACGGTCCGAGGTGGAAGTGCTTGACGACGGGTTCAAGTGGAGAAAGTAT 

GGGAAGAAGATGGTGAAGAACAGCCCACATCCAAGAAACTACTACAAATGTTCAGTTGAT 

GGCTGTCCCGTGAAGAAAAGGGTTGAACGAGACAGAGATGATCCGAGCTTTGTGATAACA 

ACTTACGAGGGTTCCCACAATGACTCAAGCATGAACTAAG 

CGACCATGCTATATTCAGCACATCTTATTTTCTATGGTTACGAACGATACTTAAAACTGC 
TTCTAGTTCTTTATATCCATTGTAAACTGGTTGCAGGTTCACAAATTTTGAGAGGTTTA^ 
GACATTCTAAATCTGTAGTACTTATATA 

>G1275 Amino Acid Sequence (domain in AA coordinates: 113-169) 
MNDADTNLGSSFSDDTHSVFEFPELDLSDEWMDDDLVSAVSGMNQSYGYQTSDVAGALFS
GSSSCFSHPESPSTKTYVAATATASADNQNKKEKKKIKGRVAFKTRSEVEVLDDGFKWRK
YGKKMVKNSPHPRNYYKCSVDGCPVKKRVERDRDDPSFVITTYEGSHNDSSMN*
>G1411 (110..856)

TAT^GAAAAACTGAACAACCCTAAAGTACTGTATAAATCCTATATCAAATTTTTTTTTTG 
GAAGAAAAGGCTATATTTAAAAGAAAATCAAGCAAAAGTAGATCCTCGGATGTATGGGAA 
GAGGCCTTTTGGAGGTGATGAATCTGAAGAAAGGGAAGAAGATGAGAACTTGTTCCCGGT 
CTTCTCGGCCCGATCTCAACACGACATGCGTGTTATGGTCTCGGCCTTGACTCAAGTAAT 
CGGAAACCAACAAAGCAAATCTCATGATAACATCAGCTCTATTGATGATAACTATCCTTC
TGTGTATAATCCACAAGACCCTAATCAACAAGTTGCGCCTACTCATCAAGACCAAGGGGA 
CTTGAGGAGGAGACATTATAGAGGTGTAAGGCAAAGGCCATGGGGAAAGTGGGCAGCTGA 
AATCCGAGACCCAAAAAAGGCGGCACGTGTGTGGCTCGGGACATTTGAAACCGCTGAATC 
TGCGGCCTTAGCTTATGATGAAGCAGCCCTAAAGTTCAAAGGAAGCAAAGCAAAACTCAA 
TTTCCCGGAGAGGGTTCAGCTTGGAAGTAACTCTACATATTACTCCTCCAACCAAATTCC 
ACAAATGGAACCACAAAGTATACCGAACTATAATCAATACTATCATGATGCGAGTAGTGG 
TGATATGCTAAGTTTTAATTTGGGCGGTGGGTATGGGAGTGGTACCGGATATTCAATGTC 
TCATGATAATAGTACTACGACTGCTGCTACAACTTCTTCGTCTTCTGGTGGCTCTTCTAG 
GCAACAAGAAGAGCAAGATTATGCCAGATTCTGGCGCTTTGGGGATTCTTCTTCCTCTCC 
TCATTCGGGATATTAATTAGGAGATTTGATCAGTTACTTGTGATGAAGTAATGATACATT 
TCCCGTCAAAATTGAGATGATCATATGCTTCCTGAATGTTTTTGAGTGTCATTTTTGTCT 

AAAAAAAAAAAAAA 

>G1411 Amino Acid Sequence (domain in AA coordinates: 87-154) 

MYGKRPFGGDESEEREEDENLFPVFSARSQHDMRVMVSALTQVIGNQQSKSHDNISSIDD 

NYPSVYNPQDPNQQVAPTHQDQGDLRRRHYRGVRQRPWGKWAAEIRDPKKAARVWLGTFE

TAESAALAYDEAALKFKGSKAKLNFPERVQLGSNSTYYSSNQIPQMEPQSIPNYNQYYHD

ASSGDMLSFNLGGGYGSGTGYSMSHDNSTTTAATTSSSSGGSSRQQEEQDYARFWRFGDS

SSSPHSGY* 
>G1488 (1..996) 

ATGGAAGATGAAGCACATGAATTCTTCCACACATCTGATTTTGCCGTTGATGACCTTTTA 
GTTGATTTCTCTAACGATGATGACGAAGAAAACGATGTTGTTGCTGATTCCACCACTACC 
ACCACCATAACCGACAGCTCTAACTTCTCCGCTGCTGATCTTCCCAGTTTCCACGGTGAT 
GTTCAAGACGGCACTAGCTTCTCCGGTGACCTTTGTATACCTTCTGATGATTTGGCTGAT 
GAGTTAGAGTGGCTTTCGAACATTGTGGATGAATCATTGTCGCCTGAAGATGTACACAAG
CTCGAGCTAATATCCGGTTTTAAGAGTCGACCGGACCCGAAATCCGATACCGGAAGCCCG 
GAAAACCCGAATAGCAGCAGTCCGATTTTTACTACCGACGTTTCTGTACCGGCCAAAGCT 
AGAAGCAAACGCTCACGCGCCGCTGCGTGTAATTGGGCCTCACGTGGGCTTCTCAAGGAA 
ACGTTTTACGACAGTCCTTTCACCGGAGAAACCATTCTCTCTAGCCAACAACACTTGTCT
CCGCCAACCTCGCCGCCTTTGTTGATGGCTCCGCTAGGGAAAAAGCAAGCCGTTGATGGA 
GGACACCGACGGAAGAAGGATGTTTCTTCACCGGAGTCTGGTGGCGCAGAGGAGAGACGG 
TGTCTCCACTGCGCCACGGATAAGACTCCGCAATGGCGGACAGGCCCAATGGGCCCGAAG 
ACGTTGTGCAACGCTTGCGGTGTTAGGTACAAATCGGGACGTTTAGTGCCGGAGTATCGG 
CCCGCGGCGAGTCCGACGTTTGTGCTGGCGAAACACTCAAATTCTCATCGGAAAGTTATG 
GAGCTCCGGCGACAGAAGGAGATGAGTAGGGCCCATCATGAGTTCATACATCACCATCAC 
GGTACGGACACTGCCATGATTTTCGACGTTTCATCGGACGGTGATGATTACTTGATCCAC 
CACAACGTTGGCCCAGATTTCAGACAGCTTATTTGA 

>G1488 Amino Acid Sequence (domain in AA coordinates: 221-246) 

MEDEAHEFFHTSDFAVDDLLVDFSNDDDEENDVVADSTTTTTITDSSNFSAADLPSFHGD

VQDGTSFSGDLCIPSDDLADELEWLSNIVDESLSPEDVHKLELISGFKSRPDPKSDTGSP

ENPNSSSPIFTTDVSVPAKARSKRSRAAACNWASRGLLKETFYDSPFTGETILSSQQHLS

PPTSPPLLMAPLGKKQAVDGGHRRKKDVSSPESGGAEERRCLHCATDKTPQWRTGPMGPK

TLCNACGVRYKSGRLVPEYRPAASPTFVLAKHSNSHRKVMELRRQKEMSRAHHEFIHHHH

GTDTAMIFDVSSDGDDYLIHHNVGPDFRQLI* 

>G1499 (159..833)

TCGACTCCITAATTGCATCACCAACCTAA 

CCTTTAATATATATATATGCTCACACACACACATATATATATACATATAAGCATCGCCTC 
AAGCATTAAAATTTTTACGAACCAAACAAACAAAAATTATGAATAATTATAATATGAACC 
CATCTCTCTTCCAAAATTACACTTGGAACAACATCATCAACAGCAGCAACAACAACAACA 
AGAATGATGATCATCATCATCAACATAATAATGATCCAATCGGTATGGCCATGGACCAGT 
ACACACAGCTCCATATCTTCAATCCTTTCTCTTCT 

CCCTCACAACCACCACTCTTCTCTCCGGAGATCAAGAAGACGACGAAGACGAAGAAGAAC 
CTCTAGAGGAACTCGGTGCTATGAAGGAAATGATGTACAAGATCGCAGCCATGCAATCGG 
TTGACATCGACCCAGCAACCGTCAAGAAACCCAAACGCCGTAACGTGAGGATCTCCGACG 
ACCCTCAGAGTGTGGCGGCTAGACATCGCCGTGAGAGAATCAGTGAGAGGATCAGAATTC 
TTCAGAGACTCGTGCCAGGTGGCACTAAAATGGATACGGCTTCAATGCTCGATGAAGCTA 
TACGCTATGTCAAGTTCTTGAAACGGCAGATCCGGCTACTCAATAATAATACCGGATATA

CTCCTCCGCCGCCGCAAGATCAAGCTTCTCAGGCGGTGACGACGTCATGGGTTTCACCGC 

CACCACCGCCAAGTTTCGGCCGTGGGGGAAGAGGAGTAGGAGAATTAATCTAGACAAGAT 

GACATTTCCATTAGTAGTAACTAAATTATGCTATAATGTGTGAGTAATGGTGCAATTATG 
GA 

>G1499 Amino Acid Sequence (domain in AA coordinates: 118-181)
MNNYNMNPSLFQNYTWim^ 

HFPPLSSSLTTTTLLSGDQEDDEDEEEPLEELGAMKEMMYKIAAMQSVDIDPATVKKPKR 
RNVRISDDPQSVAARHRRERISERIRILQRLVPGGTKMDTASMLDEAIRYVKFLKRQIRL
LNNNTGYTPPPPQDQASQAVTTSWVSPPPPPSFGRGGRGVGELI* 
>G1543 (1..828) 

ATGATAAAACTACTATTTACGTACATATGCACATACACATATAAACTATATGCTCTATAT 

CATATGGATTACGCATGCGTGTGTATGTATAAATATAAAGGCATCGTCACGCTTCAAGTT 
TG TCTCTTTT ATATT A A A PTH Ann nTTTTfPTP'Pr. a »»nmmm. ^™ 



>G1543 Amino Acid Sequence (domain in AA coordinates: 135-195)
MIKLLFTYICTYTYKLYALYHMDYACVCMYKYKGIVTLQVCLFYIKLRVFLSNFTFSSSI 
LALKNPNNSLIKIMAILPENSSNLDLTISVPGFSSSPLSDEGSGGGRDQLRLDMNRLPSS 
EDGDDEEFSHDDGSAPPRKKLRLTREQSRLLEDSFRQNHTLNPKQKEVLAKHLMLRPRQI
EVWFQNRRARSKLKQTEMECEYLKRWFGSLTEENHRLHREVEELRAIKVGPTTVNSASSL
TMCPRCERVTPAASPSRAWPVPAKKTFPPQERDR* 

^/~»-i«r-»r-/-» i — » \ 



>G1635 (1..1164)

ATGGCGTCGTCTCCGTTGACTGCAAATGTTCAGGGTACCAACGCTTCTTTGAGGAATAGA 

GATGAAGAAACTGCAGACAAGCAGATACAATTCAATGACCAAAGTTTTGGGGGAAATGAC 
TATGCACCCAAGGTACGGAAGCCATAPArnaTB sraM^An^,^^, „, 



i.^AGLACAAGAAGTTTGTTGAAGCCTTGAAATTATACGGGCGAGCTTGGAGACGAATA 
GAAGAACATGTGGGCTCAAAGACCGCAGTTCAGATTCGAAGCCATGCTCAGAAGTTTTTC 
TCTAAGGTTGCTCGAGAAGCAACTGGAGGTGATGGGAGCTCAGTAGAGCCGATTGTAATA 
CCTCCTCCTCGTCCCAAGAGAAAGCCAGCGCATCCGTACCCTCGTAAGTTTGGGAACGAG 
GC^GATCAAACARGTAGATCGGTTTCTCCCTCAGAACGTGATACTCAATCTCCAACCTCT 
GTGTTGTCCACTGTTGGATCAGAAGCATTGTGTTCCCTTGATTCGAGTTCACCCAS.TCGA 

n£™^ GCTTGAGACTCTG ^ G ^ 

GAGAGCTCGATCAA.GGAACCAACGAAGCAAAGTCTTAAACTCTTTGGGAAGACAG 

GTATCTGATTCARRraTr!TPr"rrTT/-"rr^r.^ * ™ » 



v-«rtuu^^i_v_Artuu/\AULAAA.GTCTTAAACTCTTTGGGAAGACAGTTTTG 

^ A ^ GACT ^ GGCATGTCCTC " CTCTM ^ CTTCAA CATATTGTAAATCCCCAATT 

CAGCCATTACCACGGAAACTCTCATCATCCAAGACACTACCCATAATAAGAAACTCACAA 

GAAGAACTCTTGAGCTGCTGGATACAAGTCCCTCTTAAGCAAGAAGATGTGGAAAATAGA 

TGTTTGGATTCAGGAAAGGCTGTCCAAAACGAAGGATCATCGACTGGATCAAACACTGGT 

TCGGTGGATGATACGGGACACACGGAAAAGACCACAGAACCCGAAACAATGCTATGTCAA 
TGGGAGTTTAAACCAAGTGAGAGGTCTf?raTT»T»rr''TV57>r><-«m/-.»<-.T.-ii^» 



TCAAATTCAAGAGGATTTGGTCCATACAAGAAGAGAAAGATGGTAACAGAAGAAGAAGAG 
CATGAGATTCATCTCCACTTATAA 

>G1635 Amino Acid Sequence (domain in AA coordinates: 44-104) 

MASSPLTANVQGTNASLRNRDEETADKQIQFNDQSFGGNDYAPKVRKPYTITKERERWTD 

EEHKKFVEALKLYGRAWRRIEEHVGSKTAVQIRSHAQKFFSKVAREATGGDGSSVEPIVT 

PPPRPKRKPAHPYPRKFGNEADQTSRSVSPSERDTQSPTSVLSTVGSEALCSLDSSSPNR

SLSPVSSASPPAAiTTTANAPEELETLKLELFPSERLLNRESSIKEPTKQSLKLFGKTVL 
SNSRGFGPYKKRKMVTEEEEHEIHLHL*
TCTTTCTTTCTTCCTCTTTGTCTCTG 

T^OTTTCAGAGGAGGAGGATTTAGAAGGAGGGTTTTGTATGTGTGTCTTAAAAGTGGCA 

^Sgaagataacgttggcaaaaaagccgagtctattagagacgatgatcatcggacg 

tStCTGAA^^ 

ScScc^Scagcagccgcctccatcgtcgtcgtcctcatctcttatctcaggtttc 

SSSgSgcagaaaaggaggagagaggtggaggaaggtggcgccaaagcgg^ 
gcagSaata^ 

aSgSSttcgagtaacatgtcaggtccgggcccaacatacgagtatacaactacggca 
™gacaaagScatggggaaagtgggcggctgagattcgagatccatttaaagcagct 

gcSSgg^ 

tcKggaactcgggttcaacgactacccttttgcccataagacctgcttcgaatcaaagc 
g^StcgSgcc^ 

caIcScagS^cagcaacatcatcaacaatctttgga™ 
ccgttccgtotcggtcacactggaggttcaatgatgcaatctacgtcgtcatcatcatct 

cattctcgtc^tctgtottccccggctgctgttcagccgccaccagaatcagctagcgaa 

agtcgatcctcctgatgacttgcttcattttatttgtttcactatagagtaatagaaa^ 
SaaSg^ 

g^ttg^tctgctttcatcctctcatgctttttttct^ 

°™^^™taacaaacattaaaaagaccacatggagaaaggaaaaaaaagag 

AG 

>G1794 Amino Acid Sequence (domain in AA coordinates: TBD)
SSL1SGFSREMEMSAIVSALTHWAGNVPQHQQGGGEGSGEGTSNSSSSSGQKRRREVEE 

ggaSvSLtltvdqyfsggsstskvreassnmsgpgptyeytttatassetssfsgdq 

P^WPASTEAQPVHQTAAQRPTQSRNSGSTT^ 

YSEMARQQQQFQQHHQQSLDLYDQMSFPLRFGHTGGSMMQSTSSSSSHSRPLFSPAAVQP 
PPESASETGYLQDIQWPSDKTSNNYNNSPSS* 

ATCACAGTTAIGTTTCCAT^ 

ACACCATTTGCAGGAAAAAATGAATAGTTGTCAGTCTAATCCCACCAAAATGGATAATTC 
AGAAAATGTTCTATTTAATGATCAAAACGAAAATTTCACAC^ 

TTCTTCGTACTTGACAAGAGATCAAGAGCACGAGATCATGGTCTCTGCTCTGCGACAAGT 
GATATCTAACTCCGGAGCTGACGACGCGTCATCATCAAACTTGATCATCACAAGCGTTCC 

gcctccagacgctggcccttgtcctctctgtggcgtcgccggttgctacggctgcacatt 

ACAACGGCCGCACCGAGAGGTAAAGAAGGAGAAGAAATACAAAGGAGTAAGGAAAAAACC 

ATCGGGTAAATGGGeGGCGGAGATATGGGATCCGAGATCGAAATCAAGGAGGTGGCTTGG 

AACGTTTCTTACGGCGGAGATGGCGGCACAATCTTACAATGATGCGGCGGCTGAGTATC 

AGCAAGACGTGGTAAAACAAACGGAGAAGGAATTAAACGGCGGTGGAGATCACTGAGAAG 

GACATGGTCGGTGATCATACACGGCGAGGTGGAAATGTTATATTTACTATTGAAAACTA^ 

ATTATTTATTATAGAGGGAGATATTACTCTTTACGCTTTCATTAAGATTTATTTTTATAA 

GTTTTAAAGTATTTTATTGTTATAAAAAAAAAAAAAAAAAAAAAA 

>G1839 Amino Acid Sequence (domain in AA coordinates: TBD) 

MLTPFCSSHHLQEKMNSCQSNPTKMDNSENVLF^^ 

MVSALRQVISNSGADDASSSNLIITSVPPPDAGPCPLCGVAGCYGCTLQRPHREVKKEKK
YKGVRKKPSGKWAAEIWDPRSKSRRWLGTFLTAEMAAQSYNDAAAEYRARRGKTNGEGIK 
RRWR*

>G2108 (35..694)

GAGAGAGAAACATTGATCTCTGAATATTGTGAACATGTTGAAATCAAGTAACAAGAGAAA 
AAGCAAAGAAGAGAAGAAGTTACAAGAAGGGAAGTACCTTGGAGTGAGGAGACGTCCATG 
GGGAAGATATGCAGCTGAAATCAGAAACCCTTTTACTAAAGAAAGACATTGGCTTGGAAC 
GTTTGATACAGCCGAAGAAGCTGCTTTTGCATATGACGTTGCTGCTCGATCCATCAGCGG 
CTCTCTAGCTACAACAAACTTCTTCTACACTGAAAACACCTCTTTAGAAAGACATCCACA 
ACAGTCTTTGGAGCCTCATATGACTTGGGGATCTTCTAGTCTCTGTCTTCTTCAAGATCA 
GCCTTTTGAAAACAACCATTTTGTTGCTGATCCTATCTCTTCTTCTTTTTCTCAAAAACA 
AGAGTCTTCTACCAATCTCACTAACACTTTCTCACATTGTTATAATGATGGTGATCATGT 
TGGCCAAAGCAAAGAGATTTCTTTACCTAATGATATGTCAAACAGTTTATTCGGTCATCA 
GGACAAAGTCGGTGAACATGACAATGCAGACCATATGAAGTTTGGCTCAGTTCTCAGCGA 
CGAACCTCTCTGCTTTGAGTATGACTACATTGGGAATTATCTTCAGAGTTTTCTCAAAGA 
TGTCAACGACGATGCTCCACAGTTTCTTATGTGAGCTTGTATTACCGATCCTTCAATTTA 

TG 

>G2108 Amino Acid Sequence (domain in AA coordinates: 18-85) 

MLKSSNKRKSKEEKKLQEGKYLGVRRRPWGRYAAEIRNPFTKERHWLGTFDTAEEAAFAY

DVAARSISGSLATTNFFYTENTSLERHPQQSLEPHMTWGSSSLCLLQDQPFENNHFVADP 

ISSSFSQKQESSTNLTNTFSHCYNDGDHVGQSKEISLPNDMSNSLFGHQDKVGEHDNADH 

MKFGSVLSDEPLCFEYDYIGNYLQSFLKDVNDDAPQFLM* 

>G2291 (27..797)

GCTTTCTCACCTTTATAAAATAGAAAATGGAAAACAGCTACACCGTTGATGGTCACCGTC 
TTCAATATTCCGTTCCGTTAAGCTCCATGCATGAAACCAGTCAAAACTCCGAAACTTACG 
GATTATCCAAAGAGTCGCCGTTGGTCTGCATGCCCTTGTTCGAAACCAACACTACTTCAT
TCGATATCTCTTCTCTTTTCTCGTTTAACCCAAAACCAGAACCCGAAAACACGCATCGTG 
TCATGGACGATTCCATCGCCGCCGTCGTGGGCGAAAACGTTCTTTTCGGTGATAAAAACA 
AAGTCTCTGATCACTTGACCAAAGAAGGTGGTGTGAAGCGGGGGCGGAAGATGCCGCAGA 
AGACCGGAGGATTCATGGGAGTGAGAAAACGGCCGTGGGGGAGATGGTCGGCGGAGATAA 
GAGACAGGATAGGGCGGTGCAGACACTGGTTAGGAACGTTCGACACGGCGGAAGAGGCAG 
CGCGTGCGTATGACGCGGCGGCGAGGAGGCTTAGAGGGACCAAAGCCAAGACCAATTTCG 
TGATTCCTCCGCTTTTTCCCAAGGAAATAGCTCAGGCTCAGGAGGATAATAGGATGAGGC 
AGAAGCAGAAGAAGAAGAAGAAGAAAAAAGTGAGTGTGAGGAAGTGTGTTAAAGTCACAT 
CGGTTGCACAGTTGTTCGATGATGCCAATTTTATAAATTCTTCTAGTATTAAAGGAAATG 
TGATTAGTTCTATTGATAATCTTGAAAAAATGGGTCTAGAGCTTGATTTGAGTTTAGGGT 
TGTTGTCTAGGAAGTGATAAAGCACTCGTAGTTAAGTAGTTGTAGTT 

>G2291 Amino Acid Sequence (domain in AA coordinates: TBD) 
MENSYTVDGHRLQYSVPLSSMHETSQNSETYGLSKESPLVCMPLFETNTTSFDISSLFSF 
NPKPEPENTHRVMDDSIAAVVGENVLFGDKNKVSDHLTKEGGVKRGRKMPQKTGGFMGVR
KRPWGRWSAEIRDRIGRCRHWLGTFDTAEEAARAYDAAARRLRGTKAKTNFVIPPLFPKE

IAQAQEDNRMRQKQKKKKKKKVSVRKCVKVTSVAQLFDDANFINSSSIKGNVISSIDNLE
KMGLELDLSLGLLSRK* 
>G2452 (1..804) 

ATGTCATCGTCGACGATGTACAGAGGAGTTAATATGTTTTCACCGGCAAACACAAACTGG 
ATTTTTCAAGAAGTCAGAGAAGCCACGTGGACGGCGGAGGAGAACAAACGGTTCGAGAAA 
GCTCTCGCTTATCTGGACGACAAAGACAATCTTGAGAGCTGGTCCAAGATCGCAGATTTG 
ATTCCCGGCAAAACAGTAGCTGACGTCATTAAACGATACAAGGAGCTAGAGGATGATGTC 
AGCGACATCGAAGCCGGACTTATCCCCATTCCGGGATACGGCGGCGACGCCTCCTCCGCT 

GCAAACAGTGACTATTTCTTTGGTCT^ 

GGAGGAAAGAGGAGTTCGCCGGCGATGACTGATTGTTTTAGGTCTCCGATGCCGGAAAAG 
GAGAGGAAGAAAGGAGTTCCGTGGACCGAGGACGAACACCTACGATTTCTGATGGGTTTG 
AAGAAATATGGAAAAGGAGATTGGAGAAACATAGCAAAAAGCTTTGTGACGACTCGAACG 
CCGACGCAAGTCGCTTCACACGCTCAGAAATATTTTCTTCGACAACTCACAGATGGTAAA 
GACAAAAGACGATCAAGTATTCACGATATCACCACTGTTAACATCCCTGACGCAGACGCA 
TCCGCAACCGCCACGACCGCTGACGTAGCACTCTCTCCTACTCCAGCCAATTCTTTTGAC 
GTTTTCCTTCAGCCAAATCCTCATTACAGTTTCGCGTCTGCGTCTGCGTCTAGCTATTAT 

AATGCGTTTCCGCAGTGGAGTTAA 

>G2452 Amino Acid Sequence (conserved domain in AA coordinates: 27-213)
MSSSTMYRGVNMFSPANTNWIFQEVREATWTAEENKRFEKALAYLDDKDNLESWSKIADL
IPGKTVADVIKRYKELEDDVSDIEAGLIPIPGYGGDASSAANSDYFFGLENSSYGYDYVV

GGKRSSPAMTDCFRSPMPEKERKKGVPWTEDEHLRFLMGLKKYGKGDWRNIAKSFVTTRT
PTQVASHAQKYFLRQLTDGKDKRRSSIHDITTVNIPDADASATATTADVALSPTPANSFD

VFLQPNPHYSFASASASSYYNAFPQWS*

mata^ccctctScattctccttcttcgtcttttctttgtttctcatattc^ 
ScStccaaatcttaaaccctaaa™ 

aggtwaaagattttagcaaagatggcgaattcaggaaattatggaaagaggccctttc^ 
agS^tcggatgaaaagaaagaagccgatgatgatgagaacatattccctttctt 

CTCTGCCC^ATCCCAATATGACATGCGTGCCATGGTCTCAGCCTTGACTCAAGTCATTGG 

aSSaaagcagctctcatgataataaccaacatcaacctgttgtgtataatcaacaaga 

TCCTAACCCACCGGCTCCTCCAACTCAAGATCAAGGGCTATTGAGGAAGAGGCACTATAG 
AGGGGTAAGACAACGACCATGGGGAAAGTGGGCAGCTGAAATTCGGGATCCGCAAAAGGC 
A^CACGGGTGTGGCTCGGGACATTTGAGACTGCTGAAGCTGCGGCTTTAGCTTATGATAA 

cgcagctcSaagttcaaaggaagcaaagccaaactc^tttccc^ 

AGCAAGTAACACTAGTACAACTACCGGTCCACCAAACTATTATTCTTCTAATAATCAAAT 
^ACTACTCAAATCCGCAGACTAATCCGCAAACCATACCTTATTTTAACCAATACTACTA 

Sccaatatcttcatcaaggggggaatagtaacgatgcattaagttatagcttggccgg 

TG^AGA^ACCGGAGGCTCAATGTATAATCATCAGACGTTATCTACTACAAATTCTTCATC 

OTCTGCTGGATCTTCAAGGCAACAAGATGATGAACAAGATTACGCCAGATATTTGCGTTT 

TGGGGATTCTTCACCTCCTAATTCTGGTTTTTGAGATCTTCAATAAACTGATAATAAAGG 

ATTTGGGTCACTTGTTATGAGGGGATCATATGTTTTCTAA _ 

>G2509 Amino Acid Sequence (domain in aa coordinates: 89-156) 

mansgnyg^pfrgdesdekkeadddenifpffsarsqydmramvsaltqvignqssshd 
Sop™qqdpnppapptqdqgllrkrhyrgvrqrpwgkwaaeirdpqkaarvwlgt 

SAEAA^SNAALKFKGsLo.NFPERAQLASNTSTrrGPPNYYSSNNQIYYSNPQT 
NPQ^PYFNQYYYNQYiHQGGNSNDALSYSLAGGETGGSMYiraQTLSTTNSSSSGGSSRQ 

QDDEQDYARYLRFGDSSPPNSGF* 

ATGATGGCTCATCACTCCATGGACGATAGAGACTCTCCTGATAAAGGATTTGATTCCGGC 

AAGTACGTTAGATACACGCCGGAACAAGTTGAAGCTCTTGAGAGAGTTTATGCTGAGTGT 

rcTAAACCTAGCTCTCTGAGAAGACAACAGCTTATTCGTGAATGTCCCATTCTCTGTAAC 

ATCGAGCCTCGACAGATCAAAGTTTGGTTCCAGAATCGCAGATGTCGAGAGAAGCAGAGG 

AAAGAGTCAGCTCGTCTTCAGACAGTGAACAGGAAGCTGAGTGCTATGAACAAGCTTTTG 

ATGGAAGAGAATGATCGTTTGCAGAAGCAAGTCTCCAACTTGGTTTATGAGAATGGATTC 

ATGAAACATCGAATCCACACTGCTTCTGGGACGACCACAGACAACAGCTGTGAGTCTGTG 

GTCGTGAGTGGTCAGC^^CGTCAGCAGC^AAACCCAACACATCAGCATCCTCAGCGTGAT 

GTTAACAACCCAGCTAATCTTCrrCTCGATTGCGGAGGAGACCTTGGCGGAGTTCCTTTGC 

AAGGCTACAGGAACTGCTGTCGACTGGGTCCAGATGATTGGGATGAAGCCTGGTCCGGAT 

^S^MC^CTGTTTC^CGC^CTGCAGTGGAATAGCAGCACGTGCCTGTGGC 

rTCGTGAGTTTAGAACCCATGAAGGTCGCTGAAATCCTCAAAGATCGTCCATCTTGGTTC 

CGTGACTGTO^TGTGTCGAGACTCTGAATGrrATACCCACTGGAAATGGTGGTACTATC 

GAGCTTGTCAACACTCAGATTTATGCTCCTACAACATTAGCAGCAGCTCGTGACTTTTGG 

ACGCTGAGATATAGTACAAGTCTAGAAGATGGAAGCTATGTGGTCTGTGAGAGATCACTC 

ACTTCTGCAACTGGTGGCCCCAATGGTCCACTTTCTTCAAGCTTCGTGAGAGCCAAAATG 

CTGTCAAGCGGGTTTCTTATCCGTCCTTGTGATGGTGGTGGTTCCATTATTCACATCGTT 

GATCATGTGGACTXGGATGTCTCAAGTGTTCCTGAAGTCCTCAGGCCTCTTTATGAGTCT 

TCCAAAATCCTTGCTCAAAAAATGACTGTCGCTGCTCTGAGACATGTGCGCCAAATTGCT 

CAAGAGACTAGTGGAGAAGTCCAGTATAGTGGTGGACGCCAGCCTGCAGTTTTAAGGACT 

TTCAGCCAGAGACTCTGCCGGGGTTTCAATGATGCTGTAAATGGTTTTGTCGATGATGGA 

TCGTCTCCTiATGAGTAGTGATGGAGGAGAGGATATTACGATCATGATTAACTCTTCCTCT 

GCTAA^TTTCCTOTCTCCCAATACGGTAGCTCATTTCTTCCAAGTTTTGGAAGTGGTCT 

CTCTGTGCCAAAGCTTCTATGCTGTTGCAGAATGTTCCACCCCTTGTATTGATrCGGTTC 

CTGAGAGAACACCGAGCTGAATGGGCAGACTATGGTGTCGATGCCTATTCTGCTGCATCT 

GTCATOCTTCCTCTCGCACAGACACTCGAACATGAAGAGTTTCTCGAAGTGGTTAGACTT 
GGAGGTCATGCTTACTCACCTGAAGACATGGGCTTATCCCGGGATATGTATTTACTGCAG

CTTTGTAGCGGCGTTGATGAAAATGTGGTTGGAGGTTGTGCTCAGCTTGTCTTTGCCCCA 

ATCGATGAATCATTTGCTGATGATGCACCTTTGCTTCCTTCTGGTTTCCGTGTCATACCA 

CTCGACCAAAAAACAAATCCGAATGATCATCAATCTGCAAGTCGAACACGGGATCTAGCA 

TCGTCCCTAGATGGTTCCACCAAAACCGATTCGGAAAC7U^CTCTAGATTGGTCTTAACA 

ATAGCCTTCCAGTTCACGTTTGATAACCATTCCAGAGACAATGTTGCTACAATGGCGAGA 

CAGTATGTGAGGAACGTTGTTGGTTCGATTCAGAGAGTGGCTCTAGCCATTACGCCTCGT 

CCTGGCTCAATGCAACTTCCCACTTCCCCTGAAGCTCTCACTCTTGTCCGTTGGATCACC 

CGTAGTTACAGTATTCATACAGGTGCAGATCTGTTTGGAGCTGATTCTCAGTCCTGTGGA 

GGAGACACATTGCTTAAGCAACTCTGGGACCATAGTGATGCCATATTGTGCTGCTCCCTG 

AAAACTAATGCCTCACCGGTATTCACATTTGCAAACCAAGCTGGTTTAGACATGCTTGAA 

ACTACACTTGTGGCACTTCAGGATATAATGCTCGACAAAACACTTGATGACTCTGGTCGT 

AGAGCTCTTTGCTCCGAGTTCGCCAAGATCATGCAGCAGGGATATGCGAATCTTCCGGCA 

GGAATATGTGTGTCGAGCATGGGCAGACCGGTTTCGTATGAGCAAGCGACGGTGTGGAAA 

GTTGTTGATGACAACGAATCAAACCACTGCTTGGCTTTTACCCTCGTTAGTTGGTCGTTT 

GTTTGA 

>G390 Amino Acid Sequence (domain in AA coordinates: 18-81) 
MMAHHSMDDRDSPDKGFDSGKYVRYTPEQVEALERVYAECPKPSSLRRQQLIRECPILCN 

IEPRQIKVWFQNRRCREKQRKESARLQTVNRKLSAMNKLLMEENDRLQKQVSNLVYENGF

MKHRIHTASGTTTDNSCESVVVSGQQRQQQNPTHQHPQRDVNNPANLLSIAEETLAEFLC

KATGTAVDWVQMIGMKPGPDSIGIVAVSRNCSGIAARACGLVSLEPMKVAEILKDRPSWF

RDCRCVETLNVIPTGNGGTIELVNTQIYAPTTLAAARDFWTLRYSTSLEDGSYVVCERSL

TSATGGPNGPLSSSFVRAKMLSSGFLIRPCDGGGSIIHIVDHVDLDVSSVPEVLRPLYES

SKILAQKMTVAALRHVRQIAQETSGEVQYSGGRQPAVLRTFSQRLCRGFNDAVNGFVDDG

WSPMSSDGGEDITIMINSSSAKFAGSQYGSSFLPSFGSGVLCAKASMLLQNVPPLVLIRF

LREHRAEWADYGVDAYSAASLRATPYAVPCVRTGGFPSNQVILPLAQTLEHEEFLEVVRL

GGHAYSPEDMGLSRDMYLLQLCSGVDENVVGGCAQLVFAPIDESFADDAPLLPSGFRVIP

LDQKTNPNDHQSASRTRDLASSLDGSTKTDSETNSRLVLTIAFQFTFDNHSRDNVATMAR

QYVRNVVGSIQRVALAITPRPGSMQLPTSPEALTLVRWITRSYSIHTGADLFGADSQSCG

GDTLLKQLWDHSDAILCCSLKTNASPVFTFANQAGLDMLETTLVALQDIMLDKTLDDSGR

RALCSEFAKIMQQGYANLPAGICVSSMGRPVSYEQATVWKVVDDNESNHCLAFTLVSWSF

V* 

>G391 (1..2559) 

ATGATGATGGTCCATTCGATGAGCAGAGATATGATGAACAGAGAGTCGCCGGATAAAGGG 

TTAGATTCCGGCAAGTATGTGAGGTACACGCCGGAGCAAGTGGAAGCTCTCGAGAGAGTT 

TACACTGAGTGTCCTAAGCCAAGTTCTCTAAGAAGACAACAACTCATACGTGAATGTCCG 

ATTCTCTCTAACATCGAGCCTAAGCAGATCAAAGTTTGGTTTCAGAACCGCAGATGTCGT 

GAGAAGCAGAGGAAAGAAGCTGCTCGTCTTCAAACAGTGAACAGAAAACTCAATGCCATG 

AACAAACTCTTGATGGAAGAGAATGATCGTTTGCAGAAGCAAGTTTCTAACTTGGTCTAT 

GAGAATGGCCACATGAAAC^TCAACTTCACACTGCTTCTGGGACGACCACAGACAACAGC 

TGTGAGTCTGTGGTCGTGAGTGGTCAGCAACATCAACAGCAAAACCCAAATCCTCAGCAT 

CAGCAACGTGATGCTAACAACCCAGCAGGACTCCTTTCTATAGCAGAGGAGGCCCTAGCA 

GAGTTCCTTTCCAAGGCTACAGGAACTGCTGTTGACTGGGTTCAGATGATTGGGATGAAG 

CCTGGTCCGGATTCTATTGGCATAGTCGCTATTTCGCGCAACTGCAGCGGAATTGCAGCA 

CGTGCCTGCGGCCTCGTGAGTTTAGAACCCATGAAGGTTGCTGAAATTCTCAAAGATCGT 

CCATCTTGGCTCCGAGATTGTCGAAGTGTGGATACTCTGAGTGTGATACCTGCTGGAAAC 

GGTGGGACGATCGAGCTTATTTACACGCAGATGTATGCTCCTACGACTTTAGCAGCAGCT 

CGTGACTTTTGGAWCTGAGATATAGCACATGTTTGGAAGATGGAAGCTATGTGGTTTGT 

GAAAGGTCGCTTACTTCTGCAACTGGTGGCCCCACTGGGCCACCTTCTTCAAACTTTGTG 

AGAGCTGAAATGAAAGCAAGCGGGTTTCTCATCCGTCCTTGCGATGGTGGTGGTTCCATT 

CTCCACATTGTTGATCATGTTGATCTGGATGCCTGGAGTGTCCCTGAAGTCATGAGGCCT 

CTCTATGAATCATCGAAGATTCTTGCTCAGAAAATGACTGTTGCTGCTTTGAGACAT 

AGACAAATTGCACAAGAAACAAGTGGAGAAGTTCAGTATGGTGGAGGGCGCCAACCTGCG 

GTTTTAAGAACCTTCAGTCAAAGACTCT 

GTGGATGATGGATGGTCACCAATGGGTAGCGATGGTGCAGAGGATGTTACTGTAATGATA 
AACTTGTCCCCTGGGAAGTTTGGTGGGTCTGAGTACGGTAATTCATTCCTTCCAAGCTTT 
GGTAGTGGCGTGCTTTGTGCCAAGGCATCTATGTTGCTTCAGAACGTTCCACCCGCTGTG 
CTGGTTCGATTCCTTAGAGAACACCGATCTGAATGGGCTGATTATGGCGTGGATGCTTAT

GCTGCTGCATCGCTCAGAGCAAGTCCTTTTGCTGTTCCTTGTGCTAGAGCTGGGGGGTTC 

CCAAGTAACCAAGTCATTCTTCCTCTTGCGCAGACAGTTGAACATGAAGAGTCACTTGAG 

GTGGTTAGACTTGAAGGTCACGCTTACTCACCCGAAGACATGGGTTTAGCTCGGGATATG 

TATTTGCTACAGCTTTGTAGCGGTGTTGATGAAAATGTGGTTGGAGGTTGTGCACAGCTT 

GTATTTGCCCCTATCGATGAATCATTTGCTGATGATGCACCTTTGCTTCCTTCCGGTTTC 

CGCATCATACCTCTTGAACAGAAATCTACTCCGAACGGTGCATCTGCAAACCGTACCCTG 

GATTTAGCCTCAGCTTTAGAAGGATCCACACGTCAAGCTGGTGAAGCCGACCCAAATGGC 

TGTAACTTTAGGTCGGTACTAACCATAGCATTCCAGTTCACATTTGATAACCATTCAAGA 

GACAGTGTTGCTTCAATGGCACGTCAGTACGTGCGAAGCATAGTAGGATCGATTCAGAGG 

GTTGCTCTAGCCATTGCTCCTCGTCCTGGCTCCAATATCAGTCCAATATCTGTTCCCACT 

TCCCCTGAAGCTCTCACTCTGGTCCGTTGGATCTCCCGGAGTTACAGCCTTCACACTGGT 

GCAGATCTCTTTGGATCTGATTCTCAAACCAGTGGTGACACGTTGCTGCATCAACTCTGG 

AATCACTCTGATGCAATCTTGTGCTGCTCCCTCAAAACAAACGCTTCACCGGTTTTCACA 

TTCGCAAACCAAACCGGTTTAGACATGCTGGAAACGACTCTTGTAGCCCTTCAAGACATA 

ATGCTAGACAAGACCCTTGACGAACCTGGTCGTAAAGCTCTTTGCTCTGAGTTCCCCAAG 

ATCATGCAACAGGGCTATGCTCATCTGCCGGCAGGAGTATGTGCGTCAAGCATGGGAAGG 

ATGGTATCTTACGAGCAGGCAACGGTGTGGAAAGTTCTTGAAGACGATGAATCAAACCAC 

TGCTTAGCTTTCATGTTCGTGAATTGGTCGTTCGTTTGA 

>G391 Amino Acid Sequence (domain in AA coordinates: 25-85) 
MMMVHSMSRDMMNRESPDKGLDSGKYVRYTPEQVEALERVYTECPKPSSLRRQQLIRECP 

ILSNIEPKQIKVWFQNRRCREKQRKEAARLQTVNRKLNAMNKLLMEENDRLQKQVSNLVY

ENGHMKHQLHTASGTTTDNSCESVVVSGQQHQQQNPNPQHQQRDANNPAGLLSIAEEALA
EFLSKATGTAVDWVQMIGMKPGPDSIGIVAISRNCSGIAARACGLVSLEPMKVAEILKDR
PSWLRDCRSVDTLSVIPAGNGGTIELIYTQMYAPTTLAAARDFWTLRYSTCLEDGSYVVC
ERSLTSATGGPTGPPSSNFVRAEMKPSGFLIRPCDGGGSILHIVDHVDLDAWSVPEVMRP
LYESSKILAQKMTVAALRHVRQIAQETSGEVQYGGGRQPAVLRTFSQRLCRGFNDAVNGF
VDDGWSPMGSDGAEDVTVMINLSPGKFGGSQYGNSFLPSFGSGVLCAKASMLLQNVPPAV
LVRFLREHRSEWADYGVDAYAAASLRASPFAVPCARAGGFPSNQVILPLAQTVEHEESLE
VVRLEGHAYSPEDMGLARDMYLLQLCSGVDENVVGGCAQLVFAPIDESFADDAPLLPSGF
RIIPLEQKSTPNGASANRTLDLASALEGSTRQAGEADPNGCNFRSVLTIAFQFTFDNHSR
DSVASMARQYVRSIVGSIQRVALAIAPRPGSNISPISVPTSPEALTLVRWISRSYSLHTG
ADLFGSDSQTSGDTLLHQLWNHSDAILCCSLKTNASPVFTFANQTGLDMLETTLVALQDI 
MLDKTLDEPGRKALCSEFPKIMQQGYAHLPAGVCASSMGRMVSYEQATVWKVLEDDESNH 

CLAFMFVNWSFV* 
>G438 (188..2716)

CGGGGTACCCAAGCCACGACCGTAGAATCTTCTTTTGTCTGAAAAGAATTACAATTTACG 
TTTCTCTTACGATACGACGGACTTTCCGAAGAAATTAATTTAAAGAGAAAAGAAGAAGAA 
GCCAAAGAAGAAGAAGAAGCTAGAAGAAACAGTAAAGTTTGAGACTTTTTTTGAGGGTCG 
AGCTAAAATGGAGATGGCGGTGGCTAACCACCGTGAGAGAAGCAGTGACAGTATGAATAG 
ACATTTAGATAGTAGCGGTAAGTACGTTAGGTACACAGCTGAGCAAGTCGAGGCTCTTGA 
GCGTGTCTACGCTGAGTGTCCTAAGCCTAGCTCTCTCCGTCGACAACAATTGATCCGTGA 
ATGTTCCATTTTGGCCAATATTGAGCCTAAGCAGATCAAAGTCTGGTTTCAGAACCGCAG
GTGTCGAGATAAGCAGAGGAAAGAGGCGTCGAGGCTCCAGAGCGTAAACCGGAAGCTCTC 
TGCGATGAATAAACTGTTGATGGAGGAGAATGATAGGTTGCAGAAGCAGGTTTCTCAGCT 
TGTCTGCGAAAATGGATATATGAAACAGCAGCTAACTACTGTTGTTAACGATCCAAGCTG 
TGAATCTGTGGTCACAACTCCTCAGCATTCGCTTAGAGATGCGAATAGTCCTGCTGGATT 
GCTCTCAATCGCAGAGGAGACTTTGGCAGAGTTCCTATCCAAGGCTACAGGAACTGCTGT 
TGATTGGGTTCAGATGCCTGGGATGAAGCCTGGTCCGGATTCGGTTGGCATCTTTGCCAT 
TTCGCAAAGATGCAATGGAGTGGCAGCTCGAGCCTGTGGTCTTGTTAGCTTAGAACCTAT 
GAAGATTGCAGAGATCCTCAAAGATCGGCCATCTTGGTTCCGTGACTGTAGGAGCCTTGA
AGTTTTCACTATGTTCCCGGCTGGTAATGGTGGCACAATCGAGCTTGTTTATATGCAGAC 
GTATGCACCAACGACTCTGGCTCCTGCCCGCGATTTCTGGACCCTGAGATACACAACGAG 
CCTCGACAATGGGAGTTTTGTGGTTTGTGAGAGGTCGCTATCTGGCTCTGGAGCTGGGCC 
TAATGCTGCTTCAGCTTCTCAGTTTGTGAGAGCAGAAATGCTTTCTAGTGGGTATTTAAT 
AAGGCCTTGTGATGGTGGTGGTTCTATTATTCACATTGTCGATCACCTTAATCTTGAGGC
TTGGAGTGTTCCGGATGTGCTTCGACCCCTTTATGAGTCATCCAAAGTCGTTG 
AATGACCATTTCCGCGTTGCGGTATATCAGGCAATTAGCCCAAGAGTCTAATGGTGAAGT
AGTGTATGGATTAGGAAGGCAGCCTGCTGTTCTTAGAACCTTTAGCCAAAGATTAAGCAG 
GGGCTTCAATGATGCGGTTAATGGGTTTGGTGACGACGGGTGGTCTACGATGCATTGTGA 
TGGAGCGGAAGATATTATCGTTGCTATTAACTCTACAAAGCATTTGAATAATATTTCTAA 
TTCTCTTTCGTTCCTTGGAGGCGTGCTCTGTGCCAAGGCTTCAATGCTTCTCCAAAATGT 
TCCTCCTGCGGTTTTGATCCGGTTCCTTAGAGAGCATCGATCTGAGTGGGCTGATTTCAA 
TGTTGATGCATATTCCGCTGCTACACTTAAAGCTGGTAGCTTTGCTTATCCGGGAATGAG 
ACCAACAAGATTCACTGGGAGTCAGATCATAATGCCACTAGGACATACAATTGAACACGA 
AGAAATGCTAGAAGTTGTTAGACTGGAAGGTCATTCTCTTGCTCAAGAAGATGCATTTAT 
GTCACGGGATGTCCATCTCCTTCAGATTTGTACCGGGATTGACGAGAATGCCGTTGGAGC 
TTGTTCTGAACTGATATTTGCTCCGATTAATGAGATGTTCCCGGATGATGCTCCACTTGT 
TCCCTCTGGATTCCGAGTCATACCCGTTGATGCTAAAACGGGAGATGTACAAGATCTGTT 
AACCGCTAATCACCGTACACTAGACTTAACTTCTAGCCTTGAAGTCGGTCCATCACCTGA 
GAATGCTTCTGGAAACTCTTTTTCTAGCTCAAGCTCGAGATGTATTCTCACTATCGCGTT 
TCAATTCCCTTTTGAAAACAACTTGCAAGAAAATGTTGCTGGTATGGCTTGTCAGTATGT 
GAGGAGCGTGATCTCATCAGTTCAACGTGTTGCAATGGCGATCTCACCGTCTGGGATAAG 
CCCGAGTCTGGGCTCCAAATTGTCCCCAGGATCTCCTGAAGCTGTTACTCTTGCTCAGTG 
GATCTCTCAAAGTTACAGTCATCACTTAGGCTCGGAGTTGCTGACGATTGATTCACTTGG 
AAGCGACGACTCGGTACTAAAACTTCTATGGGATCACCAAGATGCCATCCTGTGTTGCTC 
ATTAAAGCCACAGCCAGTGTTCATGTTTGCGAACCAAGCTGGTCTAGACATGCTAGAGAC 
AACACTTGTAGCCTTACAAGATATAACACTCGAAAAGATATTCGATGAATCGGGTCGTAA 
GGCTATCTGTTCGGACTTCGCCAAGCTAATGCAACAGGGATTTGCTTGCTTGCCTTCAGG 
AATCTGTGTGTCAACGATGGGAAGACATGTGAGTTATGAACAAGCTGTTGCTTGGAAAGT 
GTTTGCTGCATCTGAAGAAAACAACAACAATCTGCATTGTCTTGCCTTCTCCTTTGTAAA 
CTGGTCTTTTGTGTGATTCGATTGACAGAAAAAGACTAATTTAAATTTACGTTAGAGAAC 
TCAAATTTTTGGTTGTTGTTTAGGTGTCTCTGTTTTGTTTTTTAAAATTATTTTGATCAA 
A 

>G438 Amino Acid Sequence (domain in AA coordinates: 22-85) 

MEMAVANHRERSSDSMNRHLDSSGKYVRYTAEQVEALERVYAECPKPSSLRRQQLIRECS 

ILANIEPKQIKVWFQNRRCRDKQRKEASRLQSVNRKLSAMNKLLMEENDRLQKQVSQLVC

ENGYMKQQLTTVVNDPSCESVVTTPQHSLRDANSPAGLLSIAEETLAEFLSKATGTAVDW

VQMPGMKPGPDSVGIFAISQRCNGVAARACGLVSLEPMKIAEILKDRPSWFRDCRSLEVF

TMFPAGNGGTIELVYMQTYAPTTLAPARDFWTLRYTTSLDNGSFVVCERSLSGSGAGPNA

ASASQFVRAEMLSSGYLIRPCDGGGSIIHIVDHLNLEAWSVPDVLRPLYESSKVVAQKMT

ISALRYIRQLAQESNGEVVYGLGRQPAVLRTFSQRLSRGFNDAVNGFGDDGWSTMHCDGA

EDIIVAINSTKHLNNISNSLSFLGGVLCAKASMLLQNVPPAVLIRFLREHRSEWADFNVD

AYSAATLKAGSFAYPGMRPTRFTGSQIIMPLGHTIEHEEMLEVVRLEGHSLAQEDAFMSR

DVHLLQICTGIDENAVGACSELIFAPINEMFPDDAPLVPSGFRVIPVDAKTGDVQDLLTA

NHRTLDLTSSLEVGPSPENASGNSFSSSSSRCILTIAFQFPFENNLQENVAGMACQYVRS

VISSVQRVAMAISPSGISPSLGSKLSPGSPEAVTLAQWISQSYSHHLGSELLTIDSLGSD

DSVLKLLWDHQDAILCCSLKPQPVFMFANQAGLDMLETTLVALQDITLEKIFDESGRKAI

CSDFAKLMQQGFACLPSGICVSTMGRHVSYEQAVAWKVFAASEENNNNLHCLAFSFVNWS

FV* 

>G47 (38..472)

CTTCTTCTTCACATCGATCATCATACA^ 

TGAAAGTCAGTCAAAGTACAAAGGAATCCGTCGTCGGAAATGGGGCAAATGGGTATCAGA 

GATTAGAGTTCCGGGAACTCGTGACCGTCTCTGGTTAGGTTCATTCTCAACAGCAGAAGG 

TGCCGCCGTAGCACACGACGTTGCTTTCTTCTGTTTACACCAACCTGATTCTT^ 

TCTCAATTTCCCTCATTTGCTTAATCCTTCACTCGTTTCCAGAACTTCTCCGAGATCTAT 

CCAGCAAGCTGCTTCTAACGCCGGCATGGCCATTGACGCCGGAATCGTCCACAGTACCAG 

CGTGAACTCTGGATGCGGAGATACGACGACGTATTACGAGAATGGAGCTGATCAAGTGGA 

GCCGTTGAATATTTCAGTGTATGATTATCTGGGCGGCCACGATCACGTTTGATTTATCTC 

GACGGTCATGATCACGTTTGATCTTCTTTTGAGTAAGATTTO 

GTGTGGTGCTAAAATCTTACTCAAAACAAGATTAGGTACmCAGAGAAACAATCAAATGG 
TTGTGAATATACATTATAAGGTTTTGATTAATGTTTGTTTCACTGATTTAGTGAAGTTTG 
GTCCATTGTATACAAATCTATTCAAGAAACCTAGCGCGAGATCATGTTTCGTGATTGAAG 
ATTGAGATTTTTAAGTATTCGTAATATTTTTGTA7VAATACAAATAAAAAAAAAAAAAAAA 
AAAAA

>G47 Amino Acid Sequence (domain in AA coordinates: 11-80) 

MDYRESTGESQSKYKGIRRRKWGKWVSEIRVPGTRDRLWLGSFSTAEGAAVAHDVAFFCL 

HQPDSLESLNFPHLLNPSLVSRTSPRSIQQAASNAGMAIDAGIVHSTSVNSGCGDTTTYY 

ENGADQVEPLNISVYDYLGGHDHV* 

>G559 (89..1285)

aaagttgctagctttaatttgccaacttactattcttatgtgtaataatcgtttgcaggg 
tcgttgatttggtgataagtcagtagaaATGgataaggagaaatctccagcacctccttg 
tggaggtcttcctcctccatctccatcaggtcgatgctctgcattctcagaagctggtcc 
cattggtcatggttcagatgctaatcgaatgagtcatgatattagccgtatgcttgataa 
cccacctaagaagattggacatcggcgagctcattctgaaatacttactctccctgatga 
tttgagctttgatagtgatcttggtgtggttggtaatgctgctgatggagcttctttctc 
tgatgagactgaagaagatttgctctctatgtatcttgatatggataagtttaattcttc 
tgctacatcttctgcccaagttggtgagccatcaggaactgcttggaaaaatgagacaat 
gatgcagacaggcacaggctcaacttccaatcctcagaatacggttaatagtcttggcga 
aaggccaagaatcaggcatcaacatagccaatctatggatggttcaatgaatatcaatga 
gatgcttatgtcgggaaatgaagatgattctgctattgatgctaagaagtctatgtctgc 
tactaaacttgctgagcttgctctcattgatcctaaacgtgctaagaggatatgggcaaa 
caggcagtccgcagcacgatcaaaagaaaggaagacgagatacatatttgagcttgagag 
aaaagtacagactttgcaaacagaggctacaactctctcagcccagttgaccctcttaca 
gagagacacaaatggcttgactgttgaaaacaatgagctgaagctgcggttacaaacaat 
ggagcagcaggttcacttgcaggatgaactaaacgaagcactaaaggaggaaatccagca 
tctgaaggtgttgactggccaagttgctccatcagcgttgaactatgggtcgtttggatc 
aaaccagcagcaattctattccaacaatcagtcaatgcaaacaatcttagctgcaaaaca 
gttccagcaacttcagattcattcacagaagcagcaacaacaacaacaacaacaacaaca 
gcaacaccaacagcagcagcagcaacagcaacagtatcagtttcaacagcaacagatgca 
acagcttatgcagcagcggcttcaacagcaagaacaacaaaatggagtaagactcaagcc 
ttcacaagcccagaaagagaacTGAggaatatgaatatgtcccacgtaagtgagaggttc 
tccttctgaacaattcctttctcattcataaattgttgttcatccatcacttgcagtctc 
ttggattttagggttttagctaacaca 

>G559 Amino Acid Sequence (domain in AA coordinates: 203-264) 

MDKEKSPAPPCGGLPPPSPSGRCSAFSEAGPIGHGSDANRMSHDISRMLDNPPKKIGHRR

AHSEILTLPDDLSFDSDLGVVGNAADGASFSDETEEDLLSMYLDMDKFNSSATSSAQVGE

PSGTAWKNETMMQTGTGSTSNPQNTVNSLGERPRIRHQHSQSMDGSMNINEMLMSGNEDD

SAIDAKKSMSATKLAELALIDPKRAKRIWANRQSAARSKERKTRYIFELERKVQTLQTEA

TTLSAQLTLLQRDTNGLTVENNELKLRLQTMEQQVHLQDELNEALKEEIQHLKVLTGQVA

PSALNYGSFGSNQQQFYSNNQSMQTILAAKQFQQLQIHSQKQQQQQQQQQQQHQQQQQQQ

QQYQFQQQQMQQLMQQRLQQQEQQNGVRLKPSQAQKEN*

>G568 (141..995)

GACCGGCTAAAGTCAAGAACCTCTCTCTGAGCTCTCACCACTTTCTCTCTCTACTCCCTC 
TCTGCGTGTAGGATACTACTAGACAATTGACAACCAAAGACTAAAGCrGTGTTGTTCGTO 
CACTTCTGTTCTCTT^ 

CTGCTACAAACAAGAACCAGACTCTCACCAAAGTTTCTTCCATTTCATCCT^ 

CGTCTTCTTCTTCATCATCATCAACCTCATCATCATCTCCTTTACCTTCTCAAGACTCTC 

AAGCCCAGAAGAGATCTCTTGTCACCATGGAAGAAGTTTGGAATGACATCAACCTTGCTT 

CCATCCACCACCTAAACCGACACAGCCCTCATCCACAACACAACCACGAGCCAAGGTTCA 

GGGGCCAAAACCTM^GACAACCAAAACCCTAACTC 

CTTTGAACC^GGAACCAGCACC(^CAAGCCAGACCACGGGTTCTGCGCCTAATGGCGAOT 
CCACC^CGGTCACTGTTCTTTACAGCTCTCCTTTTCCACCTCCTGCAA 
TGAATTCCGGCGCTGGCTTCGAGTTTCTCGATAACCAAGATCCTCTTGTTACCTCAAACT 
CTAATCTTCATACCCACCATCACCTCTCAAACGCT 

CTCTGGTTCCATCCAGTTCTTTTGGTAAGAAAAGAGGCCAAGATTCC71ATGAAGGTTCAG 
GG7^TAGAAGACATAAGCGTATGAT(^GAAC^GAGAATCTGCAGCTCGTTCCCGCGCTA 
GGAAACAGGCTTATACAAACGAGTTAGAACTO 

CAAGACTCAAGAGACAAC^GATCAAAAAATGGCTGCAGCAATTCAGCAACCCAAAAAG 

AGACACTTCAACGGTCTTCCACAGCTCCATTCT 

TTTGGGGATTGAGATTGTCTCATGAAGAAGTGAAAAAATGGC^ 
TTATTAGCTATAAGTATAACTAAGCCTAAAATTGTAGAACTAAGATATTGTAGGGGAAAA
AAGAAGATGTAAAACAAAAGACCCGGAAAGAGAAAAGGATCTTTCAATTTCCTAAGGCAC 
AGGAACACCTGTCCTGGGTCCTCTCTTAATGTTCTGTCGTTTTCCTATGCAAACCCTTTT 
TTCACTTCTGTACTAACTTATACTTGTATTCTTG 

>G568 Amino Acid Sequence (domain in AA coordinates: 215-265) 

MLSSAKHQRNHRLSATNKNQTLTKVSSISSSSPSSSSSSSSTSSSSPLPSQDSQAQKRSL 

VTMEEVWNDINLASIHHLNRHSPHPQHNHEPRFRGQNHHNQNPNSIFQDFLKGSLNQEPA 

PTSQTTGSAPNGDSTTVTVLYSSPFPPPATVLSLNSGAGFEFLDNQDPLVTSNSNLHTHH 

HLSWAHAFNTSFEALVPSSSFGKKRGQDSNEGSGNRRHKRMIKNRESAARSRARKQAYTN 

ELELEVAHLQAENARLKRQQDQKMAAAIQQPKKNTLQRSSTAPF* 

>G580 (43..747)

CCAAAAAACAAAGCATTCTATGCTATTCTGTTCTGTTCTCCAATGTTGTCATCAGCAAAG 
CATAATAAGATCAACAACCATAGTGCCTTTTCAATTTCCTCTTCATCATCATCATTATCA 
ACATCATCCTCCCTAGGCCATAACAAATCTCAAGTCACCATGGAAGAAGTATGGAAAGAA
ATCAACCTTGGTTCACTTCACTACCATCGGCAACTAAACATTGGTCATGAACCAATGTTA 
AAGAACCAAAACCCTAATAACTCCATCTTTCAAGATTTCCTCAACATGCCTCTGAATCAA 
CCACCACCACCACCACCACCACCTTCCTCTTCCACCATTGTCACTGCTCTCTATGGCTCT 
CTGCCTCTTCCGCCTCCTGCCACTGTCCTCAGCTTAAACTCCGGTGTTGGATTCGAGTTT 
CTTGATACCACAGAAAATCTTCTTGCTTCTAACCCTCGCTCCTTTGAGGAATCTGCAAAG 
TTTGGTTGTCTTGGTAAGAAAAGAGGCCAAGATTCTGATGATACTAGAGGAGACAGAAGG 
TATAAGCGTATGATCAAGAACAGAGAATCTGCTGCTCGTTCAAGGGCTAGGAAGCAGGCA 
TATACAAACGAACTTGAGCTTGAAATTGCTCACTTGCAGACAGAGAATGCAAGACTCAAG 
ATACAACAAGAGCAGCTGAAAATAGCCGAAGCAACTCAAAACCAAGTAAAGAAAACACTA 
CAACGGTCTTCCACAGCTCCATTTTGAGAAAAATCTACTATTTCTTTTTGGGGGAGTTTC 
AAGTGTTTCTTATGAAGATGAGAAAAACAGAA^UU^GTTTGTACATTTTAGCTAAGTTAT^A 

tttgtggtggtaagtaatgtaaaagaaaagtgtgtgtagaagaaaagtgtctagaaaaag 
aaagcaactaactttcttcttcttctctggtttcctatcaactcttttgacttttgtact 
ttttttcttctctacttaacctctattattgtaatgccaagtcaagtccttatctagcta 
gtacatgAgtttctgttttcactggttaagccat 

>G580 Amino Acid Sequence (domain in AA coordinates: 162-218)
MLSSAKHNKINNHSAFSISSSSSSLSTSSSLGHNKSQVTMEEVWKEINLGSLHYHRQLNI
GHEPMLKNQNPNNSIFQDFLNMPLNQPPPPPPPPSSSTIVTALYGSLPLPPPATVLSLNS
GVGFEFLDTTENLLASNPRSFEESAKFGCLGKKRGQDSDDTRGDRRYKRMIKNRESAARS 
RARKQAYTNELELEIAHLQTENARLKIQQEQLKIAEATQNQVKKTLQRSSTAPF* 
>G615 (197..1252)

TTTTTTCTTTTCTTTCTTTTTTTGCTGGTGTGAGAAATTGTACGCTTACTATCTCTCTCT 
CTCTCTGCCAGATTCTCTCTTTTTGATGATGTGAAAGTTGTGCTTTTGTTTCTTAAGAAA 
AAGGCATATTTTTAATACTTGATTCTTGGTTCTTGATTCTTGATTCTTGGTTTTTTTTAG 
CTTCTTAAGTTCGGTGATGTCGTCTTCCACCAATGACTACAACGATGGTAATAACAATGG 
AGTGTACCCTCTCTCTCTTTACCTTTCTTC^CTCTCTGGCCATCAAGACATCATTCATAA 
TCCCTACAACCATCAGTTAAAAGCATCTCCGGGCCATATGGTATCAGCAGTTCCTGAATC 
TCTGATCGATTACATGGCGTTTAAGTCAAATAATGTTGTGAATCAACAAGGCTTTGAGTT 
TCCTGAGGTGTCAAAGGAAATCAAGAAGGTGGTGAAGAAGGACCGACATAGCAAGATTCA 
AACGGCACAAGGGATTAGAGACAGGAGGGTTAGGCTTTTTATTGGGATTGCTCGCCAATT 
CTTTGATCTTCAGGATATGTTGGGGTTTGATAAAGCTAGTAAAACGTTAGACTGGCTGCT 
CTiAGAAGTCAAGAAAAGCCATCAAAGAGGTCGTACAAGCAAAAAACCTCAACAATGATGA 
TGAAGATTTTGGAAACATTGGAGGCGATGTAGAACAAGAAGAGGAGAAGGAGGAGGATGA 
CAATGGCGATAAGAGCTTCGTGTATGGTTTGAGCCCCGGGTACGGTGAAGAAGAAGTGGT 
ATGTGAGGCCACGAAGGCAGGGATAAGAAAGAAGAAGAGTGAGTTGAGAAACATCTCATC 
AAAGGGGCTAGGAGCGAAAGCTAGAGGAAAAGCAAAGGAGCGAACAAAAGAGATGATGGC 
CTATGATAATCC^GAGACTGCCTCTGATATTACACAATCTGAAATCATGGACCCATTCAA 
GAGGTCTATAGTCTTC^TGAAGGAGAAGATATGACACACCTTTTCrrA(^GGAACC^T 
CGAGGAGTTTGATAATCAAGAATCTATCTTAACCAATATGACTCTACCAACGAAGATGGG 
TCAAAGTTACAATCAAAATAATGGGATACTTATGTTGGTAGATCAGAGTTCTAGCAGCAA 
CTATAATACATTTCTGCCTCAAAATTTGGATTATAGTTATGATCAAAACCCTTTTCATGA 
CCAAACCTTATATGTAGTCACCGACAAAAATTTCCCCAAAGGTTTCCTATAAATCTCGAC 
AGTTTTGAAGGACTATGCATGATCAAGTTTAAACATGTAAGCCAATATAGTCCCTTATTC 
CTCTGAATGTATACAAAATCTATAGTTATGTATATCTGTTCCTTTTTAACGTATCTTTAT 

tgatcttctgtgccttgatcaaaat^^ 

AFKSNNWNQQ6FEPPBVSKEI KKWKKDRHS KI QTAQGIRDRRVRLPIGIMOFFDLOD 

mlgfdkasktldwllkksrkaikewqakhlnnddedfgniggdv^ebek^ddngdk^ 

^sditqseimdpfkrsivfnegedmthlfykepieefdnqesiltnmtlSSswo 
^gi^Qssssnyntflpqm 

aaaaaaaccaaacataaaacataaaactctgtcctttttttgtcttcttgtaacttttct 

tcttaaaaatcaatggcgtcatctagcagcacataccggagctcaagctcttccS 

ggtaataataacccgtcggactccgtcgtcaccgtcgacgaacgaaaIcSaaSS 

™TCGAACAGAGAATCTG(^CGTAGGT^^ 



GTAACATCTCAGCTTTACATGAAGATCCAAGCCGAGAACTCTGTTCTCACCGCTa ' 
? A ° GAGCTTAGCAC ^^ 

GGTGC^GGATTTCGTGTTGACCAGATCGACGGCTGTGGTTTTGATGMroTAreGTTCGG 



rATAATTATATGATTATATATGTTTATGTTAAAAAAAAA 
" t;id Se ^ en ce (domain in AA coordinates- 31-91) 
MASSSSTYRSSSSSDGC^PSDSVVTVBERKPJa^LSNRESAP^SR^RKOK^ 

™« ^^^^^^DMN^SNVNHWGGSVYTNQPlSS* 

ATGCTTACTTCCTTCAAATCCTCTAGCTCCTCCTCCGAAGATGCCACCGCTACCArrapr 
GAGAATCCTCCTCCTTTGTGCATCGCCTCCTCCTCGGCCGCAACC^CCGCCTCACATCAr 

CAAAACTTACTCTCAATCCTCTCCC^^ 



CTTGTACACCTCTTCACTAAAGCCTTGTCCGTACGAATCAACCG^ 



^GAGOICATC^CTMCAOrCCSCCTCCATCTCTcSSSS'SS 



cc 
cc 

AC 
CI 



IAATTGTGT 

:ttgtcagc 



ATGATCGGTCACTTCTTGTCAGCG 
GAGAGAGAAGCTAATCATGGAGAT 
CATTACATGGCGATCTTTGATTCG 
CTAACCCTAGAGCAACGGTGGTTC 
aCGGAGAGAAAGCAAAGACATCGG 
GGTTTCGTTAACGTTCCTATTGGA 

GTTTCGTCGTGC^TCA TGGCAAAATCGTCCCCTCITCTCC 



2caISS^ tatoa ^^ 

a?" 

' G ^ G ^ GA ^ GG ^^ G ^^ G ^ G ^^^^^^ATCCTTCAGAAGGTTAT 



^^^^^^^AGAAAGCAAAGACATCGG 

N3GTTTCGTTAACGTTCCTAT 
VCTTCATTATCCTTCAGAAGG 
^TGGCAAAATCGTCCCCTCTT 

>G988 Amino Acid Sequence (domain in AA coordinates: 178-195) 

MLTSPKSSSSSSroATAT^PPPLCI^sqLio^ 
FTSSVCKEQFLFRTKNNNSDFESCYYLWLNQLTPFIRFGHLTANQAILDATETNDNGALH 

ILDLDISQGLQWPPLMQALAERSSNPSSPPPSLRITGCGRDVTGLNRTGDRLTRFADSLG 

LQFQFHTLVIVEEDLAGLLLQIRLLALSAVQGETIAVNCVHFLHKIFNDDGDMIGHFliSA 

IKSLNSRIVTMAEREANHGDHSFLNRFSEAVDHYMAIFDSLEATLPPNSRERLTLEQRWF 

GKEILDWAAEETERKQRHRRFEIWEEMMKRFGFVNVPIGSFALSQAKLLLRLHYPSEGY 
NLQFLNNSLFLGWQNRPLFSVSSWK* 

>G1519 (1..1146) 

ATGAGGCTTAATGGGGATTCGGGTCCGGGTCAGGATGAACCCGGTTCGAGCGGGTTTCAC 
GGCGGAATCAGACGATTCCCGTTAGCAGCTCAGCCGGAGATTATGAGAGCTGCTGAGAAA 
GACGATCAATACGCTTCTTTCATCCACGAAGCTTGCCGCGATGCCTTCCGACACCTTTTC 
GGTACAAGAATCGCTCTTGCTTACCAGAAGGAGATGAAGCTACTTGGACAGATGCTTTAC 
TATGTTCTTACGACAGGTTCAGGGCAACAAACTTTAGGAGAGGAATATTGTGACATTATA 
CAGGTTGCAGGGCCTTATGGACTCTCTCCTACACCAGCTAGACGTGCTTTGTTCATATTG 
TACCAGACCGCAGTTCCATATATCGCAGAGAGAATTAGCACTCGAGCTGCTACGCAAGCA 
GTCACCTTTGATGAGTCTGATGAGTTTTTTGGTGATAGTCATATCCACTCACCAAGAATG 
?™r CCATOTCATC ™ 

GATAGACTTATGAGATCGTGGCACCGAGCTATTCAGCGATGGCCTGTGGTTCTTCCTGTT 

GCCCGCGAAGTCTTACAACTGGTTTTGCGTGCCAATCTGATGCTCTTCTACTTTGAAGGT 

TTTTATTATCATATATCGAAACGTGCATCCGGGGTTCGTTATGTTTTCATAGGAAAGCAA 

CTGAATCAGAGACCTAGATACCAAATTCTTGGGGTTTTCCTTCTAATCCAATTGTGCATC 

CTTGCTGCTGAGGGCTTGCGTCGGAGTAATTTGTCATCTATCACTAGCTCCATTCAGCAG 

GCTTCTATAGGATCTTATCAAACTTCAGGAGGGAGAGGTTTACCTGTTTTAAATGAAGAG 

GGGAATTTGATAACTTCGGAAGCTGAAAAGGGAAACTGGTCTACCTCCGATTCAACTTCA 

™f GCAGTAGGG ^ TGCACTCTCTGCTTAAG CACCCGTCAGCACCCAACGGCCACT 

CCTTGTGGTCATGTGTTTTGTTGGAGCTGCA1TATGGAATGGTGCAACGAGAAGCAAGAA 

TGCCCTCTTTGTCGAACGCCCAATACCCATTCAAGTTTGGTTTGTTTGTATCATTCTGAT 
TTTTAG 

>G1519 Amino Acid Sequence (domain in AA coordinates: 327-364) 
™^v^f GSSGF ^ 

GTRIALAYQKEMKLLGQMLYYVLTTGSGQQTLGEEYCDIIQVAGPYGLSPTPARRALFIL 

^™^ IQRWP ^ P ^ E ^ QL ^ R ^ LFYraG ™ISKRASGVRYVFlS 
™t™™£ GVFLLIQLCI ^^ 

>G374 (1..1359) 
A ? G , GACAACAAAAATG ™ 

GATCTTTCCTTTGGTGCTCCCCTCTATGTGGTTGAGAGCATGTGCATGCGCTGCCAAGAA 
AATGGAACAACCAGATTTCTATTGACCTTAATTCCTCACTTCAGAAAGGTOTAA^?CT 

^^™ GGATGCTC ^ ACAATCTAGAGGTTCTAG ^^ 

CCACCAGAGGCCCAACGTGGAAGTTTGTCTACTGTGQAAQGGATATTAGCACGG^CTCOT 
A^ACCAATTCTTGTCCAAACTGAGAGCTTGTGCTAAAGCAGAGAC^TCCTTCACcSc 

ccctctctaaccatcaaattctatgagcgaacaccagagcaacaagcaaS 

ottcctaacccatctcaggctggacaatcagaaggaagccitggcgcacctgtS 

ttcccttcaacttgcggagcatgtacggagccgtgtcagac^cggatgttcWSgS 

JS A o GAACATOACAGACOTAGCCGAGATGTTM ^ 

A A cc i AGAAOTGAa ^ 

g^gggttggttacacagatcagagaaagcctagcgagagttcacggattcacttttggt 
gatagtatggaagagagtaagwgaacaaatggaga^tttggagcc^ 

C^TAAGOTTGAACAGCCGTGGACATTGATTCTTGATGATGAATTAGcWSccm 

attgcaccaotaacagatgatatcaaagatgaccatcagctcacatttgaa^ 
AGGTCATGGGATCAAAACGAGGAGTTGGGTCTCAACGACATAGATACTTCTTCAGCTGAT 
GCTGCTTATGAATCCACAGAGACGACTAAATTACCTTAA 

>G374 Amino Acid Sequence (domain in AA coordinates: 35-67, 245-^//) 

MDNKJTOQDIDVRSWEAVSADLSFGAPLYWESMCMRCQENGTTRFLLTLIPHFRKVLIS 

AFECPHCGERNNEVQFAGEIQPRGCCYNLEVLAGDVKIFDRQWKSESATIKIPELDFEI 

PPEAQRGSLSTVEGILARAADELSALQEERKKVDPKTAEAIDQFLSKLRACAKAETSFTF 

ILDDPAGNSFIENPHAPSPDPSLTIKFYERTPEQQATLGYVANPSQAGQSEGSLGAPVMT 

FPSTCGACTEPCETRMFKIEIPYFQEVIVMASTOTSCGYRNSEIjKPGGAIPEKGKKITIjS 

^itdLrdviksdtagviipeldlelaggtlggmvttveglvtqireslarwgftfg 
SeesSnkwrefgarltkllsfeqpwtlilddeuansfiapvtddikddhqltfeeye 

RSWDQNEELGLNDIDTSSADAAYESTETTKLP* 

>G877 (397..^460) 
CAAAGATTAGACTAATCCGACTGTGTTTTTAATCAATCATCATTTTATTTAGGGGAGAGA 

AGTTGTAAAGTTTTGATTTTTTTTTCTGGGTTTTTTCTGTGAGACCCAGAAGAAGAACAG 

AGAGAGGAAGAAGGAGAAGAAAAAAATATCTCTTTCTCTCCGGCTTTCAACAAAATCTCT 

ct^Stccttcatcagtgttaaattcggatccgggtcgggtgggttitcggtttttggt 

GTTCGGATCAGAGCACAGTTGGATGTTAGCGACGGAACTGAGGATTTCAGTTTGCGGCTG 

Sgcggctgtgacggtgtttgtgtgtcgtcttcttttatcaatcaggagtttcatcacag 
Stcatcagagattcagccaaattcttggatactaaatggctggttttgatgaaaatgtt 
StcSatgggagaatgggtgcctcgtagtcctagtcccgggacacttttctcctctgct 

SSag^gagaagagctcgaaacgtgttct^^ 

caagttattggtttagaagaagacactagtagtaatcataacaaggattcttcacaaagc 
aatgtttttcgaggtggtctcagtgaaagaattgctgcaagagctggatttaatgctcca 
aggttgaacactgagaatatccgcaccaacaccgacttttccattgactctaaccttcga 
SS?tcStaaccatctcttctcctggccttagccctgcaacactcttggaatctcct 

GTTWCCTTTCTAACCCATTGGCTCAACCTTCTCCAACTACCGGGAAATTrCCATTTCTT 
CCTGGTGTTAATGGTAATGCATTGTCTTCTGAGAAAGCGAAAGACGAGTTCTTTGATGAT 

aSggagcatcattcagcitccatcctc^^ 

AQVACAGAGATGATGTCAGTTGATTATGGTAACTACAACAATAGATCTTCTTCTCATCAA 

™Sgaagaagtaaaacctggctctgaaaacatagaaagctccaatctttatcgga^ 

GAAACTGACAATCAAAACGGGCAGAACAAGACATCTGATGTCACTACAAACACCAGTCTT 

GAAACCGTGGATCATCAAGAGGAAGAAGAAGAGCAAAGACGCGGTGATTCGATGGCTGGT 

GGTGCGCCTGCAGAGGATGGATATAACTGGAGGAAATACGGACAAAAGTTGGTCAAAGGA 

AGTGAGTATCCGCGAAGCTATTACAAGTGCACAAACCCGAATTGTCAGGTGAAGAAGAAA 

GTTGAGAGATCAAGGGAAGGTCACATCACAGAGATTATATACAAAGGAGCTCATAATCAT 

CTTAAACCTCCACCTAATCGCCGCTCAGGGATGCAAGTAGATGGAACTGAACAAGTTGAA 

CAACAACAACAACAGAGAGATTCTGCTGCAACGTGGGTTAGTTGTAATAACACTCAACAA 

CAAGGTGGAAGCAATGAGAACAATGTCGAAGAGGGATCTACGAGATTCGAGTATGGAAAC 

CAATCTGGATCAATTCAAGCTCAAACCGGAGGTCAATACGAGTCAGGTGATCCTGTGGTT 

GTGGTTGATGCTTCTTCAACATTCTCTAATGATGAAGATGAAGATGATCGAGGGACACAT 

GGAAGTGTTTCTTTGGGTTACGATGGAGGAGGAGGAGGTGGGGGAGGAGAAGGAGATGAA 

TCAGAGTCGAAAAGAAGGAAACTAGAAGCTTTTGCAGCAGAGATGAGTGGATCAACAAGA 

GCCATACGTGAGCCAAGAGTTGTTGTGCAGACAACGAGTGATGTTGACATTCTTGATGAT 

GGTTATCGCTGGCGAAAATATGGTCAGAAAGTTGTCAAAGGCAATCCAAATCCAAGGAGT 

TATTACAAATGCACAGCTCCAGGATGTACAGTGAGGAAACATGTTGAAAGAGCTTCTCAT 

GATCTCAAATCCGTTATAACMCTTACGAAGGCAAACATAACCATGACGTCCCCGCTGCA 

CGCAACAGCAGCCACGGAGGCGGTGGTGATAGTGGTAACGGTAACAGCGGCGGTTCAGCC 

GCAGTTTCTCACCAITACCACAACGGTCATCACTCAGAGCCGCCACGTGGGAGATTCGAC 

AGACAAGTCACAACTAACAATCAGTCTCCTTTTAGCCGTCCCTTTAGCTTTCAGCCACAT 

TTGGGTCCTCCTTCTGGTTTCTCCTTCGGTTTAGGACAAACCGGTTTGGTTAATCTTTCA 

ATGCCTGGTTTAGCGTATGGTCAAGGGAAAATGCCGGGTTTGCCTCACCCGTATATGACA 

CAACCGGTTGGGATGAGTGAAGCAATGATGCAGAGAGGGATGGAACCAAAGGTTGAACCG 

GTTTCAGATTCAGGACAATCGGTATATAACCAGATCATGAGTAGATTACCTCAGATTTGA 

AATTTACTCTTCTTCTTCTTCTTCTGCATTTGGTCACTCCTTATAATAACTTTTAATTTC 

TGCTTCTTCTTCTTCTTTCATTTATTGGTTTCAAACTTTGGGGAAGGTAAAGGCTGTTTT 

ATTGTTAAAAAAAAAAAAAAAAA 

>G877 Amino Acid Sequence (domain in AA coordinates: 272-328, 487-603) 
MAGFDENVAVMGEWVPRSPSPGTLFSSAIGEEKSSKRVLERELSLWHGQVIGLEEDTSSN 
SSsSQSNVFRGGLSERIRARAGFNAPRLNTENIRTNTDFSIDSNLRSPCbTISSPGLS 
PRT^LESPVFLSNPLAQPSPTTGKFPFLPGVNGNAIiSSEKAKDEFFDDIGASFSFHPVSR 

ssKffqSteS^ 

DVTTOTS^TTOHQEEEEEQRRGDSMAGGAPAElXSYNWRKYGQKIjVKGSEYPRSYYKCra 

pSS^Sghiteiiykgahnhlkpppnrrsgmqvdgteqveqqqqqrdsaatw 

qnvnlLDDGYRWRKYGQKWKGNPNPRSYYKCTAPGCTVRKHVERASHDLKSVITTYEGK 

™paaSsshggggdsgngnsggsaavshhyhhghhsepprgrfdrqvtt N nqspfs 
gmepkvepvsdsgqsvynqimsrlpqi * 

atcggLgacctccttgttgtgacaagtccaatgtcaagaaaggtctctggaccgaggaa 

gaagaSSSgatcc^gcttatgttgctatccatggtgt^^ 

cSaaaaSS^ 

SaSa^gaccttaaacatgacagcttctctacccaagaagaagagcttatcattgag 

gSStgSSaagaatcactggaacacaaagctgaagaa^^ 
aSgacccLtgactcataaaccggtttctcaactccttgcagaattcagaaacattagc 

ScaSgaaItgcatccttcaaaacagaaccatcta^^^ 

SSagctcg^^ 

tcScaatgatgtttacaaattcctctgagtaccaaactactccatttcatttctatagc 
Stcc^cSctgctcaatggaaccacatcttcatgctcttcctcatcatcttctact 

SSStcagccaaaccaagtacctcaaacaccggttactaac^^ 
ScttSctcggacccggttcctcaagtagtgggatcctcagctactagcgacctcact 

SacgSgaacgaacatcatttcaac^tcgaagccgaatacatctctca^ 
tcaaaggcctcgggaacatgtcattccgcgagttccttcgttgacgaaatactagataaa 

gaccLgagatgttgtcacagtttcctcaactcttgaatcatttcgattattag 

>G1000 Amino Acid Sequence (domain in AA coordinates: 14-117) 

Spots^kL.wteeeW 

"SdlSSsTQEEEMIEC^^ 

sSf^seyqttpfhfyshpnhli^gttsscsssssstsitqpnqvpqtp^fywsd 
fllsdpvpqvvgssatsdltftqnehhfnieaeyisqnidskasgtchsassfvdeildk 

dqemlsqfpqlludfdy* 
tScaLcttct^ 

ATGC^AAGAAGAAGCTACTTCTTTCTCTTGCCCTAATTAATCTACCTAACTAGGGTTO 

GTCT^AA^^AGTTCTGTTCTTGGTCTGTTTCTATATTTTGTCGGCTTGCGTAACCGAT 
CACACCTTAATGCTTTAGCTATTGTTTCCTCAAAA^^ 

AGTT^CTTrTTCTCTCTTTACGCTCTTCTTCACCTAGCTACCAATATATGAAC 

GATCAAGAATCGAGAAATTGATTTGAGCTGGCGAATAAGCAGTGGTGGGATAGGGAATTA 

GTaStoOTGCGGCGATGGAAGGCGGTTACGAGCIAAGGCGGTGGAGCTTCTAGATOiCTTC 

Staacctctttagaccggagattcaccaccaacagcttcaaccgcagggcgggatcaat 

C^TATCGACCAGCATCATCATCAGCACCAGCAACATCAACAACAACAACAACCGTCGGAT 
GA^CAAGAGAATCIGACCATTCAAACAAAGATCATCATCAACAGGGTCGACCCGATTCA 

gaSgS^^ 

^gaacaaagccaagccaccgatcatagtaactcgtGatagccccaacgcgcttagatct 

CACGTTCTTGAAGTATCTCCTGGAGCTGACATAGTTGAGAGTGTTTCCACGTACGCTAGG 
AGGAGAGGGAGAGGCGTCTCCGTTTTAGGAGGAAACGGCACCGTATCTAACGTCACTCTC 
CGTCAGCCAGTCACTCCTGGAAATGGCGGTGGTGTGTCCGGAGGAGGAGGAGTTCTCACT 
TTACATGGAAGGTTTGAGATTCTTTCGCTAACGGGGACTGTTTTCCCACCTCCTGCACCG 
CCTGGTGCCGGTGGTTTGTCTATATTTTTAGCCGGAGGGCAAGGTCAGGTGGTCGGAGGA 
AGCGTTGTGGCTCCCCTTATTGCATCAGCTCCGGTTATACTAATGGCGGCTTCGTTCTCA 
AATGCGGTTTTCGAGAGACTACCGATTGAGGAGGAGGAAGAAGAAGGTGGTGGTGGCGGA 
GGRGTGACCGGTCAGGGACAGliA ttcaAGACCACCTTTTTAATTGAATTTT 
TGGTGTTTAATGTTTAGTTGATATGCATATTTT coordinates: 86 - 93 ) 

==fflSSs=S5SSSS== 

SpSSSSSgggggppqmqqapsasppsgvtgqgqi.ggnvggygfsgdphi.lg 
wgagtpsrppf* 



AATAGCTTCTAATGATAACTCTGGACTA< 



CTCGTCATGCGTCGGCCACGTGG' 



CCGTCCAGCTGGATCGAAGAACAAACCGAAGCCGCCG 



GT< 



GATTGTCACGCGCGAGAGCGCAAACA( 



GGCTGCGACGTTTTCGAATGTATCTCCAi 



GT' 
GC 



'TTTATCCGGGACGGGAACCGTCACT, 



CTCTTAGGGCTCACATTCTTGAAGTTGGAAGT 
CTTACGCTCGTCGGAGACAGCGCGGGATTTGC 
•AACGTCAGCATCCGTCAGCCTACGGCGGCCGGA 



TGTTGTGACTCTGCGGGGTACTTTTGAGA' 
CCACCTGCTCCTCCAGGGGCGACTAGCTTGAi 



TTCTTTCCCTCTCCGGATCTTTTCTTCCG 
^CGATATTCCTCGCTGGAGCTCAAGGACAG 
TAATGGCGGCGGGGCCGGTAATGGTCATGGCA 



GGGTTGCCTTTCTTTAATTTGCCGATGAGTi 



ATGCCTCAGATTGGAGTTGAAAGTTGGCAG 



GGGAATCACGCCGGCGCCGGTAGGGCTCCGTTTTAGCAAT^^ 

^^L^^!^^iaTTGACCCTCTTACTGCATGGTTTCTTCTATTGGGTTAATTGGCTAGCT 
ATTGTTCATGTATTGACCCll-li^ CAAATTTGCCCACATATAAAGCTTCTAGC 



CATAAGAATTGTTTAATTTGGTTATTGTCATCJ 
r I RTYARRRQRGI CVLSGTGTVTNVS IRQPTAAGAWT1.RGTFEI LSLSGS ^5 ? „nr 

SSSaqgqwggnwge^^ 

GGGNMYSEATGGGGGLPFFNLPMSMPQIGVESWQGNHAGAGRAPF* 
>G1266 (62..718) 

CAATCCACTAACGATCCCTAACCGAAAAC^GAGTAGTCAAGAAACAGAGTATTTTTTCTA 
TTCAC 

GAAC GACTCAGAGGAAa£gT^ 



nT>T<r i raTCC^TTTTTAATTCAGTCCCCATTCTCCGGCTTCTCACCGGAATATTCTATCGG 



rCGCTTTCGGAG... 

GGTTGTGGCGTTGAAGAGGAAACACTCGATGAGACGGAGAATGACCAATAAGAAGACGAA 

^I^^ACTTTGATCACCGCTCCGTGAAGTTAGATAATGTAGTTGTCTTTGAGGA 
AGATAGTGACTTTGATCACCGCl^ tagtcggaotggtgj ^ 



GGAGAGAGTTCAAGAGTCGCTTTCGGAGATTAAATATACCTACGM 

GGTTGTGGCGTTGAAGA 
AGATAGTGACTTTGATC 

ACTTTTGTG ATACTTGG CG <7qia7\ 
>G1266 Amino Acid Sequence (domain in 

m pFLIQSPFSGFSPEYSIGSSPDSFSSSSSimSI.PFNEOTSEKMFLYGLIEQSTQQT^ 
roSDSQDLPIKSVSSRKSEKSYRGVRRRPWGKFAAEIRDST^ 
AYDQAAFSMRGSSAILNFSAERVQESLSEIKYTYEDGCSPVVALKRKHSMRRRMTNKKTK 
DSDFDHRSVKLDNVVVFEDLGEQYLEELLGSSENSGTW* 

aaSataataacac^ 

AACACTTCGTAGAGGGCCATGGCTCGAAGAAGAAGACGAACGGCTAGTGAAGGTCATTAG 
TCTTTTGGGAGAACGTCGTTGGGATTCTTTAGCAATAGTTTCCGGTTTGAAGAGGAGTGG 
TAAGAGTTGCAGGCTAAGGTGGATGAACTATCTGAATCCGACTCTGAAGCGTGGACCGAT 
GAGTCAAGAAGAAGAGAGAATCATCTTTCAGCTCCATGCTCTATGGGGTAACAAGTGGTC 
GAAGATTGCGAGAAGATTACCCGGTAGGACTGATAACGAGATAAAGAACTATTGGAGAAC 
TCATTATAGAAAGAAACAGGAAGCTCAAAACTATGGAAAGCTCTTTGAGTGGAGAGGAAA 
TACAGGAGAAGAATTGTTGCACAAGTATAAGGAAACAGAGATCACTAGGACAAAGACGAC 
GTCTCAAGAACATGGTTTTGTTGAAGTTGTGAGCATGGAAAGTGGTAAAGAAGCCAACGG 
TGGTGTTGGTGGAAGAGAAAGCTTCGGTGTTATGAAATCACCGTATGAAAATCGGATTTC 
GGATTGGATATCAGAGATTTCTACTGACCAGAGTGAAGCAAATCTTTCAGAAGATCACAG 
CAGCAATAGCTGCAGTGAGAACAATATTAACATTGGTACTTGGTGGTTTCAAGAGACTAG 
GGACTTTGAGGAGTTTTCATGTTCTCTATGGTCATAATTCTAAAGTTGGTTTATTTACTT 

TTTAAAAAAAAAAAAAAAAA 

>G1311 Amino Acid Sequence (domain in AA coordinates: 11-112) 
MDFKKEETLRRGPWLEEEDERLVKVISLLGERRWDSLAIVSGLKRSGKSCRLRWMNYLNP 
TLKRGPMSQEEERIIFQLHALWGNKWSKIARRLPGRTDNEIKNYWRTHYRKKQEAQNYGK 
LFEWRGNTGEELLHKYKETEITRTKTTSQEHGFVEVVSMESGKEANGGVGGRESFGVMKS 
PYENRISDWISEISTDQSEANLSEDHSSNSCSENNINIGTWWFQETRDFEEFSCSLWS* 

gScttgtattggSggatcggtatacttagttgattacgtaattaaatagatcggcgt 

GAAGAAGAAAAATGATCATGTGCAGCCGAGGCCATTGGAGACCAGCTGAAGACGAGAAGC 
TCAAGGATCTTGTCGAACAATACGGTCCTCACAATTGGAACGCCATTGCTCTCAAGCTTC 
CTGGTCGCTCTGGTAAGAGTTGTAGATTGAGATGGTTTAATCAATTGGATCCAAGGATCA 
ACCGAAACCCTTTCACGGAAGAAGAAGAAGAAAGACTTTTAGCGGCTCATCGGATCCATG 
GGAACAGATGGTCCATCATCGCAAGGCTTTTCCCTGGAAGAACTGATAACGCCGTCAAGA 
ACCATTGGCACGTCATCATGGCTCGTCGCACACGCCAAACCTCTAAGCCTCGTCTTCTTC 
CCTCGACGACTTCGTCTTCTTCTTTAATGGCGAGTGAACAAATCATGATGAGTTCTGGTG 
GTTATAATCATAATTATAGTTCCGATGATCGGAAGAAAATATTTCCAGCAGACTTTATAA 
ATTTCCCTTACAAATTCTCTCATATCAATCATCTTCACTTCCTAAAGGAGTTTTTCCCCG 
GAAAGATCGCTTTAAGTCACAAAGCAAATCAGAGTAAGAAGCCTATGGAGTTCTACAATT 
TTCTACAAGTAAACACAGATTCAAACAAGAGCGAGATTATAGATCAAGATTCAGGTCAAA 
GCAAACGCAGTGACTCGGACACCAAACATGAAAGTCATGTTCCATTCTTCGACTTTTTAT 
CCGTTGGAAACTCTGCCTCCTAGGATTAGTTTTTTTGCAGTAACTCCTAAATTTCTAGAT 
TAACTATTTAGTCCGTATACGTACGAGATTATCTAGGTCGTTAGdATGTATGCTTGATGT 
GTATAATCACTAACTAGTGAGCTATTACCTGCGAAAATTGTAAGAAAAATACATAATGTT 
GATGTATCACACATTCTCAATGTCTGTAAAATTTCCATCGAGTTGTTAACTATCAAAGTT 

ATCCGTTTGAAAAAAAAAAAA 

>G1321 Amino Acid Sequence (domain in AA coordinates: 4-106) 
MIMCSRGHWRPAEDEKLKDLVEQYGPHNWNAIALKLPGRSGKSCRLRWFNQLDPRINRNP 

FTEEEEERLLAAHRIHGNRWSIIARLFPGRTDNAVKNHWHVIMARRTRQTSKPRLLPSTT 
SSSSLMASEQIMMSSGGYNHNYSSDDRKKIFPADFINFPYKFSHINHLHFLKEFFPGKIA 
LSHKANQSKKPMEFYNFLQVNTDSNKSEIIDQDSGQSKRSDSDTKHESHVPFFDFLSVGN 

SAS* 

>G1326 (32..784) 

CGACGGTACGGTGGAGATAGAGATAGCATCCATGGAGATGTCTAGAGGAAGCAACAGTTT 
TGACAATAAGAAGCCTAGTTGCCAAAGAGGTCACTGGAGACCTGTTGAAGATGACAATCT 
CCGGCAACTCGTTGAACAATACGGTCCCAAGAACTGGAATTTTATTGCTCAACATCTCTA 
TGGAAGATCAGGGAAAAGCTGTAGATTAAGATGGTACAACCAACTTGATCCAAACATCAC 
CAAGAAACCCTTCACCGAGGAGGAAGAAGAGAGACTGCTTAAAGCTCATCGGATCCAAGG 
GAATCGTTGGGCCTCCATAGCCCGACTGTTCCCCGGGAGGACCGACAACGCTGTCAAAAA 
CCATTTTCATGTCATCATGGCTAGACGCAAACGGGAAAACTTCTCTTCCACAGCTACTTC 
TACGTTCAACCAAACTTGGCATACTGTTTTGAGCCCTAGTTCTAGTCTTACAAGGCTAAA 
TAGATCCCATTTCGGGCTATGGAGGTATCGAAAGGATAAGAGTTGCGGTCTCTGGCCTTA 
CTCTTTTGTTTCACCACCTACGAATGGTCAATTTGGATCTTCATCTGTCTCTAACGTACA 
CCACGAAATTTATCTTGAGAGGAGAAAGTCGAAAGAGTTGGTGGATCCTCAGAATTACAC 
ATTTCATGCAGCCACACCAGATCATAAGATGACTTCAAATGAAGATGGACCATCCATGGG 
AGATGATGGTGAGAAGAACGATGTTACTTTCATTGATTTTCTTGGTGTTGGATTAGCTTC 

Sggttataacatcacaagtcaaagcttttaagggtttctatcattagggttaggcatc 

ATTTTCAGCCTTTTGCTTCCTTAAACTCTCATATGGATCT _ 

>G1326 Amino Acid Sequence (domain in AA coordinates: 18-121) 

MEMSRGSNSFDNKKPSCQRGHWRPVEDDNLRQLVEQYGPKNWNFIAQHLYGRSGKSCRLR 

WYNQLDPNITKKPFTEEEEERLLKAHRIQGNRWASIARLFPGRTDNAVKNHFHVIMARRK 

RENFSSTATSTFNQTWHTVLSPSSSLTRLNRSHFGLWRYRKDKSCGLWPYSFVSPPTNGQ 

FGSSSVSNVHHEIYLERRKSKELVDPQNYTFHAATPDHKMTSNEDGPSMGDDGEKNDVTF 

IDFLGVGLAS* 

TCCTOCCACAAAACCTTTTTAATTTTATCTGAAAAATTAAAACAACCGAAACAAAAAA^ 

AAAACTAAAAATCAAAAATCTCATCACCTTCCTTGCTCTGTATTTTTTCTCTCTCACTAA 

ATCCTCCATGGATCCTTCTCTCTCTGCAACCAATGATCCTCATCATCCTCCTCCTCCTCA 

GTTCACATCTTTCCCTCCTTTCACCAACACCAACCCCTTCGCCTCTCCAAACCACCCCTT 

CTTCACCGGACCCACCGCCGTCGCGCCGCCAAACAACATCCATCTCTATCAAGCAGCTCC 

TCCGCAGCAGCCACAAACATCTCCAGTTCCTCCTCATCCATCTATTTCCCACCCTCCTTA 

CTCTGACATGATTTGCACGGCGATTGCAGCGTTAAACGAACCAGATGGGTCAAGCAAGCA 

AGCTATTTCGAGGTACATAGAGAGAATTTACACTGGGATTCCTACTGCTCATGGAGCTTT 

GTTGACACACCATCTCAAGACTTTGAAGACCAGTGGGATTCTTGTCATGGTTAAGAAATC 

TTACAAGCTTGCTTCTACTCCTCCTCCTCCTCCTCCTACTAGTGTAGCTCCTAGTCTTGA 

ACCTCCCAGATCTGATTTCATAGTCAACGAGAACCAACCTTTACCTGATCCGGTTTTGGC 

TTCTTCTACTCCTCAGACTATTAAACGTGGTCGTGGTCGACCTCCAAAAGCTAAACCAGA 

TGTTGTTCAACCTCAACCTCTGACTAATGGAAAACTCACCTGGGAACAGAGTGAATTACC 

TGTCTCTCGACCAGAGGAGATACAGATACAGCCGCCACAGTTACCGTTACAGCCACAGCA 

GCCGGTTAAGAGACCGCCGGGTCGTCCTAGAAAAGATGGAACTTCGCCGACGGTGAAGCC 
RGCTGCTTCTGTTTCCGGTGGTGTGGAGACTGTGAAACGAAGAGGTAGACCTCCGAGTGG 

AAGAGCTGCTGGGAGGGAGAGAAAGCCTATAGTAGTCTCAGCTCCAGCTTCAGTGTTCCC 

GTATGTTGCTAATGGTGGTGTTAGACGCCGAGGGAGACCAAAGAGAGTTGACGCTGGTGG 

TGCTTCCTCTGTTGCTCCACCACCACCACCACCAACTAACGTAGAGAGTGGAGGAGAGGA 

GGTTGCAGTCAAGAAACGAGGAAGAGGACGGCCTCCTAAGATTGGAGGTGTTATCAGGAA 

GCCTATGAAGCCGATGAGAAGCTTTGCTCGTACTGGAAAACCCGTAGGAAGACCCAGAAA 

GAATGCGGTGTCAGTGGGAGCTTCTGGACGACAAGATGGTGACTATGGAGAACTGAAGAA 

GAAGTTTGAGTTGTTTCAAGCGAGAGCTAAGGATATTGTAATTGTGTTGAAATCCGAGAT 

AGGAGGAAGTGGAAATCAAGCAGTGGTTCAAGCCATACAGGACCTGGAAGGGATAGCAGA 

GACAACAAACGAGCCAAAGCACATGGAAGAAGTGCAGCTGCCAGACGAGGAACACCTTGA 

AACCGAACCAGAAGCAGAGGGTCAAGGACAGACAGAAGCAGAGGCAATGCAAGAAGCTCT 

GTTCTAAAGATAAAGCCTTGACATAAAAAGCTAGCAAGTGGTGGGTTTACTTGTTGTGTG 

TTACATGAAATTTTTAATCTTATAAGGGTGTTTGCAGGAGAAAAACAAAAAGAACAATGT 

GATGAACTGATGATGATGATTGTGTCTCTAACCAAACAACAAGGAGAGGTAGGGTAATGT 

CTGTAAAGTGAATTAGGATGTTACCATTGTTCATGCTTCCCATCTCTCTCCATCGTCCAT 

ATCTGTGTAGGCAGCTTTGTTCTTTGTTCCCTCGTGTrTTTTTTAGACTGTTGTGTCTCT 

TATTCTATTTTGTCTCCTTAGGCTrrTTAGGAGTTGTTGTTGATGTTTATCAAAAACGCT 

TATGTAATTTTTATGACCACTTCTACrTTTTATGATGGTTTCTT 
>G1367 Amino Acid Sequence (domain in AA coordinates: 179-201, 262-285, 298-319, 225-357) 

MDPSLSATNDPHHPPPPQFTSFPPFTNTNPFASPNHPFFTGPTAVAPPNNIHLYQAAPPQ 
QPQTSPVPPHPSISHPPYSDMICTAIAALNEPDGSSKQAISRYIERIYTGIPTAHGALLT 
HHLKTLKTSGILVMVKKSYKLASTPPPPPPTSVAPSLEPPRSDFIVNENQPLPDPVLASS 
TPQTIKRGRGRPPKAKPDVVQPQPLTNGKLTWEQSELPVSRPEEIQIQPPQLPLQPQQPV 
KRPPGRPRKDGTSPTVKPAASVSGGVETVKRRGRPPSGRAAGRERKPIVVSAPASVFPYV 
ANGGVRRRGRPKRVDAGGASSVAPPPPPPTNVESGGEEVAVKKRGRGRPPKIGGVIRKPM 
KPMRSFARTGKPVGRPRKNAVSVGASGRQDGDYGELKKKFELFQARAKDIVIVLKSEIGG 
SGNQAVVQAIQDLEGIAETTNEPKHMEEVQLPDEEHLETEPEAEGQGQTEAEAMQEALF* 
>G1386 (89..673) 
AATTTTATTTCCTTCTCTCAAATCTTCCCACCAAAAATTAACTCTTTCGTTCACACTAAG 
TCCCTTTTAAAAGAAAATATCCCAATTAATGGAACGTGACGACTGCCGGAGATTTCAGGA 
CTCGCCGGCGCAGACGACGGAGAGAAGAGTGAAATATAAACCAAAGAAGAAAAGAGCCAA 
AGATGATGATGATGAGAAAGTTGTTTCGAAGCATCCAAATTTTCGAGGTGTCAGAATGAG 
ACAATGGGGAAAATGGGTGTCCGAAATCAGAGAGCCAAAAAAGAAATCAAGAATCTGGCT 
CGGTACTTTCTCCACGGCGGAGATGGCGGCGCGTGCTCACGACGTGGCAGCTTTAGCCAT 
CAAAGGCGGTTCTGCACATCTCAACTTCCCGGAGCTCGCTTATCACCTCCCTAGACCAGC 
TAGTGCCGACCCTAAAGACATCCAAGCTGCCGCCGCCGCAGCTGCAGCCGCTGTGGCCAT 
TGACATGGATGTAGAGACGTCTTCGCCGTCGCCATCTCCCACAGTTACGGAAACGTCATC 
TCCGGCTATGATAGCACTCTCCGACGACGCGTTCTCCGATCTTCCTGATCTCTTGCTCAA 
CGTGAACCATAACATCGATGGCTTCTGGGACTCTTTTCCCTATGAAGAACCCTTCCTCTC 
TCAAAGTTACTAGAAACTCAAAACTATGTCGTTTTTGTATGTATTTTTGTCATGTGACCA 
TTTTTTGACGTCGAAAATCACCCGGAT7VATCCAAATTGTATGATTTATTAATGGTTGATG 
ATTTTCTTTGTGTGGAACAATGTGTATGATACGTAATCAAAAGTTCAAAAAAAAAAT7UVA 

AAAAA 

>G1386 Amino Acid Sequence (domain in AA coordinates: TBD) 
MERDDCRRFQDSPAQTTERRVKYKPKKKRAKDDDDEKVVSKHPNFRGVRMRQWGKWVSEI 

REPKKKSRIWLGTFSTAEMAARAHDVAALAIKGGSAHLNFPELAYHLPRPASADPKDIQA 
AAAAAAAAVAIDMDVETSSPSPSPTVTETSSPAMIALSDDAFSDLPDLLLNVNHNIDGFW 
DSFPYEEPFLSQSY* 
>G1421 (292..1155) 

GAAATTTCATCCCT7y\AT7^AGAAAAAAGCATCTCCTTCTTTAGTGTCCTCCTTCACCAAA 

CTCTTGATTCCATAAGCATATATTAAAAAAGCTCTCTGCTTTCTTCAACTTTCCCGGGAA 

AATCTTCTTGTTACAAAGCATCAATCTCTTGTTTTACCAATTTTCTCTCTTTATTCCTTT 

TTTGCCCTTTACTTTTCCTAACTTTGGTCTTTATATATAAACACACGACACAAAGAAGAA 

CACACATAAGTTAAAACTATTACAACAGTTTTAAAGAGAGAGATTTAAAAAATGGAGACA 

GAGAAGAAAGTTTCTCTCCCAAGAATCTTACGAATCTCTGTTACTGATCCTTACGCAACA 

GATTCGTCAAGCGACGAAGAAGAAGAAGTTGATTTTGATGCATTATCTACAAAACGACGT 

CGTGTTAAGAAGTACGTGAAGGAAGTGGTGCTTGATTCGGTGGTTTCTGATAAAGAGAAG 

CCGATGAAGAAGAAGAGAAAGAAGCGCGTTGTTACTGTTCCAGTGGTTGTTACGACGGCG 

ACGAGGAAGTTTCGTGGAGTGAGGCAAAGACCGTGGGGAAAATGGGCGGCGGAGATTAGA 

GATCCGAGTAGACGTGTTAGGGTTTGGTTAGGTACTTTTGACACGGCGGAGGAAGCTGCC 

ATTGTTTACGATAACGCAGCTATTCAGCTACGTGGTCCTAACGCAGAGCTTAACTTCCCT 

CCTCCTCCGGTGACGGAGAATGTTGAAGAAGCTTCGACGGAGGTGAAAGGAGTTTCGGAT 

TTTATCATTGGCGGTGGAGAATGTCTTCGTTCGCCGGTTTCTGTTCTCGAATCTCCGTTC 

TCCGGCGAGTCTACTGCGGTTAAAGAGGAGTTTGTCGGTGTATCGACGGCGGAGATTGTG 

GTTAAAAAGGAGCCGTCTTTTAACGGTTCAGATTTCTCGGCGCCGTTGTTCTCGGACGAC 

GACGTTTTTGGTTTCTCGACGTCGATGAGTGAAAGTTTCGGCGGCGATTTATTTGGAGAT 

AATCTTTTTGCGGATATGAGTTTTGGATCCGGGTTTGGATTCGGGTCTGGGTCTGGATTC 

TCCAGCTGGCACGTTGAGGACCATTTTCAAGATATTGGGGATTTATTCGGGTCGGATCCT 

GTCTTAACTGTTTAAGAAATAACTGGCCGTTTAACGGCGTTTAGTGAAGTTTTGTTACCG 

GCGACGGCGAGGATTAAAAAAAAACGGCGATTTATTTTTTGAATGAAGATTTGTTAAATA 

>G1421 Amino Acid Sequence (domain in AA coordinates: 74-151) 

METEKKVSLPRILRISVTDPYATDSSSDEEEEVDFDALSTKRRRVKKYVKEVVLDSVVSD 

KEKPMKKKRKKRVVTVPVVVTTATRKFRGVRQRPWGKWAAEIRDPSRRVRVWLGTFDTAE 

EAAIVYDNAAIQLRGPNAELNFPPPPVTENVEEASTEVKGVSDFIIGGGECLRSPVSVLE 

SPFSGESTAVKEEFVGVSTAEIVVKKEPSFNGSDFSAPLFSDDDVFGFSTSMSESFGGDL 

FGDNLFADMSFGSGFGFGSGSGFSSWHVEDHFQDIGDLFGSDPVLTV* 

>G1453 (39..917) 

CGTCGACGCGAAATAAATCCTAGAAAATAACTATC7UVTATGATGAAGGTTGATCAAGATT 
ATTCGTGTAGTATACCGCCTGGATTTAGGTTTCATCCGACAGATGAAGAACTTGTCGGAT 
ATTATCTCAAGAAGAAAATCGCCTCCCAGAGGATTGATCTCGACGTTATCAGAGAAATTG 
ATCTTTACAAGATCGAACCATGGGATCTACAAGAGAGATGTAGGATAGGGTACGAGGAGC 
AAACGGAGTGGTATTTCTTCAGCCATAGAGACAAGAAGTATCCGACTGGGACTAGGACAA 
ACCGAGCCACCGTGGCCGGTTTCTGGAAAGCAACGGGCCGGGACAAGGCGGTTTACCTCA 
ACTCCAAACTTATCGGTATGAGAAAAACGCTTGTCTTTTACCGAGGTCGAGCGCCTAATG 
GCCAAAAGTCCGATTGGATCATTCACGAATACTACAGCCTCGAGTCACACCAGAACTCTC 
CTCCACAGGAAGAAGGATGGGTAGTGTGTAGAGCATTTAAGAAACGAACGACCATCCCAA 
CAAAAAGGAGGCAACTTTGGGATCCGAACTGCTTATTCTACGACGACGCCACTCTCTTGG 
AACCTCTCGACAAGCGAGCCAGACATAATCCTGATTTTACCGCCACACCGTTCAAGCAAG 
AACTACTCTCCGAGGCCAGTCACGTCCAGGATGGAGATTTCGGATCTATGTACCTTCAAT 
GCATCGATGATGATCAATTCTCCCAGCTTCCTCAGCTCGAGAGCCCCTCTCTTCCGTCGG 
AAATAACTCCCCATAGTACTACTTTTTCTGAGAACAGTAGCCGGAAAGATGACATGAGCT 
CCGAGAAGAGGATCACTGACTGGAGATATCTAGATAAGTTCGTGGCGTCTCAATTTTTGA 
TGAGTGGAGAAGACTAAAAAAGGCTTTCCTATGCATGCATGCACTAGAAACGTCGTCGCA 

TTTTGGATTTACATGCGGCCGCT 

>G1453 Amino Acid Sequence (conserved domain in AA coordinates: 13-160) 

MMKVDQDYSCSIPPGFRFHPTDEELVGYYLKKKIASQRIDLDVIREIDLYKIEPWDLQER 

CRIGYEEQTEWYFFSHRDKKYPTGTRTNRATVAGFWKATGRDKAVYLNSKLIGMRKTLVF 

YRGRAPNGQKSDWIIHEYYSLESHQNSPPQEEGWVVCRAFKKRTTIPTKRRQLWDPNCLF 

YDDATLLEPLDKRARHNPDFTATPFKQELLSEASHVQDGDFGSMYLQCIDDDQFSQLPQL 

ESPSLPSEITPHSTTFSENSSRKDDMSSEKRITDWRYLDKFVASQFLMSGED* 

ATCOTTCAATTTCCACTCCTCTCTAATATAArrCACATTTTCCCACTATTGCTGATTCA 

TTTTTTTTTGTGAATTATTTCAAACCCACATAAAAAAATCTTTGTTTAAATTTAAAACCA 

TGGATCCTTCATTTAGGTTCATTAAAGAGGAGTTTCCTGCTGGATTCAGTGATTCTCCAT 

CACCACCATCTTCTTCTTCATACCTTTATTCATCTTCCATGGCTGAAGCAGCCATAAATG 

ATCCAACAACATTGAGCTATCCACAACCATTAGAAGGTCTCCATGAATCAGGGCCACCTC 

CATTTTTGACAAAGACATATGACTTGGTGGAAGATTCAAGAACCAATCATGTCGTGTCTT 

GGAGCAAATCCAATAACAGCTTCATTGTCTGGGATCCACAGGCCTTTTCTGTAACTCTCC 

TTCCCAGATTCTTCAAGCACAATAACTTCTCCAGTTTTGTCCGCCAGCTCAACACATATG 

GTTTCAGAAAGGTGAATCCGGATCGGTGGGAGTTTGCAAACGAAGGGTTTCTTAGAGGGC 

AAAAGCATCTCCTCAAGAACATAAGGAGAAGAAAAACAAGTAATAATAGTAATCAAATGC 

AACAACCTCAAAGTTCTGAACAACAATCTCTAGACAATTTTTGCATAGAAGTGGGTAGGT 

ACGGTCTAGATGGAGAGATGGACAGCCTAAGGCGAGACAAGCAAGTGTTGATGATGGAGC 

TAGTGAGACTAAGACAGCAACAACAAAGCACGAAAATGTATCTCACATTGATTGAAGAGA 

AGCTCAAGAAGACCGAGTCAAAACAAAAACAAATGATGAGCTTCCTTGCCCGCGCAATGC 

AGAATCCAGATTTTATTCAGCAGCTAGTAGAGCAGAAGGAAAAGAGGAAAGAGATCGAAG 

AGGCGATCAGCAAGAAGAGACAAAGACCGATCGATCAAGGAAAAAGAAATGTGGAAGATT 

ATGGTGATGAAAGTGGTTATGGGAATGATGTTGCAGCCTCATCCTCAGCATTGATTGGTA 

TGAGTCAGGAATATACATATGGAAACATGTCTGAATTCGAGATGTCGGAGTTGGACAAAC 

TTGCTATGCACATTCAAGGACTTGGAGATAATTCCAGTGCTAGGGAAGAAGTCTTGAATG 

TGGAAAAAGGAAATGATGAGGAAGAAGTAGAAGATCAACAACAAGGGTACCATAAGGAGA 

ACAATGAGATTTATGGTGAAGGTTTTTGGGAAGATTTGTTAAATGAAGGTCAAAATTTTG 

ATTTTGAAGGAGATCAAGAAAATGTTGATGTGTTAATTCAGCAACTTGGTTATTTGGGTT 

CTAGTTCACACACTAATTAAGAAGAAATTGAAATGATGACTACTTTAAGCATTTGAATCA 

ACTTGTTTCCTATTAGTAATTTGGCTTTGTTTCAATCAAGTGAGTCGTGGACTAACTTGC 

>G1560 Amino Acid Sequence (domain in AA coordinates: 62-151) 

MDPSFRFIKEEFPAGFSDSPSPPSSSSYLYSSSMAEAAINDPTTLSYPQPLEGLHESGPP 

PFLTKTYDLVEDSRTNHVVSWSKSNNSFIVWDPQAFSVTLLPRFFKHNNFSSFVRQLNTY 

GFRKVNPDRWEFANEGFLRGQKHLLKNIRRRKTSNNSNQMQQPQSSEQQSLDNFCIEVGR 

YGLDGEMDSLRRDKQVLMMELVRLRQQQQSTKMYLTLIEEKLKKTESKQKQMMSFLARAM 

QNPDFIQQLVEQKEKRKEIEEAISKKRQRPIDQGKRNVEDYGDESGYGNDVAASSSALIG 

MSQEYTYGNMSEFEMSELDKLAMHIQGLGDNSSAREEVLNVEKGNDEEEVEDQQQGYHKE 

NNEIYGEGFWEDLLNEGQNFDFEGDQENVDVLIQQLGYLGSSSHTN* 

>G1594 (1..984) 

ATGGATGGAATGTACAATTTCCATTCGGCCGGTGATTATTCAGATAAGTCGGTTCTGATG 

ATGTCACCGGAGAGTCTCATGTTTCCTTCCGATTACCAAGCTTTGCTATGTTCCTCCGCC 

GGTGAAAATCGTGTCTCTGATGTTTTCGGATCCGACGAGCTACTCTCAGTAGCCGTCTCC 

GCTTTGTCGTCGGAGGCGGCTTCGATCGCTCCGGAGATCCGAAGAAATGATGATAACGTT 

TCTCTAACTGTCATCAAAGCTAAAATCGCTTGTCATCCTTCGTATCCTCGCTTACTTCAA 



GCTOACATCGATTGCOAAAGGTCGGAGCACCACCGGAGATAGCGTGTTTACTAGAGGAG 
ATTCAACGGGAGAGTGATGTTTATAAGCAAGAGGTTGTTCCTTCTTCTTGCTTTGGAGCT 
GATCCTGAGCTTGATGAATTTATGGAAACGTACTGCGATATATTAGTGAAATACAAATCG 
GATCTAGCAAGACCGTTTGACGAGGCAACGTGTTTCTTGAACAAGATTGAGATGCAGCTA 
CGGAACCTATGTACTGGTGTCGAGTCTGCCAGGGGAGTTTCTGAGGATGGTGTAATATCA 
TCTGACGAGGAACTGAGTGGAGGTGATCATGAGGTAGCAGAGGATGGGAGACAAAGATGT 
GAAGACCGGGACCTCAAAGATAGGTTGCTACGCAAATTTGGAAGCCGTATTAGTACTTTA 
AAGCTTGAGTTCTCAAAGAAGAAGAAGAAAGGAAAGTTACCAAGAGAAGCAAGACAAGCT 
CTTCTTGATTGGTGGAATCTCCATTATAAGTGGCCTTACCCTACTGAAGGAGATAAGATA 
GCATTAGCTGATGCAACGGGGTTAGACCAAAAACAAATCAACAATTGGTTTATAAACCAA 
AGGAAACGTCATTGGAAGCCATCAGAGAATATGCCTTTCGCTATGATGGATGATTCTAGT 
GGATCATTCTTTACCGAGGAATGA 

>G1594 Amino Acid Sequence (conserved domain in AA coordinates: 343-308) 

MDGMYNFHSAGDYSDKSVLMMSPESLMFPSDYQALLCSSAGENRVSDVFGSDELLSVAVS 

ALSSEAASIAPEIRRNDDNVSLTVIKAKIACHPSYPRLLQAYIDCQKVGAPPEIACLLEE 

IQRESDVYKQEVVPSSCFGADPELDEFMETYCDILVKYKSDLARPFDEATCFLNKIEMQL 

RNLCTGVESARGVSEDGVISSDEELSGGDHEVAEDGRQRCEDRDLKDRLLRKFGSRISTL 

KLEFSKKKKKGKLPREARQALLDWWNLHYKWPYPTEGDKIALADATGLDQKQINNWFINQ 

RKRHWKPSENMPFAMMDDSSGSFFTEE* 

>G1750 (94..1101) 

CCCTTTTCCTCTCTTTCTCCAAATCTCTGAAAATTTTCACCAGAATCTCTGTTCTTTTTT 

TCACCAGAATCTCTCTGTTTAAAATAATAGGTGATGATGATGGATGAGTTTATGGATCTT 

AGACCAGTGAAGTACACAGAGCACAAGACTGTTATCAGAAAGTACACTAAAAAGTCGTCT 

ATGGAGAGGAAGACCAGTGTTCGTGACTCGGCCAGGTTGGTTCGGGTCTCAATGACGGAT 

CGTGACGCCACTGATTCATCAAGCGACGAGGAAGAGTTTCTGTTCCCTCGAAGACGTGTC 

AAGAGATTGATTAACGAGATCAGAGTCGAGCCTAGCAGCTCTTCCACCGGCGACGTCTCT 

GCTTCTCCGACGAAGGACCGGAAAAGAATCAACGTTGATTCTACGGTTCAAAAGCCCTCT 

GTTTCCGGCCAAAACCAGAAGAAGTACCGCGGCGTGAGACAGCGACCATGGGGAAAATGG 

GCGGCGGAGATTCGTGATCCTGAGCAACGCCGGAGAATCTGGCTCGGTACTTTTGCAACG 

GCGGAGGAAGCTGCCATCGTCTACGACAACGCAGCAATCAAACTTCGTGGCCCTGATGCT 

CTTACCAACTTCACCGTACAACCAGAACCAGAACCGGTACAAGAACAAGAACAAGAACCG 

GAGAGCAACATGTCGGTTTCGATATCAGAATCAATGGACGATTCTCAACATCTATCATCT 

CCGACATCGGTTCTCAACTACCAAACATATGTCTCGGAGGAACCAATCGATAGTCTTATC 

AAACCGGTTAAACAAGAGTTTCTTGAACCAGAACAAGAGCCAATAAGCTGGCATCTTGGA 

GAAGGTAATACTAATACTAATGATGATTCATTTCCATTGGACATTACATTTCTCGACAAC 

TATTTCAATGAATCATTACCAGACATCTCCATCTTCGATCAACCTATGTCTCCTATTCAA 

CCAACAGAGAATGATTTCTTC7VACGACCTTATGTTATTCGATAGCAACGCAGAAGAATAC 

TACTCCTCCGAGATCAAAGAGATTGGTTCATCGTTCAACGATCTTGATGATTCTTTGATA 

TCCGATCTCTTACTTGTGTGATATTTTTGCCATTAACCAAACACCGGTTTGGTTGC 

>G1750 Amino Acid Sequence (domain in AA coordinates: 107-173) 

MMMDEFMDLRPVKYTEHKTVIRKYTKKSSMERKTSVRDSARLVRVSMTDRDATDSSSDEE 

EFLFPRRRVKRLINEIRVEPSSSSTGDVSASPTKDRKRINVDSTVQKPSVSGQNQKKYRG 

VRQRPWGKWAAEIRDPEQRRRIWLGTFATAEEAAIVYDNAAIKLRGPDALTNFTVQPEPE 

PVQEQEQEPESNMSVSISESMDDSQHLSSPTSVLNYQTYVSEEPIDSLIKPVKQEFLEPE 

QEPISWHLGEGNTNTNDDSFPLDITFLDNYFNESLPDISIFDQPMSPIQPTENDFFNDLM 

LFDSNAEEYYSSEIKEIGSSFNDLDDSLISDLLLV* 

>G1947 (70..918) 

ACAACTATTCTCTCCTCTCTCTTTTTTTATT^ 

GTTCAGAAAATGGATTATAACCTTCCAATTCCATTAGAGGGTCTCA7VAGAAACGCCACCA 
ACGGCTTTCTTGACGAAAACATACAACATAGTGGAGGATTC^ 

TCATGGAGCAGAGASAACAACAGCTTCATTGTTTGGGAACCAGAGACTTTTGCCCTAATT 

TGCCTCCCTAGATGCTTTAAGCACAATAATTTCTCCAGCTTTGTTAGACAGCTCAATACT 

TATGGGTTTAAGAAGATTGATACAGAGAGATGGGAATTTGCAAATGAGCATTTTCTGAAG 

GGAGAGAGGCATCTTCTTAAGAACATGAAGAGAAGAAAGACATC^^ 

CAGTCGCTAGAAGGAGAGATCCATGAGCTGCGAAGAGACAGAATGGGTTTAGAAGTAGAA 

CTGGTTAGACTGCGACGAAAACAAGAAAGCGTGAAGACATATCTGCATTTGATGGAAGAG 

AAACTGAAAGTCACAGAAGTAAAGCAAGAAATGATGATGAATTTCTTGCTAAAGAAGATT 

AAGAAACCGAGTTTTTTACAGAGCTTAAGGAAACGTAATCTGCAAGGAATCAAGAATCGA 

GAGCAAAAGCAAGAGGTGATCTCAAGCCATGGTGTTGAGGATAATGGAAAGTTTGTTAAA 

GCTGAGCCAGAAGAGTATGGTGATGACATCGATGATCAATGTGGAGGTGTGTTTGATTAT 
GGTGATGAGCTTCACATAGCTTCAATGGAGCATCAAGGACAAGGGGAGGATGAAATTGAA 

ATGGATAGTGAAGGAATTTGGAAGGGTTTCGTGTTGAGTGAGGAGGAGATGTGTGATTTA 

GTGGAACATTTTATATAATAAACTAATGTATTATGAGAGGTTTTTTTTTGTTTTTTTGCT 

TTTTTTTTCCGAGTTTGTCATCAAGCATTGTATACAATTTGGGCCAAACTAAAAGCCCAA 

CAAAATATTTGGCCTTGGCATTTGTTAACAAATTGACTAATTCGGCCACACCTTCC 

>G1947 Amino Acid Sequence (domain in AA coordinates: 37-120) 

MDYNLPIPLEGLKETPPTAFLTKTYNIVEDSSTNNIVSWSRDNNSFIVWEPETFALICLP 

RCFKHNNFSSFVRQLNTYGFKKIDTERWEFANEHFLKGERHLLKNIKRRKTSSQTQTQSL 

EGEIHELRRDRMALEVELVRLRRKQESVKTYLHLMEEKLKVTEVKQEMMMNFLLKKIKKP 

SFLQSLRKRNLQGIKNREQKQEVISSHGVEDNGKFVKAEPEEYGDDIDDQCGGVFDYGDE 

LHIASMEHQGQGEDEIEMDSEGIWKGFVLSEEEMCDLVEHFI* 

>G2011 (309..1547) 

AATGTCGGTTGTACAATTATTTGTCACTAAAGTTTCCAAATTTCTTCTAAACTGATGAAT 

CT^ATGGAACATGATGACGAAAAAGATAAATCCACGGTGGCGGGAACTGACCCACCCATTT 

CCACCGCCTCTCTATTCCCCAGATTTTTTTCAATTATCTGACTACAGTTTGTCGGTTACT 

TCCTTCCCTAAACCTTTATAAACCATTAAACCTCTCATCCTTCTTCTCTTAAACCCCCTA 

ATTATCACACACACCCCAATTTCTCACTCTCTCTCTCACTAAAACCCGTAAATTTTCTAC 

TATATCAAATGAGCCCAAAAAAAGATGCTGTTTCTAAACCAACTCCT^ATTTCAGTACCCG 

TTTCGAGACGATCCGATATACCCGGGTCTCTCTACGTCGACACTGACATGGGTTTCTCTG 

GGTCACCACTTCCCATGCCACTAGACATCTTACAAGGGAATCCAATTCCACCTTTTTTAT 

CCAAGACTTTTGATTTGGTTGATGACCCGACTCTTGACCCGGTCATCTCTTGGGGACTGA 

CCGGAGCTAGCTTCGTAGTTTGGGATCCTCTAGAGTTTGCCAGAATCATACTTCCAAGGA 

ATTTCAAACACAACAATTTCTCCAGCTTCGTCAGACAGCTTAACACTTATGGATTTCGAA 

AGATTGATACTGACAAGTGGGAATTCGCTAACGAGGCTTTCCTTAGAGGCAAGAAGCATC 

TTCTGAAGAACATTCATCGTCGTCGATCACCACAATCCAACCAAACTTGCTGCAGTAGCA 

CTAGCCAAAGCCAAGGGTCACCTACTGAGGTTGGAGGAGAGATTGAGAAGCTGAGGAAAG 

AGCGGCGTGCATTGATGGAGGAAATGGTTGAGCTTCAGCAGCAAAGCAGAGGCACAGCTC 

GACATGTGGACACTGTAAACCAGAGGCTGAAAGCTGCAGAGCAACGTCAGAAGCAATTGC 

TCTCTTTCTTGGCTAAGTTGTTTCAGAACCGGGGTTTCTTGGAACGCCTGAAGAACTTCA 

AAGGAAAAGAAAAAGGAGGAGCTCTTGGATTGGAAAAGGCGAGAAAGAAGTTCATCAAGC 

ACCACCAGCAGCCTCAAGATTCTCCAACAGGAGGGGAGGTGGTGAAGTATGAAGCTGATG 

ATTGGGAGAGATTGCTAATGTATGACGAAGAGACTGAGAACACCAAGGGTTTAGGAGGGA 

TGACTTCAAGCGATCCAAAAGGCAAGAACTTGATGTATCCATCAGAAGAAGAGATGAGCA 

AACCAGATTACTTGATGTCCTTCCCATCTCCTGAAGGACTTATTAAACAAGAAGAGACGA 

CATGGAGCATGGGTTTCGATACTACAATACCGAGTTTCAGCAACACCGATGCATGGGGAA 

ACACAATGGACTATAATGATGTCTCAGAGTTTGGTTTTGCTGCAGAAACAACAAGTGATG 

GTTTGCCTGATGTCTGCTGGGAACAATTTGCTGCAGGAATCACAGAGACTGGATTCAACT 

GGCCAACTGGTGATGATGATGATAATACGCCAATGAATGATCCTTAGGATCTTTTCATAT 

ATAGTTTAGACCAAAAACCCGTTTCTTATCGGGTGAACTATTAATTCATTATTCATTTTG 

AATGCACTCTTTATACATATATATAATATTGATGAGTTTGATTGTTCC^AAAAAAAA 

>G2011 Amino Acid Sequence (domain in AA coordinates: 56-147) 

MSPKKDAVSKPTPISVPVSRRSDIPGSLYVDTDMGFSGSPLPMPLDILQGNPIPPFLSKT 

FDLVDDPTLDPVISWGLTGASFVVWDPLEFARIILPRNFKHNNFSSFVRQLNTYGFRKID

TDKWEFANEAFLRGKKHLLKNIHRRRSPQSNQTCCSSTSQSQGSPTEVGGEIEKLRKERR

ALMEEMVELQQQSRGTARHVDTVNQRLKAAEQRQKQLLSFLAKLFQNRGFLERLKNFKGK

EKGGALGLEKARKKFIKHHQQPQDSPTGGEVVKYEADDWERLLMYDEETENTKGLGGMTS

SDPKGKNLMYPSEEEMSKPDYLMSFPSPEGLIKQEETTWSMGFDTTIPSFSNTDAWGNTM

DYNDVSEFGFAAETTSDGLPDVCWEQFAAGITETGFNWPTGDDDDNTPMNDP*

>G2094 (1..450) 

ATGCTAGATCCCACCGAGAAAGTAATCGATTCAGAATCAATGGAAAGCAAACTCACATCA 

GTAGATGCGATCGAAGAACACAGCAGCAGTAGCAGTAATGAAGCTATCAGCAACGAGAAG 

AAGAGTTGTGCC^TTTGTGGTACCAGCAAAACCCCTCTTTGGCGAGGCGGTCCTGCCGGT 

CCCAAGTCGCTTTGTAACGCATGCGGGATCAGAAACAGAAAGAAAAGAAGAACAC 

TCAAATAGAT(^GAAGATAAGAAGAAGAAGAGTCATAACAGAAACCCGAAGTTTGGTGAC 

TCGTTGAAGCAGCGATTAATGGAATTGGGGAGAGAAGTGATGATGCAGCGATCAACGGCT 

GAGAATCAACGGCGGAATAAGCTTGGCGAAGAAGAGCAAGCCGCCGTGTTACTCATGGCT 

CTCTCTTATGCTTCTTCCGTTTATGCTTAA 
>G2094 Amino Acid Sequence (domain in AA coordinates: 43-68)

MLDPTEKVIDSESMESKLTSVDAIEEHSSSSSNEAISNEKKSCAICGTSKTPLWRGGPAG 

PKSLCNACGIRNRKKRRTLISNRSEDKKKKSHNRNPKFGDSLKQRLMELGREVMMQRSTA 

ENQRRNKLGEEEQAAVLLMALSYASSVYA* 

>G2113 (90..590)

ATAACAAACTCATCAAACTTCCTCAGCGTTTCTTTTTCTTACATAAACAATTTTTCTTAC 
ATAAACAAATCTTGTTGTTTGTTGTTGTCATGGCACCGACAGTTAAAACGGCGGCCGTCA 
AAACCAACGAAGGTAACGGAGTCCGTTACAGAGGAGTGAGGAAGAGACCATGGGGACGTT 
ACGCAGCCGAGATCAGAGATCCTTTCAAGAAGTCACGTGTCTGGCTCGGTACTTTCGACA 
CTCCTGAAGAAGCCGCTCGTGCCTACGACAAACGTGCTATTGAGTTTCGTGGAGCTAAAG
CCAAAACCAACTTCCCTTGTTACAACATCAACGCCCACTGCTTGAGTTTGACACAGAGCC 
TGAGCCAGAGCAGCACCGTGGAATCATCGTTTCCTAATCTCAACCTCGGATCTGACTCTG 
TTAGTTCGAGATTCCCTTTTCCTAAGATTCAGGTTAAGGCTGGGATGATGGTGTTCGATG 
AAAGGAGTGAATCGGATTCTTCGTCGGTGGTGATGGATGTCGTTAGATATGAAGGACGAC 
GTGTGGTTTTGGACTTGGATCTTAATTTCCCTCCTCCACCTGAGAACTGATTAAGATTTA 
ATTATGATTATTAGATATAATTAAATGTTTCTGAATTGAG 

>G2113 Amino Acid Sequence (domain in AA coordinates: TBD) 

mPTVKTAAVKTNEGNGVRYRGTOKKPWGRYAAEIRDPFKKSRVWLGTFDTPEEAARAYD 

KRAIEFRGAKAKTNFPCYNINAHCLSLTQSLSQSSTVESSFPNLNLGSDSVSSRFPFPKI 

QVKAGMMVFDERSESDSSSWMDWRYEGRRWLDLDLNFPPPPEN* 

>G2115 (41..733)

AATCACTCTACAAAGCCTGTACGTACACAACAACATTACCATGGTGAAACAAGAACGCAA 
GATCCAAACCAGCAGCACAAAAAAGGAAATGCCTTTGTCATCATCACCATCTTCTTCTTC 
TTCTTCATCTTCTTCCTCGTCTTCGTCTTCGTGTAAGAACAAGAACAAGAAGAGTAAGAT 
TAAGAAGTACAAAGGAGTGAGGATGAGAAGTTGGGGATCATGGGTCTCTGAGATTAGGGC 
ACCAAATCAAAAGACAAGGATTTGGTTAGGTTCTTACTCAACAGCTGAAGCAGCTGCTAG 
AGCTTACGATGTTGCACTCTTATGTCTCAAAGGCCCTCAAGCCAATCTCAACTTCCCTAC 
TTCTTCTTCTTCTCATCATCTTCTTGATAATCTCTTAGATGAAAATACCCTTTTGTCCCC 
CAAATCCATCCAAAGAGTAGCTGCTCAAGCTGCCAACTCATTTAACCATTTTGCCCCTAC 
TTCATCAGCCGTCTCGTCACCGTCCGATCATGATCATCACCATGATGATGGGATGCAATC 
TTTGATGGGATCTTTTGTGGACAATCATGTGTCTTTGATGGATTCAACATCTTCATGGTA 
TGATGATCATAATGGGATGTTCTTGTTTGATAATGGAGCTCCATTCAATTACTCTCCTCA 
ACTAAACTCGACGACGATGCTCGATGAATACTTCTACGAAGATGCTGACATTCCGCTTTG 
GAGTTTCAATTAATCCGACGGTCCATAATACATACTTTAATTAGT 

>G2115 Amino Acid Sequence (conserved domain in AA coordinates: 46-115)

MVKQERKIQTSSTKKEMPLSSSPSSSSSSSSSSSSSSCKNKNKKSKIKKYKGVRMRSWGS 

WVSEIRAPNQKTRIWLGSYSTAEAAARAYDVALLCLKGPQANLNFPTSSSSHHLLDNLLD

ENTLLSPKSIQRVAAQAANSFNHFAPTSSAVSSPSDHDHHHDDGMQSLMGSFVDNHVSLM

DSTSSWYDDHNGMFLFDNGAPFNYSPQLNSTTMLDEYFYEDADIPLWSFN* 

>G2130 (41..988)

CCTCTCTTCATTTTTTAACTCCCTCTCTCTCTCTCTCTCTATGGAGAGACGAACGAGACG 
AGTGAAGTTCACAGAGAATCGTACGGTCACAAACGTAGCAGCTACACCATCTAACGGGTC 
TCCGAGACTGGTCCGTATCACTGTTACTGATCCTTTCGCTACTGACTCGTCTAGCGACGA 
CGACGACAACAACAACGTCACGGTGGTTCCAAGAGTGAAACGATAGGTGAAGGAGATTAG 
ATTCTGCCAAGGTGAATCTTCTTCCTCCACCGCGGCGAGGAAAGGTAAGCACAAGGAGGA 
GGAAAGCGTAGTGGTTGAAGATGACGTGTCGACGTCGGTGAAGCCTAAAAAGTACAGAGG 
CGTGAGACAGAGACCTTGGGGAAAATTCGCGGCGGAGATTAGAGATCCGTCGAGCCGTAC 
TCGGATTTGGCTTGGGACTTTTGTCACGGCGGAGGAAGCTGCTATAGCGTACGATAGAGC 
CGCGATTCATCTCAAAGGACCTAAAGCGCTCACGAATTTCCTAACTCCGCCGACGCCAAC 
GCCGGTTATCGATCTCCAAACGGTTTCCGCCTGCGATTACGGTAGAGATTCTCGGCAGAG 
CCTTCATTCACCGACCTCTGTTCTAAGATTCAACGTCAACGAGGAAACAGAGCATGAGAT 
TGAAGCGATCGAGCTATCTCCGGAGAGAAAGTCGACGGTTATAAAAGAAGAAGAAGAATC 
GTCGGCGGGTTTGGTGTTCCCGGATCCGTATCTGTTACCGGATTTATCTCTCGCCGGCGA 
ATGTTTTTGGGATACCGAAATTGCCCCTGACCTTTTGTTTCTCGATGAAGAAACCAAAAT 
CCAATCAACGTTGTTACCAAACACAGAGGTTTCGAAACAAGGAGAAAACGAAACTGAAGA 
TTTCGAGTTTGGTTTGATTGATGATTTCGAGTCTTCTCC^TGGGATGTGGATCATTTCTT 
CGACCATCATCATCACTCTTTCGATTAAAAATCTCTTCTTTTTTGGGG7\AATTTTTGTG 
>G2130 Amino Acid Sequence (domain in AA coordinates: 93-160)
MERRTRRVKFTENRTVTNVAATPSNGSPRLVRITVTDPFATDSSSDDDDNNNVTVVPRVK 
RYVKMRFCQGES S S STAARKGKHKEEESVVVEDDVSTSVKPKKYRGVRQRPWGKFAAEI 
ROTSSRTRIWLGTFVTAEEAAIAYDRAAIHLKGPKALTNFLTPPTPTPVIDLQTVSACDY 

Sdsrqslhsptsvlrfnvneeteheieaielsperkstvikeeeessaglvfpdpyllp 

DLSLAGECFWDTEIAPDLLFLDEETKIQSTLLPNTEVSKQGENETEDFEFGLIDDFESSP 
WDVDHFFDHHHHSFD* 

CTGTGATTGTCAAG^GTOTGAACACACAAA 

gaIgaaagagagaagagagaaggtccaataatagagagaacaaaaaaaaagagagct^ 

TTGTCAGTTTATTCTCTGCAAACGTGCGGCCTAAGTAACACATGTCGAATTATGGAGTTA 
AAGAGCTCACATGGGAAAATGGGCAACTAACCGTTCATGGTCTAGGCGACGAAGTAGAAC 
CAACCACCTCGAATAACCCTATTTGGACTCAAAGTCTCAACGGTTGTGAGACTTTGGAGT 
CTGTGGTTCATCAAGCGGCTCTACAGCAGCCAAGCAAGTTTCAGCTGCAGAGTCCGAATG 

SSaaaccacaattatgagagcaaggatggatcttgttcaagaaaacgcggtt^ 

Sgaaatggaccgatggttcgctgttcaagaggagagccatagagttggccacagcgtca 

^caagtgcgagtggtaccaatatgtcttgggcgtcttttgaatccggtcggagcttga 

agSSag^ 

aaggagatcaacaagagacaagaggagaagcaggtagatctaatggacgacggggacgag 
SgSgcgattcacaacgagtccgaaaggagacggcgtgataggataaaccagaggatg^ 

GAACACTTCAGAAGCTGCTTCCTACTGCAAGTAAGGCGGATAAAGTCTCAATCTTGGATG 
ATGTTATCGAACACTTGAAACAGCTACAAGCACAAGTACAGTTCATGAGCCTAAGAGCCA 
AC^TCCCACA^CAAATGATGATTCCGCAACTACCTCCACCACAGTCAGTTCTCAGCATCC 
^CACC^CAACAACAACAACAACAGCAGCAGCAGCAGCAACAACAGCAGCAACAGTTTC 
AGATCTCGTTGCTTGCAACAATGGCAAGAATGGGAATGGGAGGTGGTGGAAATGGTTATG 

Sgtttagttcctcctcctcctcctccaccaatgatggtccctcctatgggtaacagag 

ACTGCACCAACGCTTCTTCAGCCACATTATCTGATCCATACAGCGCCTTTTTCGCACAGA 

CAATGAATATGGATCTCTACAATAAAATGGCAGCAGCTATCTATAGACAACAGTCTGATC 

AAACAACAAAGGTAAATATCGGCATGCCTTCAAGTTCTTCGAATCATGAGAAAAGAGATT 

AGTCTAGCGACCTAGTATTATTGATCCATATATATAGTTCTTGAAAGATTGTTGTATCAT 

GATTGTAAAAACTGTTTTGAGTATGGAAAAAGACTTGCAGATAAAA 

>C147 Amino Acid Sequence (domain in AA coordinates: 160-234)

MSNYGVKELTWENGQLTVHGLGDEVEPTTSNNPIWTQSLNGCETLESWHQAALQQPSKF 

OLOSPNGPNHNYESKDGSCSRKRGYPQEMDRWFAVQEBSHRVGHSVTASASGTHMSWASF 

ESGRSLKTARTGDRDYFRSGSETQDTEGDEQETRGEAGRSNGRRGRAAAIHNESERRRRD 

RINQRMRTLQKLLPTASKADKVSILDDVIEHLKQLQAQVQFMSLRANLPQQMMIPQLPPP 

QSVLSIQHQQQQQQQQQQQQQQQQQFQMSLIATMARMGMGGGGNGYGGL^ 
PPMG^CTNGSSATLSDPYSAFFAQTMNMDLYNKMAAAIYRQQSDQTTKVNIGMPSSSS 

NHEKRD* 

GCACATGAArrAATTTGAAGCTTCCCTAGAATTCTTTCACATCAATTAATACGACACCGT 
CTCGGGTGAAGAATCTCTCCTCTCTTGCCCTAAAGCGAGTTAGGGTTTAACACACAAAGC 
ATACCCTTTAGATTTGTGTCTCTrAGCTCTGTTTTTGTCGGCTTGTGTAACCGATCAACT 
CAAGCTATTGGCTCCTCACCTCCTGAAATTTGACTTCTCCAATGGATCTCAAAGTTTCTC 
TTATATGAATTCTATCTTCACCCTCACAATATCTTTATATATATGAGCCACAAGAACAAG 
AAGAGTCAGTAGATGCGGCTGCCATGGACGGTGGTTACGATCAATCCGGAGGAGCTTCTA 
GATACTTTCACAACeTCTTCAGGCCTGAGCTTCATCACCAGCTTCAACCTCAGCCTCAAC 
TTCACCCTTTGCCTCAGCCTCAGCCTCAACCTCAGCCTCAGCAGCAGAATTCAGATGATG 
AATCTGACTCCAACAAGGATCCGGGTTCCGACCCAGTTACCTCTGGTTCAACCGGGAAAC 
GTCCACGTGGACGTCCTCCGGGATCCAAGAACAAGCCGAAGCCACCGGTGATAGTGACTA 
GAGATAGCCCCAACGTGCTTAGATCTCATGTTCTTGAAGTCTCATCTGGAGCCGACATAG 
TCGAGAGCGTTACCACTTACGCTCGCAGGAGAGGAAGAGGAGTCTCCATTCTCAGTGGTA 
ACGGCACGGTGGCTAACGTCAGTCTCCGGCAGCCGGCAACGACAGCGGCTCATGGGGCAA 
ATGGTGGAACCGGAGGTGTTGTGGCTCTACATGGAAGGTTTGAGATACTTTCCCTCACAG 
GTACGGTGTTGCCGCCCCCTGCGCCGCCAGGATCCGGTGGTCTTTCTATCTTTCTTTCCG 



GCGTTCAAGGTCAGGTGATTGGAGGAAACGTGGTGGCTCCGCTTGTGGi 



CTTCGGGTCCAG 
TGATACTAATGGCTGCATCGTTCTCTAATGCAACTTTCGAAAGGCTTCCCCTTGAAGATG
AAGGAGGAGAAGGTGGAGAGGGAGGAGAAGTTGGAGAGGGAGGAGGAGGAGAAGGTGGTC 
CACCGCCGGCCACGTCATCATCACCACCATCTGGAGCCGGTCAAGGACAGTTAAGAGGTA 
ACATGAGTGGTTATGATCAGTTTGCCGGTGATCCTCATTTGCTTGGTTGGGGAGCCGCAG 
CCGCAGCCGCACCACCAAGACCAGCCTTTTAGAATTGAAAATTATGTCCGTAACATAGCT 
GTAACCAAATTTCATTTCTC^AAATTAAAAGAAAAAAAAAA 

>G2156 Amino Acid Sequence (domain in AA coordinates: 66-86)

MDGGYDQSGGASRYFHNLFRPELHHQLQPQPQLHPLPQPQPQPQPQQQNSDDESDSNKDP 

GSDPVTSGSTGKRPRGRPPGSKNKPKPPVIVTRDSPNVLRSHVLEVSSGADIVESVTTYA 

RRRGRGVSILSGNGTVANVSLRQPATTAAHGANGGTGGWALHGRFEILSLTGTVLPPPA 

PPGSGGLSIFLSGVQGQVIGGNWAPLVASGPVILMAASFSNATFERLPLEDEGGEGGEG 

GEVGEGGGGEGGPPPATSSSPPSGAGQGQLRGNMSGYDQFAGDPHLLGWGAAAAAAPPRP 

AF* 

>G2294 (24..659)

TCCTCCCTTAATTAGTATCAAAAATGGTGAAAACACTTCAAAAGACACCAAAGAGAATGT 
CATCTCCATCATCATCATCTTCATCATCCTCATCAACATCATCATCATCCATAAGGATGA 
AGAAGTACAAGGGAGTGAGAATGAGAAGTTGGGGTTCATGGGTTTCAGAGATCAGAGCTC 
CTAATCAAAAGACAAGGATCTGGCTTGGTTCTTACTCAACTGCTGAAGCCGCGGCTAGAG 
CCTACGACGCAGCACTCCTATGTCTTAAAGGATCCTCAGCTAATAATCTCAACTTCCCAG
AGATCTCAACTTCTCTCTACCATATTATCAACAATGGTGATAACAACAATGACATGTCCC 
CTAAGTCTATACAAAGAGTAGCAGCTGCAGCTGCTGCTGCCAACACAGATCCTTCCTCAT 
CATCAGTCTCTACTTCATCTCCATTGCTTTCCTCTCCATCTGAAGATCTCTATGATGTTG 
TCTCCATGTCACAGTATGACCAACAAGTCTCCTTGTCTGAATCATCATCATGGTACAACT 
GCTTTGATGGTGATGATCAGTTCATGTTCATTAATGGAGTCTCCGCGCCGTATTTGACAA 
CATCACTTTCTGATGATTTCTTTGAGGAAGGAGATATCAGATTATGGAACTTCTGCTGAT 
TCTACTTTCATTATACCTTATTCTTTG 

>G2294 Amino Acid Sequence (conserved domain in AA coordinates: 32-102)
MVKTLQKTPKRMSSPSSSSSSSSSTSSSSIRMKKYKGVRMRSWGSWVSEIRAPNQKTRIW
LGSYSTAEAAARAYDAALLCLKGSSANNLNFPEISTSLYHIINNGDNNNDMSPKSIQRVA
AAAAAANTDPSSSSVSTSSPLLSSPSEDLYDVVSMSQYDQQVSLSESSSWYNCFDGDDQF

MFINGVSAPYLTTSLSDDFFEEGDIRLWNFC* 
>G2510 (16..594)

ATAACAAACTCTTTAATGTCACCACAGAGAATGAAGCTATCATCACCACCAGTTACCAAC 
AACGAACCAACCGCCACCGCTTCTGCCGTTAAATCTTGCGGCGGAGGAGGTAAAGAAACC 
AGCTCATCGACCACGAGGCATCCAGTGTACCACGGAGTTCGCAAACGCCGATGGGGAAAA 
TGGGTTTCTGAGATCAGAGAGCCCCGGAAAAAGTCTCGGATTTGGCTCGGATCTTTTCCG 
GTGCCGGAGATGGCTGCTAAGGCCTACGACGTGGCAGCGTTTTGTCTAAAAGGTAGAAAA 
GCTCAGCTGAATTTCCCTGAAGAAATCGAGGATCTACCTCGACCGTCCACGTGTACTCCC 
AGAGATATCCAAGTCGCAGCGGCCAAAGCAGCCAACGCCGTGAAGATCATCAAAATGGGA 
GATGATGACGTGGCAGGAATAGACGACGGAGATGATTTCTGGGAAGGCATTGAGCTGCCT 
GAGCTTATGATGAGTGGAGGTGGGTGGTCGCCGGAGCCTTTTGTTGCCGGAGATGATGCC
ACGTGGCTTGTCGACGGAGACTTGTATCAGTATCAGTTCATGGCGTGTCTGTGAGTGTTG 
CTGTCGATTGTGTCGTATTCGTTATACGTGTACGTTGTATCGTTATTGTGTTGGCTCACT 
TAATTTAATGCATATGCATGTATATTTTCATTTATTTGTTTCTAGTTTATTGTTTTACGC 
GATCAATAATTAGATACCTGTTTCTCAAGTTAGTO 

ATACGTATAAGTGTATGTTCI^ATATACAGTTTTTGTTTGCATAAGTATTGCTACTTATT 
CTAAAAAAAAAAAAAAAAAAAAAAAAA 

>G2510 Amino Acid Sequence (conserved domain in AA coordinates: 41-108)
MSPQRMKLSSPPVTNNEPTATASAVKSCGGGGKETSSSTTRHPVYHGVRKRRWGKWVSEI
REPRKKSRIWLGSFPVPEMAAKAYDVAAFCLKGRKAQLNFPEEIEDLPRPSTCTPRDIQV
AAAKAANAVKIIKMGDDDVAGIDDGDDFWEGIELPELMMSGGGWSPEPFVAGDDATWLVD
GDLYQYQFMACL*
>G2893 (130..981)

AAATCATAAAAGCCTCTCTCTTAGTCTATTTTTATCTCACGGCTCTCTCTCCCCTCTCTA 
C^CACACAAACACAAATAAAGCGTAAAACTGAAATATTTTAATTA(^^TTAGAAAGAGAA 
CATATTAATATGTCAAATATAACAAAGAAGAAGTGTAATGGAAATGAAGAGGGTGCAGAG 
CAGAGGAAAGGGCCTTGGACACTCGAGGAAGACACTCTTCTCACCAATTACATTTCCCAT 
AACGGTGAAGGCCGATGGAATCTGCTCGCTAAATCTTCTGGGCTAAAGAGAGCAGGAAAA

AGTTGTAGATTGAGATGGTTGAATTACCTTAAACCCGACATAAAGCGTGGGAATCTCACT 

CCTCAAGAACAACTTTTAATCCTTGAGCTCCATTCTAAATGGGGTAATAGGTGGTCAAAA 

ATTTCGAAGTATTTACCAGGAAGAACAGACAACGATATCAAAAACTACTGGAGAACTAGA

GTCC AG AAACAAG CACGCCAG CTCAACATAGATTCCAATAGCCACAAGTTCATAGAAGTT 

GTTCGTAGCTTTTGGTTTCCAAGACTGATCAACGAGATTAAAGACAACTCATACACCAAC 

AATATTAAAGCTAATGCTCCTGATTTACTTGGACCAATTTTACGAGACAGCAAAGATTTG 

GGTTTCAACAACATGGATTGTTCCACTTCCATGTCAGAAGATCTCAAGAAAACTTCACAA 

TTCATGGATTTTTCTGATCTTGAAACCACAATGTCCTTGGAAGGATCACGAGGGGGTAGT 

AGTCAATGTGTGAGTGAGGTTTATAGCTCCTTCCCTTGCCTAGAGGAGGAGTACATGGTG 

GCCGTTATGGG CAGTTCAGACATTTCAG C ATTGCATG ATTGTCACGTGGCTGATTCCAAG 

TACGAGGATGATGTGACACAAGATCTAATGTGGAACATGGATGACATTTGGCAGTTTAAC 

GAGTATGCACACTTTAATTAGGTTATATTATATTTATGTACTTCTTACAACTTGGAGGGG 

TTTATCGGTCTTTTATTAAATTTTGATTGTTTTGGATTCCTTAAAAATGTGTTCTTATTA 

TAGTTTTTAATGAAAAAAATGTTTAAGCG CAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 

>G2893 Amino Acid Sequence (conserved domain in AA coordinates: 19-120)

MSNITKKKCNGNEEGAEQRKGPWTLEEDTL^ 

LRWLNYLKPDIKRGNLTPQEQLLILELHSKWG^ 

QARQLNIDSNSHKFIEWRSFWFPRLINEIKDNSYTNNIKANAPDLLGPILRDSKDLGFN 
NMDCSTSMSEDLKKTSQFMDFSDLETTMSLEGSRGGSSQCVSEVYSSFPCLEEEYMVAVM 
GSSDISALHDCHVADSKYEDDVTQDLMWNMDDIWQFNEYAHFN* 
>G340 (97..834)

ATGAAATCTCTGTAGTTTTTTTTTGTTCCTTTCTTAAATTTCGAAAGAAAGACATTTATT 
AAACCAAAATAACTCTTTAGATCATTGCAAGGAAAAATGTTGAAAAGTGCAAGTCCAATG 
GCATTCTACGATATCGGAGAGCAGCAATACTCTACTTTCGGGTACATTTTAAGCAAACCT 
GGGAACGCAGGAGCTTACGAGATTGACCCTTCGATCCCAAACATCGACGATGCGATCTAC 
GGCTCAGATGAGTTCCGTATGTACGCTTACAAAATCAAACGGTGTCCTCGTACTCGTAGC 
CACGACTGGACGGAGTGTCCCTACGCTCACCGTGGCGAGAAAGCCACACGCCGTGATCCT 
CGCCGTTACACTTACTGTGCAGTCGCATGCCCGGCTTTCCGAAATGGCGCATGCCACCGT 
GGCGACTCATGCGAATTCGCACATGGCGTATTCGAGTACTGGCTCCACCCGGCGCGTTAC 
CGAACACGCGCATGTAACGCCGGGAACTTGTGTCAGAGGAAAGTGTGTTTCTTTGCCCAC 
GCGCCGGAGCAGCTAAGGCAGTCTGAAGGAAAGCACAGGTGCAGGTACGCATATAGGCCG 
GTGAGGGCTAGAGGTGGTGGAAACGGCGATGGAGTGACGATGAGAATGGACGACGAGGGT 
TACGACACGTCACGGTCTCCGGTGAGAAGCGGGAAAGATGATTTAGATAGTAACGAGGAG 
AAGGTGTTGTTGAAGTGTTGGAGTCGGATGAGCATTGTGGATGATCATTATGAGCCGTCC 
GATTTGGATTTGGATTTGTCACACTTTGATTGGATCTCAGAGTTGGTCGATTAAATTTGG 
GAAATCAAAGCAGAGAACAAAAGAAACCCGATAAATAAAGTGGATTTTGTTAAAATCCAC 
AAGATC^GATT(^^GATGAGAGATCTTGTC^TGTATATGGTAAATTTAATTGTAATGAT 
TTATTGC^TGTCGCAAAAGAAGTTACTTCTCTTTGCATGTAAAC^GATTCTTGATCTTC 
TATAAGTCTTTGTATTAA 

>G340 Amino Acid Sequence (domain in AA coordinates: 37-154) 

MLKSASPMAFYDIGEQQYSTFGYILSKPGNAGAYEIDPSIPNIDDAIYGSDEFRMYAYKI 

K^CPRTRSHDWTECPYAHRGEKATRRDPRRYTYCAVACPAFRNGACHRGDSCEFAHGVFE 

YWLHPARYRTRACNAGNLCQRKVCFFAHAPEQLRQSEGKHRCRYAYRPVRARGGGNGDGV 

TMRMDDEGYDTSRSPTOSGKDDLDSNEEKVLLKCWSRMSrTODHYEPSDLDLDL 

SELVD* 

>G39 (75..638)

GTTTCCACAGTCCCTOTACTTGTGCATAAAACTGTAAAACACTACTCTGAAAATTTTGCT 
TCTGTTAGGATATAATGCCACCCTCrCCTCCTAAATCTCCTTTTATTAGCTCTTCACT^ 
AAGGAGCTCATGAAGATCGCAAATTTAAATGCTATAGGGGTGTCCGAAAGAGGTCTTGGG 
GCAAATGGGTGTCTGAAATCAGAGTTCCAAAGACTGGACGACGAATATGGCTAGGTTCAT 
ACGATGCTCCAGAGAAGGCAGCTAGAGCCTATGATGCTGCTTTGTTCTGTATTAGGGGTG 
AGAAGGGAGTTTACAATTTTCCCACTGATAAAAAGCCGCAGCTTCCAGAAGGTTCTGTCC 
GGCCTCTGTCCAAGCTCGACATACAGACAATAGCAACAAAC 

ATGTACCTTCCCATGCCACCACACTCCCGGCAACAACCCAGGTTCCCTCTGAAGTTCCTG 
CTTCCTCTGATGTTTCTGCTTCTACTGAGATTACAGAGATGGTCGATGAATATTATCTCC 
CAACCGATGCAACTGCAGAATCAATATTCTCAGTTGAAGACTTACAACTGGACAGTTTCC 
TCATGATGGACATTGATTGGATAAACAATCTAATCTGATGTGTAACGTCACTTGCAGTGA
CTTGTCTTTGAACTTGTTTTATTTAGCATGCAAA 

>G39 Amino Acid Sequence (domain in AA coordinates: 24-90) 
MPPSPPKSPFISSSLKGAHEDRKFKCYRGVRKRSWGKVA7SEIRVPKTGRRIWLGSYDAPE 

SydaSfcirgekgvynfptdkkpqlpegsvrplskldiqtiatnyassvvhvpsh 

ATTLPATTQVPSEVPASSDVSASTEITEMVDEYYLPTDATAESIFSVEDLQLDSFbMMDI 
DWINNLI* 

tataLtStcgStctacttttttttcttccataatatagtc^ttcgttttcttaatt 
I?gScttctctttgtttctccaatctttattagtttatttatttattttggttattg 
tatacaaatggcaatggctttaaacatgaatgcttacgtagacgagttcatggaagctct 
SScca^catgaaggtaacttcatcttcttctacttcgaattcatcaaatccaaaac^ 
attaactcctaatttcatccctaataatgaccaagtcttaccggtatctaaccaaaccgg 

TCCGATTGGGCTAAACCAGCTCACTCCAACACAAATCCTCCAAATTCAGACAGAGTTACA 

ScSgcaaaaccaatctcgtcgtcgcgctggtagtcatcttctcaccgct^ 

CTCAATGAAGAAAATCGACGTAGCAACTAAACCGGTTAAACTATACCGAGGCGTAAGACA 

SStcgggt^tgggtagctgagattcggctacctaaaaaccgaacccggttatg 

GCTCGGTACGTTCGAAACGGCTCAAGAAGCTGCATTAGCTTACGATCAAGCAGCTCATAA 
GATCAGAGGAGACAACGCTCGTCTCAATTTCCCAGACATTGTTCGTCAAGGACACTATAA 

acIgatattgtctccgtctatcaacgcaaagatcgaatccatcxgcaatagttctgatct 

TCCACTGCCTCAGATCGAGAAACAGAACAAAACAGAGGAGGTGCTCTCTGGTTTTTCCAA 

TGAGTCGGATATAACGTTGTTGGATTTCTCAAGCGACTGTGTGAAAGAAGATGAGAGTTT 
OTGATGGGTTTGCACAAGTATCCTTCTTTGGAGATTGATTGGGACGCTATAGAGAAACT 

™tgaatccattttatctttttgattcatttgtctctaaattgtagaattttattttc 

AGAGCTTTGTAAGGGAAGTTCTTGAATGAGAGTTGCAGAGGACTAGTGGAACCTAACTCT 
GTTTTCTTTTGTAAGTATTGTTTATAATGGGCCGTTGAATGGGCCTTATTGATTTAAACA 

GCCCAAGTTTTTAAAAAAAAAAAAAAAAAAAAAAAAA 

>G439 Amino Acid Sequence (domain in AA coordinates: 110^77) 
MAMAI^AYVIJEFMEAI^PFMKVTSSSSTS^ 

GLNQLTPTQILQIQTELHLRQNQSRRRAGSHLLTAKPTSMKKIDVATKPVKLYRGVRQRQ 
WGKWVAEIRLPKNRTRLWLGTFETAQEAALAYDQAAHKIRGDNARLNFPDIVRQGHYKQI 
LSPSINAKIESICNSSDLPLPQIEKQNKTEEVLSGFSKPEKEPEFGEIYGCGYSGSSPES 
DITLLDFSSDCVKEDESFLMGLHKYPSLEIDVIDAIEKLF* 

ATCGCGAGTTCGGAGGTTTCAATGAAAGGTAATCGTGGAGGAGATAACTTCTCCTCCTCT 

GGTTTTAGTGACCCTAAGGAGACTAGAAATGTCTCCGTCGCCGGCGAGGGGCAAAAAAGT 

AATTCTACCCGATCCGCTGCGGCTGAGCGTGCTTTGGACCCTGAGGCTGCTCTTTACAGA 

GAGCTATGGCACGCTTGTGCTGGTCCGCTTGTGACGGTTCCTAGACAAGACGACCGAGTC 

TTCTATTTTCCTCAAGGACACATCGAGCAGGTGGAGGCTTCGACGAACCAGGCGGCAGAA 

CAACAGATGCCTCTCTATGATCTTCCGTCAAAGCTTCTCTGTCGAGTTATTAATGTAGAT 

TTAARGGCAGAGGCAGATACAGATGAAGTTTATGCGCAGATTACTCTTCTTCCTGAGGCT 

AATCAAGACGAGAATGCAATTGAGAAAGAAGCGCCTCTTCCTCCACCTCCGAGGTTCCAG 

GTGCATTCGTTCTGCAAAACCTTGACTGC^TCCGACACAAGTACAC^TGGTGGATTTTCT 

GTTCTTAGGCGACATGCGGATGAATGTCTCCCACCTCTGGATATGTCTCGACAGCCTCCC 

ACTCAAGAGTTAGTTGCAAAGGATTTGCATGCAAATGAGTGGCGATTCAGACATATATTC 

CGGGGTCAACCACG8AGGCATTTGCTACAGAGTGGGTGGAGTGTGTTTGTTAGCTCCAAA 

RGGCTAGTTGCAGGCGATGCGTTTATATTTCTAAGGGGCGAGAATGGAGAATTAAGAGTT 

GGTGTAAGGCGTGCGATGCGACAACAAGGAAACGTGCCGTCTTCTGTTATATCTAGCCAT 

AGCATGCATCTTGGAGTACTGGCCACCGCATGGCATGCCATTTCAACAGGGACTATGTTT 

ACAGTCTACTACAAACCCAGGACGAGCCCATCTGAGTTTATTGTTCCGTTCGATCAGTAT 

ATGGAGTCTGTTAAGAATAACTACTCTATTGGCATGAGATTCAAAATGAGATTTGAAGGC 

GAAGAGGCTCCTGAGCAGAGGTTTACTGGCACAATCGTTGGGATTGAAGAGTCTGATCCT 

ACTAGGTGGCCAAAATCAAAGTGGAGATCCCTCAAGGTGAGATGGGATGAGACTTCTAGT 

ATTCCTCGACCTGATAGAGTATCTCCGTGGAAAGTAGAGCCAGCTCTTGCTCCTCCTGCT 

TTGAGTCCTGTTCCAATGCCTAGGCCTAAGAGGCCCAGATCAAATATAGCACCTTCATCT 
tcagccaggcatgaacctacttacacagatttgctctccggctttgggS

ccatcccatggtc^gcggatacctttttatgaccattcatcatcaccttSmgJ™ 

aagagaatcttgagtgattcagaaggcaagttcgattatcttgctaaccaSggS 

atacactctggtctctccctgaagttacatgaatctcctaaggtacc?gcagS 

gcgtctctccaagggcgatgcaatgttaaatacagcgaatatcctgttcSSS 

tcgactgagaatgctggtggtaactggccaatacgtcc^^^ 

^g^gtcaatgctcaagcgcaagctcaggctagggagcaagtaacaaS 
acgatacaagaggagacagcaaactcaagagaagggaactgcaggcotgS 

CATCCGAAGGATGCTCAAACGAAAACCAACTCAAGTAGGAGTTGCACAAAGGTTCACAAO 

c^aattgcacttggccgttcagtggatctttc^ 

gtcgctgagctggacaggctgtttgagttcaatggagagSgaSSS 
tggttgatagtttacacagatgaagagaatgatatgatccttgttgSgaS 

CAGGAGTTTTGTTGCATGGTTCGCAAAATCTTCATATACACGAflAGA^ 
?^ CCCGGG ^^^ 

>G470 Amino Acid Sequence (domain in AA coordinates: [illegible])



ELWHACAGPLVTVPRQDDRVFYFPQGHIEQVEASTOQAAEQQMPLYDLPSKLLCRVD > 
UCAEADTDEVYAQITLLPEANQDBNAIEKEAPLPPPPRFQW 



======= 

===k=== =^=~ii 

^^wsgsrrygsenwmssarheptytollsgfgtnidpshgqripftohsSpKp? 
^ sd segkfdyi^qwqmihsgl SLK lhespk^^ 

s^ggnwpirpralnyyeevwaqaqaqareqvtkqpftiq 

JGELMAPKKD 

DAKDAKSASNPSLSSAGNS * ~ """ »RKMNPGTIiSCRSEEEAWGEGS 



>G652 (1..606) 



S?* 99 * 9 ^!!! 9 ^!^^ 

icctagcgac 



ttc^S ataC ^ a9aa9g " ttt99tttCatcacaccta 9cgacggtggtgacgatctc 



^ggtgctcccgttcagggtaacagcggtgitggtggtt^ 
c 9 tt g f aCa90t9t99a ^^^ 

>G671 (61..1119)

TTCACTTGAGAACAACCCCCTTTGAACTCGATCAAGAAAGCTAAGTTTGAAGAATCAAGA 
ATGGTGCGGACACCGTGTTGCAAAGCCGAACTAGGGTTAAAGAAAGGAGCTTGGACTCCC

GAGGAAGATCAGAAGCTTCTCTCTTACCTTAACCGCCACGGTGAAGGTGGATGGCGAACT 

CTCCCCGAAAAAGCTGGACTCAAGAGATGCGGCAAAAGCTGCAGACTGAGATGGGCCAAT 

TATCTTAGACCTGACATCAAAAGAGGAGAGTTCACTGAAGACGAAGAACGTTCAATCATC 

TCTCTTCACGCCCTTCACGGCAACAAATGGTCTGCTATAGCTCGTGGACTACCAGGAAGA 

ACCGATAACGAGATCAAGAACTACTGGAACACTCATATCAAAAAACGTTTGATCAAGAAA 

GGTATTGATCCAGTTACACACAAGGGCATAACCTCCGGTACCGACAAATCAGAAAACCTC 

CCGGAGAAACAAAATGTTAATCTGACAACTAGTGACCATGATCTTGATAATGACAAGGCG 

AAGAAGAACAACAAGAATTTTGGATTATCATCGGCTAGTTTCTTGAACAAAGTAGCTAAT 

AGGTTCGGAAAGAGAATCAATCAGAGTGTTCTGTCTGAGATTATCGGAAGTGGAGGCCCA 

CTTGCTTCTACTAGTCACACTACTAATACTACAACTACAAGTGTTTCCGTTGACTCTGAA 

TCAGTTAAGTCAACGAGTTCTTCCTTCGCACCAACCTCGAATCTTCTCTGCCATGGGACC 

GTTGCAACAACACCAGTTTCATCGAACTTTGACGTTGATGGTAACGTTAATCTGACGTGT 

TCTTCGTCCACGTTCTCTGATTCCTCCGTTAACAATCCTCTAATGTACTGCGATAATTTC 

GTTGGTAATAACAACGTTGATGATGAGGATACTATCGGGTTCTCCACATTTCTGAATGAT 

GAAGATTTCATGATGTTGGAGGAGTCTTGTGTTGAAAACACTGCGTTCATGAAAGAACTT 

ACGAGGTTTCTTCACGAGGATGAAAACGACGTCGTTGATGTGACGCCGGTCTATGAACGT 

CAAGACTTGTTTGACGAAATTGATAACTATTTTGGATGAGTGAAACTCATAATCGATGAA 

TCCCACGTGACCATGTCAATATGATGTCTATGGATATGTTACCTTGATGATGTTGATGGT 

AATAATAATAAATAATAGATGGTGATGATGACCATGCATGAATCATGAATGTAGTTCGTG 

TTGTCACATATGCTTGTGTTTTTGTGTTTTTTTTTTTTGGTCTGAAGTGTGTTGTTTCGT 

TGTAAATGGATTATAAATGGTGATGTAATAATTATAATGTTAAAAAAAAAAAAAAAAAAA 

AAAA 

>G671 Amino Acid Sequence (domain in AA coordinates: 15-115) 
MVRTPCCKAELGLKKGAWTPEEDQKLLSYLNRHGEGGWRTLPEKAGLKRCGKSCRLRWAN 
YLRPDIKRGEFTEDEERSIISLHALHGNKWSAIARGLPGRTDNEIKNYTOTHIKKRLI^ 
GIDPVTHKGITSGTDKSENLPEKQNVN^^ 

RFGKRINQSVLSEIIGSGGPIASTSHTTNTTTTSVSVDSESVKSTSSSFAPTSNLLCHGT 

VATTPVSSNFDVDGNVNLTCSSSTFSDSSVNNPLMYCDNFVGNNNVDDE 

EDFmLEESCVENTAFMKELTRFLHEDENDVVDVTPVYERQDLFDEIDNYFG* 

>G779 (110..712)
GACATGCATGTAAGCATTCGGTTAATTAATCGAGTCAAAGATATATATCAGTAAATACAT 

ATGTGTATATTTCTGGAAAAAGAATATATATATTGAGAAATAAGAAAAGATGAAAATGGA

AAATGGTATGTATAAAAAGAAAGGAGTGTGCGACTCTTGTGTCTCGTCCAAAAGCAGATC 

CAACCACAGCCCCAAAAGAAGCATGATGGAGCCTCAGCCTCACCATCTCCTCATGGATTG 

GAACAAAGCTAATGATCTTCTCACACAAGAACACGCAGCTTTTCTCAATGATCCTCACCA 

TCTCATGTTAGATCCACCTCCCGAAACCCTAATTCACTTGGACGAAGACGAAGAGTACGA 

TGAAGACATGGATGCGATGAAGGAGATGCAGTACATGATCGCCGTCATGCAGCCCGTAGA 

CATCGACCCTGCCACGGTCCCTAAGCCGAACCGCCGTAACGTAAGGATAAGCGACGATCC 

TCAGACGGTGGTTGCTCGTCGGCGTCGGGAAAGGATCAGCGAGAAGATCCGAATTCTCAA 

GAGGATCGTGCCTGGTGGTGCGAAGATGGACACAGCTTCCATGCTCGACGAAGCCATACG 

TTACACCAAGTTCTTGAAACGGCAGGTGAGGATTC 

TCCTATGGCTAACCCCTCTTACCTTTGTTATTACCACAACTCCCAACCCTGATGAACTAC 

ACAGAAGCTCGCTAGCTAGACATTTGGTGTCATCCTCTCAACCTTT 

>G779 Amino Acid Sequence (domain in AA coordinates: 126-182) 

MKMENGMYKKKGVCDSCVSSKSRSNHSPKRSMMEPQP 

DPHHLMLDPPPETLIHLDEDEEYDEDMDAMKEMQYMIAVMQPVDIDPATVPKPNRRNVRI 
SDDPQTVVARRRRERISEKIRILKRIVPGGAKMDTASMLDEAIRYTKFLKRQVRILQPHS 

QIGAPMANPSYLCYYHNSQP* 

>G962 (148..1392)
CGTCGACTCTCTACTCAACACCACTCAATTTCATCTCTCTTTTTCCCTTCCATTGTTAGT 

ATAAAAACCAAGCAAACCCTTAATCACTTTTCATCATCATATATCACCTTAATCCACATG 

CATACACATATCTAGTCTTTTTGATATATGGCAATTGTATCCTCCACAACAAGCATCATT 

CCCATGAGTAACCAAGTCAACAATAACGAAAAAGGTATAGAAGACAATGATCATAGAGGC

GGCCAAGAGAGTCATGTCCAAAATGAAGATGAAGCTGATGATCATGATCATGACATGGTC 

ATGCCCGGATTTAGATTCCATCCTACCGAAGAAGAACTGATAGAGTTTTACCTTCGCCGA 

AAAGTTGAAGGCAAACGCTTTAATGTAGAACTCATCACTTTCCTCGATCTTTATCGCTAT 
GATCCTTGGGAACTTCCTGCTATGGCGGCGATAGGAGAGAAAGAGTGGTACTTCTATGTG

CCAAGAGATCGGAAATATAGAAATGGAGATAGACCGAACCGAGTAACGACTTCAGGATAT 

TGGAAAGCCACCGGAGCTGATAGGATGATCAGATCGGAGACTTCTCGGCCTATCGGATTA 

AAGAAAACCCTAGTTTTCTACTCTGGTAAAGCCCCTAAAGGCACTCGTACTAGTTGGATC 

ATGAACGAGTATCGTCTTCCGCACCATGAAACCGAGAAGTACCAAAAGGCTGAAATATCA 

TTGTGCCGAGTGTACAAAAGGCCAGGAGTAGAAGATCATCCATCGGTACCACGTTCTCTC 

TCCACAAGACATCATAACCATAACTCATCGACATCATCCCGTTTAGCCTTAAGACAACAA 

CAACACCATTCATCCTCCTCTAATCATTCCGACAACAACCTTAACAACAACAACAACATC 

AACAATCTCGAGAAGCTCTCCACCGAATATTCCGGCGACGGCAGCACAACAACAACGACC 

ACAAACAGTAACTCTGACGTTACCATTGCTCTAGCCAATCAAAACATATATCGTCCAATG 

CCTTACGACACAAGCAACAACACATTGATAGTCTCTACGAGAAATCATCAAGACGATGAT 

GAAACTGCCATTGTTGACGATCTTCAAAGACTAGTTAACTACCAAATATCAGATGGAGGT 

AACATCAATCACCAATACTTTCAAATTGCTCAACAGTTTCATCATACTCAACAACAAAAT 

GCTAACGCAAACGCATTACAATTGGTGGCTGCGGCGACTACAGCGACAACGCTAATGCCT 

CAAACTCAAGCGGCGTTAGCTATGAACATGATTCCTGCAGGAACGATTCCAAACAATGCT 

TTGTGGGATATGTGGAATCCAATAGTACCAGATGGAAACAGAGATCACTATACTAATATT 

CCTTTTAAGTAATTTAATTAGATCATGATTATTATCCATGACAATAATTAATGCTGCTTT 
GCGC 

>G962 Amino Acid Sequence (domain in AA coordinates: 53-175)
MAIVSSTTSIIPMSNQVNNNEKGIEDNDHRGGQESHVQNBDEADDHDHDMVMPGFRFHPT 
EEELIEFYLRRKVEGKRFNVELITFLDLYRYDPWELPAMAAIGEKEWYFYVPRDRKYRHG 
DRPNRVTTSGYWKATGADRMIRSETSRPIGLKKTLVFYSGKAPKGTRTSWIMNEYRLPHH 
ETEKYQKAEISLCRVYKRPGVEDHPSVPRSLSTRHHNHNSSTSSRLALRQQQHHSSSSNH 
SDlTOLNNNMJINl^EKIiSTEYSGDGSTTTTTTNSNSDVTIALANQNIYRPMPYDTSNNTL 
IVSTRNHQDDDETAIVDDLQRLVWYQIS0GGNINHQYFQIAQQFHHTQQQNANANALOLV 

AAATTATTLMPO/TQAALAMNMIPAGTIPNNALWDMWjMPIVPDGNRDHYTNIPFK* 
>G977 (46..591)

CACCAAACTCACCTGAAACCCTATTTCCATTTACCATTCACACTAATGGCACGACCACAA 

CAACGCTTTCGAGGCGTTAGACAGAGGCATTGGGGCTCTTGGGTCTCCGAAATTCGTCAC 

CCTCTCTTGAAAACAAGAATCTGGCTAGGGACGTTTGAGACAGCGGAGGATGCAGCAAGG 

GCCTACGACGAGGCGGCTAGGCTAATGTGTGGCCCGAGAGCTCGTACTAATTTCCCATAC 

AACCCTAATGCC^TTCCTACTTCCTCTTCCAAGCTTCTATCAGCAACTCTTACCGCTAAA 

CTCCACAAATGCTACATGGCTTCTCTTCAAATGACCAAGCAAACGCAAACACAAACGCAA 

ACGCAGACCGCAAGATCACAATCCGCGGACAGTGACGGTGTGACGGCTAACGAAAGTCAT 

TTGAACAGAGGAGTAACGGAGACGACAGAGATCAAGTGGGAAGATGGAAATGCGAATATG 

CAACAGAATTTTAGGCCATTGGAGGAAGATCATATCGAGCAAATGATTGAGGAGCTGCTT 

CACTACGGTTCCATTGAGCTTTGCTCTGTTTTACCAACTCAGACGCTGTGAGAAATGGCC 
^CGTTTTAGCGTATTCTTTTCATTTTTATTTTTC 

GTGATGAGAGTAGTAGTGAGAGAAGGCTAATTTCAAGACATTTTGATCTGAATTGGCCTC 
TTTTCAAACACTGATTCTAGTTTCTATAAGAGCAATCGATCATATGCTATGTTATGTATA 
GTATTATAAAAAAATGTTATTTTCTGATTNAAAAAAAAAAAAAAAAAAAAAAA 
>G977 Amino Acid Sequence (domain in AA coordinates: 5-72)

MARPQQRFRGVRQPJIWGSWVSEIRHPLLKTRIWLGTFETAEDAARAYDEAARLMCGPRAR 
TNFPYNPNAIPTSSSKLLSATLTAKLHKCYMASLQMTKQTQTQTQTQTARSQSADSDGVT 
ANESHLNRGVTETTEIKWEDGNANMQQNFRPLEEDHIEQMIEELLHYGSIELCSVLPTQT 

>G1063 (241..966)

GTTAAAGAAGATGGATGGGCCACAAGTTGCTATATAAATCCTTCCACTTCTTGTTGTATA 

CTATTGCTTGAGTTCTGATTGGGCACAGTAGTACCAlTGCCArrTCTCTCACACATACCG 

TCTCTTTCTCTCATCATCAATCATCAATCATCCAAAAGAAAAAACCCTAAAATTTCACTT 

GTAAGCTTTTCACCAGTTTCTCTCCATACCCATTTTATCAGCTTCTCCATATCTTTCT 

^^^ CTCA ^ T ^ TG ^^ TCATGATG ^ T ^ GA TGGAGAAGCTTCCTGAGTTTTGT 

AACCCTAATTCCTCTTTCTTCTCTCCCGACCIACIAACAACACTTACCCTTTTCTCTTTAAC 
J GC ^ TCATTACCAGTCCGAT ^CTCAATGAC^ 

GGTTTACTCACTAACCCTTCTTCTATCTCTCCCAACACAGCTTACTCTTCCGTTTOTCTT 
GACAAAAGAAACAACAGTAACAACMCAATAATGGCACGAACATGGCAGCTATGCGAGAG 
ATGATCTTCCGTATCGCCGTGATGCAACCGATCCATATCGATCCCGAGGCGGTTAAGCCA 
CCGAAGAGGAGGAACGTCAGGATCTCTAAAGATCCTCAAAGCGTGGCGGCTAGGCATAGA
AGGGAGAGAATAAGCGAGAGGATTCGGATTTTGCAACGGCTTGTTCCTGGTGGGACGAAG 
ATGGATACAGCTTCGATGCTCGATGAAGCAATTCATTATGTGAAGTTTTTAAAGAAACAG 
GTGCAGTCTCTGGAGGAGCAGGCGGTGGTTACTGGCGGAGGGGGAGGAGGAGGAGGAAGG 
GTTTTGATCGGTGGAGGTGGAATGACGGCGGCGAGTGGTGGTGGTGGCGGCGGGGGAGTG 
GTTATGAAAGGGTGTGGAACAGTGGGGACTCATCAGATGGTGGGCAATGCACAGATTCTT 
AGATGATGATGATTTTTAATTTTATTATTATTATATTAATGTTGGAGAAAAAGAGAAAAA 
TGATTCTGGAGAGGGAAGCCAAGTAATTTATGTGAGAGTCTTTAATTTAACTTTATTTTC 
TTGTTTAGATAATGTGTAATGATGGTTTTTAAAGCCAAAGACTCTCCATGGTTGTTGGAG 

CGAGTTTG 

>G1063 Amino Acid Sequence (domain in aa coordinates: 131-182) 
MDSDIMNMMMHQMEIO.PEFCNPNSSFFSPDHWNTYPFLFNSTHYQSDHSMTNEPGFRYGS 

GLLTNPSSISPNTAYSSVFLDK11NNS 

PKRRNVRISKDPQSVAARHRRERISERIRILQRLVPGGTKMDTASMLDEAIHYVKFLKKQ 
VQSLEEQAWTGGGGGGGGRVLIGGGGMTAASGGGGGGGVVMKGCGTVGTHQMVGNAQIIi 

R* 

>G1140 (67..729)

ATCCAAGATCCTCCAACTCACAGAAAGGCAGATTCAAGAACAGTAGTGAAGGAGAGATCT 

GGTAAAATGGCGAGAGAGAAGATAAGGATAAAGAAGATTGATAACATAACAGCGAGACAA 

GTTACTTTCTCAAAGAGAAGAAGAGGAATCTTCAAGAAAGCCGATGAACTTTCAGTTCTT

TGCGATGCTGATGTTGCTCTCATCATCTTCTCTGCCACCGGAAAGCTCTTCGAGTTCTCC 

AGCTCAAGAATGAGAGACATATTGGGAAGGTATAGTCTTCATGCAAGTAACATCAACAAA 

TTGATGGATCCACCTTCTACTCATCTCCGGCTTGAGAATTGTAACCTCTCCAGACTAAGT 

AAGGAAGTCGAAGACAAAACCAAGCAGCTACGGAAACTGAGAGGAGAGGATCTTGATGGA

TTGAACTTAGAAGAGTTGCAGCGGCTGGAGAAACTACTTGAATCCGGACTTAGCCGTGTG 

TCTGAAAAGAAGGGCGAGTGTGTGATGAGCCAAATTTTCTCACTTGAGAAACGGGGATCG 

GAATTGGTGGATGAGAATAAGAGACTGAGGGATAAACTAGAGACGTTGGAAAGGGCAAAA 

CTGACGACGCTTAAAGAGGCTTTGGAGACAGAGTCGGTGACCACAAATGTGTCAAGCTAC 

GACAGTGGAACTCCCCTTGAGGATGACTCCGACACTTCCCTGAAGCTTGGGCTTCCATCT 

TGGGAATGAATCTGAGAGAGAGAAAGATCCAGCAGAGTTGACTTCGATGGAAGCCCACAA 

ATATTAAGTCTACCTTTTCCCTTTCTTTTCTTTGAATAAGTGTTGAAAAAGAATTGAGAT 

GGGAAGGATGAATTCTCATTGCATTGCAGAGAAGCAAGTTTCAGATATTGTACGTGTTAT 

TGGGTCTTTATAACTATTTTTCTCCCCAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 

>G1140 Amino Acid Sequence (conserved domain in AA coordinates: 2-57)

MAREKIRIKKIDNITARQVTFSKRRRGIFKKADELSVLCDADVALIIFSATGKLFEFSSS 

RMRDILGRYSLHASNINKLMDPPSTHLRLENCNLSRLSKEVEDKTKQLRKLRGEDLDGLN

LEELQRLEKLLESGLSRVSEKKGECVMSQIFSLEKRGSELVDENKRLRDKLETLERAKLT

TLKEALETESVTTNVSSYDSGTPLEDDSDTSLKLGLPSWE* 

>G1425 (43..1005)

ACTCTCTCAAACCATAAAAAATATTCTCCGATCATCATTTTAATGGAGAGTACAGATTCT 
TCCGGTGGTCCTCCGCCGCCGCAACCAAACCTCCCTCCAGGATTCCGGTTTCATCCAACA 
GACGAAGAACTTGTAATTCATTACCTCAAACGCAAAGCAGATTCTGTTCCTTTACCAGTC 
GCGATCATCGCCGACGTTGATCTTTACAAATTTGATCCATGGGAACTTCCCGCGAAAGCT 
TCGTTTGGAGAACAAGAATGGTATTTTTTCAGTCCAAGAGATCGGAAATATCCCAACGGA 
GCTAGACCTAACCGAGCTGCGACTTCCGGTTATTGGAAAGCGACTGGTACAGATAAACCG 
GTGATTTCAACCGGCGGTGGTGGTAGTAAAAAAGTGGGAGTTAAAAAGGCTCTAGTGTTT 
TACAGTGGTAAACCACCAAAAGGAGTTAAATCAGATTGGATTATGCATGAATATCGGTTA 
ACTGATAATAAACCEACTCACATTTGTGACTTCGGCAACAAGAAAAACTCTCTCAGGCTT 
GATGATTGGGTGTTGTGTCGTATCTAC^AGAAAAACAATAGTACAGCATCTAGACATCAT 
CATCATCTTCATCATATTCATCTAGATAATGATCATCATCGTCATGATATGATGATTGAT 
GATGATCGATTCCGTCATGTTCCTCCTGGTCTTCACTTCCCGGCGATTTTTTCTGACAAT 
AATGATCCGACGGCTATATATGATGGTGGCGGCGGCGGATACGGAGGTGGAAGTTACTCG 
ATGAATCATTGTTTCGCATCTGGATCAAAGCAGGAGCAGTTGTTTCCACCGGTGATGATG 
ATGACTAGTCTAAATCAAGATTCCGGTATTGGATCGTCGTCGTCACCTAGGAAGAGATTT 
AACGGCGGCGGCGTTGGAGATTGTTCGACTTCTATGGCGGCGACGCCGTTAATGCAGAAC 
C^GGTGGGATTrACCAATTGCCTGGTTTGAATTGGTATTCTTGAAAACAATTTACGATG 
AAGAATTTTTAAAATTTGTGTATATATATACGGTTTGAGTGATTAGGGGGCATTGGGGGA 
TTTATTTACGGTTGATTATTATTGTAGTGTTATAGAACTAAGGAGATTAAATTAAATAGA
TTGGAGGAAAAAAAAAAAAAAAAA 

>G1425 Amino Acid Sequence (domain in AA coordinates: 20-173) 
MESTDSSGGPPPPQPNLPPGFRFHPTDEELVIHYLKRKADSVPLPVAIIADVDLYKFDPW 
ELPAKASFGEQEWYFFSPRDRKYPNGARPNRAATSGYWKATGTDKPVISTGGGGSKKVGV

KKALVFYSGKPPKGVKSDWIMHEYRLTDNKPTHICDFGNKKNSLRLDDWVLCRIYKKNNS
TASRHHHHLHHIHLDNDHHRHDMMIDDDRFRHVPPGLHFPAIFSDNNDPTAIYDGGGGGY 
GGGSYSMNHCFASGSKQEQLFPPVMMMTSLNQDSGIGSSSSPSKRFNGGGVGDCSTSMAA 

TPLMQNQGGIYQLPGLNWYS* 
>G1449 (105..581)

TAGACAGAGAGAAATAGAAATAGAGAGAGAGAGACATGAAGAGCACTCTCAATAGAGAAG 
AGAAGGAAGCATGAAGCTAGCTCTGCAGCTTCAAGGTCTCATTAATGGAGGTCTCTAACT 
CTTGTTCTTCATTTTCTTCATCCTCTGTCGACAGTACTAAACCTTCTCCTTCTGAATCTT 
CTGTTAATCTCTCCCTTAGTCTCACATTTCCTTCTACTTCTCCACAAAGAGAAGCAAGAC 
AAGATTGGCCACCGATAAAGTCTAGATTAAGAGATACACTAAAGGGTCGTCGTCTTCTTC 
GTCGTGGTGATGACACTTCTCTCTTTGTTAAGGTTTATATGGAAGGTGTTCCCATTGGAA 
GAAAACTCGACCTTTGCGTATTCTCAGGCTACGAGAGTCTATTAGAAAATCTCTCTCACA 
TGTTCGATACTTCAATCATCTGCGGTAATCGAGATCGAAAACATCATGTTTTGACATATG 
AAGACAAGGATGGAGATTGGATGATGGTCGGAGATATTCCATGGGATATGTTTCTTGAAA 
CCGTGAGAAGACTAAAGATCACGAGACCGGAGAGGTATTAAAACTTGGATCGGTCAAGGC 
TGTGATTGCGCAGTTACGAGACGTGTAAGATTTAGGCATTGATGAAGAGACTTGAGGCGG 
GACGGAGCTATTGCTGCATATTGCAACAAAGGCCTTGAAGAAGTTGGAGAATTGATTGAT 
GCATATATTTATTTATATGACACCTTTGAGTGTGTTTTTTCTTATAAATAAATCACAATA 

TCCAAGACTTCTCTTTAAA 

>G1449 Amino Acid Sequence (domain in AA coordinates: 48-53, 74-107, 122-152)
MEVSNSCSSFSSSSVDSTKPSPSESSVNLSLSLTFPSTSPQREARQDWPPIKSRLRDTLK
GRRLLRRGDDTSLFVKVYMEGVPIGRKLDLCVFSGYESLLENLSHMFDTSIICGNRDRKH
HVLTYEDKDGDWMMVGDIPWDMFLETVRRLKITRPERY*
>G1897 (1..678) 

ATGCCTTCTGAATTCAGTGAATCTCGTCGGGTTCCTAAGATTCCCCACGGCCAAGGAGGA 

TCTGTTGCGATTCCGACGGATCAACAAGAGCAGCTTTCTTGTCCTCGCTGTGAATCAACC 

AACACCAAGTTCTGTTACTACAACAACTACAACTTCTCACAACCTCGTCATTTCTGCAAG

TCTTGTCGCCGTTACTGGACTCATGGAGGTACTCTCCGTGACATTCCCGTCGGTGGTGTT 

TCCCGTAAAAGCTCAAAACGTTCCCGGACTTATTCCTCTGCCGCTACCACCTCCGTTGTC 

GGAAGCCGGAACTTTCCCTTACAAGCTACGCCTGTTCTTTTCCCTCAGTCGTCTTCCAAC 

GGCGGTATCACGACGGCGAAGGGAAGTGCTTCGTCGTTCTATGGCGGTTTCAGCTCTTTG 

ATCAACTACAACGCCGCCGTGAGCAGAAATGGGCCTGGTGGCGGGTTTAATGGGCCAGAT 

GCTTTTGGTCTTGGGCTTGGTCACGGGTCGTATTATGAGGACGTCAGATATGGGCAAGGA 

ATAACGGTCTGGCCGTTTTCAAGTGGCGCTACTGATGCTGCAACTACTACAAGCCACATT 

GCTCAAATACCCGCCACGTGGCAGTTTGAAGGTCAAGAGAGCAAAGTCGGGT^ 

GGAGACTACGTAGCGTGA 

>G1897 Amino Acid Sequence (domain in AA coordinates: 34-62)

MPSEFSESRRVPKIPHGQGGSVAIPTDQQEQLSCPRCESTNTKFCYYNNYNFSQPRHFCK 

SCRRYWTHGGTLRDIPVGGVSRKSSKRSRTYSSAATTSVVGSRNFPLQATPVLFPQSSSN

GGITTAKGSASSFYGGFSSLINYNAAVSRNGPGGGFNGPDAFGLGLGHGSYYEDVRYGQG 

ITVWPFSSGATDAATTTSHIAQIPATWQFEGQESKVGFVSGDYVA* 

>G2143 (89..784)

TCTTCTTCTTCCTCCATACCTTATCTCACCAGCTTCTCCATATCTCTCAAAGAAAAAACA 
AACCCTATAAATTCCACAAAAAAGGAGGATGGATAACTCCGACATTCTAATGAACATGAT 
GATGCAGCAGATGGAGAAGCTTCCTGAACACTTCTCTAACTCAAACCCTAACCCTAATCC 
CCATAACATTATGATGCTTTCTGAATCCAACACCCACCCGTTCTTCTTCAACCCCACTCA 
TTCTCATCTCCCATTTGACCAAACCATGCCTCAC^ 

CGCCCCCrTCCCCGTCATCATCTCTCCCGGAGAAGAGAGGAGGCTGCAGCGACAACGCCAA 
CATGGCGGCGATGAGAGAGATGATCTTTCGAATAGCCGTGATGCAGCCTATACATATTGA 
TCCGGAATCCGTAAAGCCACCAAAGAGAAAGAACGTGAGGATCTCTAAGGATCCACAGAG 
CGTGGCAGCTCGGCATCGAAGGGAGAGGATAAGCGAGCGGATTCGGATTCTTCAGCGGCT 
TGTTCCCGGTGGGACTAAGATGGATACGGCGTCGATGCTCGATGAGGCTATCCATTACGT 
TAAGTTTCTCAAGAAGCAAGTGCAGTCGCTGGAGGAACATGCGGTGGTTAACGGCGGAGG

AATGACGGCGGTGGCCGGAGGAGCACTTGCGGGTACTGTTGGTGGAGGATATGGAGGAAA 

AGGGTGTGGCATTATGCGGTCTGATCATCACCAGATGCTTGGAAATGCACAGATTCTTAG 

ATGATGATGATGTTGATTTTTAAATATATATCATATGTTTATTAATATGACGGGAAAAAA

TATTATCGAGGGAGTTGAATTTAGTATCATGAAACTATGAGAGCATTTTTTTTAAATGTT 

TTTATCTTTCCGGGTTTCGATAATGTTTGGGATGGTTAATTAACAATTTAAAAGTCAGAC 

AACTTGGTTGTAAAGACTAAAGAATAAGCATAGTTTATCAATTTATCATTACTAAATGAA 
ATAG 

>G2143 Amino Acid Sequence (domain in aa coordinates: 128-179) 

MDNSDILMNMMMQQMEKLPEHFSNSNPNPNPHNIMMLSESNTHPFFFNPTHSHLPFDQTM 

PHHQPGLNFRYAPSPSSSLPEKRGGCSDNANMAAMREMIFRIAVMQPIHIDPESVKPPKR 

KNVRISKDPQSVAARHRRERISERIRILQRLVPGGTKMDTASMLDEAIHYVKFLKKQVQS

LEEHAVVNGGGMTAVAGGALAGTVGGGYGGKGCGIMRSDHHQMLGNAQILR*

>G2535 (1..1005) 

ATGAACATATCAGTAAACGGACAGTCACAAGTACCTCCTGGCTTTAGGTTTCACCCAACC 
GAGGAAGAGCTCTTGAAGTATTACCTCCGCAAGAAAATCTCTAACATCAAGATCGATCTC 
GATGTTATTCCTGACATTGATCTCAACAAGCTCGAGCCTTGGGATATTCAAGAGATGTGT 
AAGATTGGAACGACGCCGCAAAACGATTGGTACTTTTATAGCCATAAGGACAAGAAGTAT 
CCCACCGGGACTAGAACCAACAGAGCCACCACGGTCGGATTTTGGAAAGCGACGGGACGT 
GACAAGACCATATATACCAATGGTGATAGAATCGGGATGCGAAAGACGCTTGTCTTCTAC 
AAAGGTCGAGCCCCTCATGGTCAGAAATCCGATTGGATCATGCACGAATATAGACTCGAC
GAGAGTGTATTAATCTCCTCGTGTGGCGATCATGACGTCAACGTAGAAACGTGTGATGTC 
ATAGGAAGTGACGAAGGATGGGTGGTGTGTCGTGTTTTCAAGAAAAATAACCTTTGCAAA 
AACATGATTAGTAGTAGCCCGGCGAGTTCGGTGAAAACGCCGTCGTTCAATGAGGAGACT 
ATCGAGCAACTTCTCGAAGTTATGGGGCAATCTTGTAAAGGAGAGATAGTTTTAGACCCT 
TTCTTAAAACTCCCTAACCTCGAATGCCATAACAACACCACCATCACGAGTTATCAGTGG 
TTAATCGACGACCAAGTCAACAACTGCCACGTCAGCAAAGTTATGGATCCCAGCTTCATC 
ACTAGCTGGGCCGCTTTGGATCGGCTCGTTGCCTCACAGTTAAATGGGCCCAACTCGTAT 
TCAATACCAGCCGTTAATGAGACTTCACAATCACCGTATCATGGACTGAACCGGTCCGGT 
TGTAATACCGGTTTAACACCAGATTACTATATACCGGAGATTGATTTATGGAACGAGGCA 
GATTTCGCGAGAACGACATGCCACTTGTTGAACGGTAGTGGATAA 

>G2535 Amino Acid Sequence (conserved domain in AA coordinates: 11-114)

MNISVNGQSQVPPGFRFHPTEEELLKYYLRKKISNIKIDLDVIPDIDLNKLEPWDIQEMC 

KIGTTPQNDWYFYSHKDKKYPTGTRTNRATTVGFWKATGRDKTIYTNGDRIGMRKTLVFY

KGRAPHGQKSDWIMHEYRLDESVLISSCGDHDVNVETCDVIGSDEGWVVCRVFKKNNLCK

NMISSSPASSVKTPSFNEETIEQLLEVMGQSCKGEIVLDPFLKLPNLECHNNTTITSYQW

LIDDQVNNCHVSKVMDPSFITSWAALDRLVASQLNGPNSYSIPAVNETSQSPYHGLNRSG

CNTGLTPDYYIPEIDLWNEADFARTTCHLLNGSG*

>G2557 (94..1215)

TCGACTTCCTGTGAACTCATCTGTTTGTC 

GCCGTTATTACAACGAGGATTGTGTTTGATCCGATGGAAGGATTGGAATCTGTGTACGCT 
CAAGCTATGTATGGAATGACACGAGAGAGCAAAATCATGGAGCATCAAGGATCAGATTTG 
ATTTGGGGAGGAAATGAGCTAATGGCTCGAGAACTCTGTTCTTCTTCTTCTTATCACCAC 
CAACTCATTAATCCGAATCTTAGCAGCTGTTTCATGTCTGATCTTGGAGTCTTAGGTGAG 
ATTCAACAGCAGCAACATGTTGGCAACAGAGCTAGCTCGATAGATCCATCATCACTCGAT 
TGTTTGTTATCTGCGACGTCGAATAGCAACAACACCTCGACGGAGGACGATGAAGGAATA 
TCTGTGCTTTTCTCAGATTGTCAGACTCTTTGGAGCTTTGGTGGAGTCTCATCTGCAGAG 
TCTGAGAACAGAGAGATCACTACTGAGACGACAACAACGATAAAGCCTAAGCCTTTGAAG 
AGAAACAGAGGAGGAGATGGAGGAACTACTGAGACTACAACAACAACAACAAAACCTAAG
TCTTTGAAGAGAAACAGAGGAGACGAGACAGGAAGTCACTTTAGTCTTGTTCATCCTCAA 
GATGATTCGGAGAAAGGAGGTTTCAAGCTTATATACGATGAGAATCAATCGAAATCAAAG 
AAACCAAGAACAGAGAAAGAACGAGGCGGTTCTTCGAACATTAGTTTCCAACATTCAACT
TGTTTGTCTGACAATGTCGAGCCCGATGCTGAGGCGATTGCACAAATGAAGGAGATGATA 
TACAGAGCGGCTGCATTTAGACCGGTGAATTTCGGGTTAGAGATTGTGGAGAAGCCTAAG 
AGGAAGAACGTCAAGATATCGACGGATCCTCAAACGGTTGCAGCGAGACAGAGAAGGGAG 
AGGATAAGTGAGAAGATTAGGGTTTTACAAACATTGGTTCCAGGTGGGACGAAGATGGAT 
ACTGCATCAATGCTTGATGAAGCTGCTAATTATCTCAAGTTCCTTAGAGCACAAGTAAAA 
GCTTTAGAAAACTTGAGACCCAAGCTTGACCAAACCAATCTCTCTTTCTCTTCTGCTCCT
ACATCGTTTCCATTATTCCACCCATCTTTTCTTCCATTGCAAAATCCTAATCAAATCCAT 
CATCCAGAGTGTTGACAGATTATAAACTTTTGAGTTTCATCATCATCAACAGAATCATGG 
CGTCTTGATTGTTTTAGCAGTTCTCAAGAAAGGCAACTTCTGTGACAAGGGTGGTGTCGG 
GCAGTGTTGTTTACACTTTCCAGTCTTTGTTTTGCATTTCTTTTTATATAAAGTTTGTAT 
TTTATATAGAATCTGTGGAATTCGAGGGTTGAAATATTGTGAAAAACAGAGCCGCAAGAG 
GTTAATTACAGTCTCTGCAATATTTTCAACCTTTTATTACTTTATTAGAGTAAAGATAGC 

GT 

>G2557 Amino Acid Sequence (domain in aa coordinates: 278-328) 

MEGLESVYAQAMYGMTRESKIMEHQGSDLIWGGNELMARELCSSSSYHHQLINPNLSSCF 

MSDLGVLGEIQQQQHVGNRASSIDPSSLDCLLSATSNSNNTSTEDDEGISVLFSDCQTLW

SFGGVSSAESENREITTETTTTIKPKPLKRNRGGDGGTTETTTTTTKPKSLKRNRGDETG 

SHFSLVHPQDDSEKGGFKLIYDENQSKSKKPRTEKERGGSSNISFQHSTCLSDNVEPDAE 

AIAQMKEMIYRAAAFRPVNFGLEIVEKPKRKNVKISTDPQTVAARQRRERISEKIRVLQT 

LVPGGTKMDTASMLDEAANYLKFLRAQVKALENLRPKLDQTNLSFSSAPTSFPLFHPSFL

PLQNPNQIHHPEC* 
>G259 (52..786)

GAGATCTTCTACTACTTGTTTTCTTCAAGAATAATAATTTTCGTTTTATATATGGAAGAT 
GCTGGTGAACATTTACGGTGTAACGATAACGTTAACGACGAGGAGCGTTTGCCATTGGAG 
TTTATGATCGGAAACTCAACATCCACGGCGGAGCTACAGCCGCCTCCACCGTTCTTGGTA 
AAGACATACAAAGTGGTGGAGGATCCGACGACGGACGGGGTTATATCTTGGAACGAATAC 
GGAACTGGTTTCGTCGTGTGGCAGCCGGCAGAATTCGCTAGAGATCTGTTACCAACACTT 
TTCAAGCATTGCAACTTCTCTAGCTTCGTTCGCCAGCTCAATACTTACGGTTTTCGAAAA 
GTAACGACGATAAGATGGGAATTTAGTAATGAGATGTTTCGAAAGGGGCAAAGAGAGCTT 
ATGAGCAATATCCGAAGAAGGAAGAGCCAACATTGGTCACACAACAAGTCTAATCACCAG 
GTTGTACCAACAACAACGATGGTGAATCAAGAAGGTCATCAACGGATTGGGATTGATCAT 
CACCATGAGGATCAACAGTCTTCCGCCACTTCATCCTCTTTCGTATACACTGCATTACTC 
GACGAAAAC^^TGCTTGAAGAATGAAAACGAGTTATTAAGCTGCGAACTTGGGAAAACC 
AAGAAGAAATGCAAGCAGCTTATGGAGTTGGTGGAGAGATACAGAGGAGAAGACGAAGAT 
GCAACTGATGAAAGTGATGATGAAGAAGATGAAGGGCTTAAGTTGTTCGGAGTAAAACTT 
GAATGAAACTAGATTGCTAGATTGATATTCGTAATATACCAGTTTCTTCATATTCTTAGA 
AGTTTTGCATAACTATATATAGTACTCTTTTAAGACATGCAAGATCAGAACATATG 

>G259 Amino Acid Sequence (domain in AA coordinates: 27-131) 
MEDAGEHLRCNDNVNDEERLPLEFMIGNSTSTAELQPPPPFLVKTYKVVEDPTTDGVISW
NEYGTGFVVWQPAEFARDLLPTLFKHCNFSSFVRQLNTYGFRKVTTIRWEFSNEMFRKGQ
RELMSNIRRRKSQHWSHNKSNHQVVPTTTMVNQEGHQRIGIDHHHEDQQSSATSSSFVYT
ALLDENKCLKNENELLSCELGKTKKKCKQLMELVERYRGEDEDATDESDDEEDEGLKLFG

VKLE* 

>G353 (82..570)

ACCAAACTCAAAAAACACAAACCACAAGAGGATCATTTCATTT^ 

ATCATCATCATCAGAAGAAAAATGGTTGCGATATCGGAGATCAAGTCGACGGTGGATGTC 

ACGGCGGCGAATTGTTTGATGCTTTTATCTAGAGTTGGACAAGAAAACGTTGACGGTGGC 

GATCAAAAACGCGTTTTCACATGTAAAACGTGTTTGAAGCAGTTTCATTCGTTCCAAGCC

TTAGGAGGTCACCGTGCGAGTCACAAGAAGCCTAACAACGACGCTTTGTCGTCTGGATTG 

ATGAAGAAGGTGAAAACGTCGTCGCATCCTTGTCCCATATGTGGAGTGGAGTTTCCGATG 

GGACAAGCTTTGGGAGGACACATGAGGAGACACAGGAACGAGAGTGGGGCTGCTGGTGGC 

GCGTTGGTTACACGCGCTTTGTTGCCGGAGCCCACGGTGACTACGTTGAAGAAATCTAGC

AGTGGGAAGAGAGTGGCTTGTTTGGATCTGAGTCTAGGGATGGTGGACAATTTGAATCTC 

AAGTTGGAGCTTGGAAGAACAGTTTATTGATTTTATTTATT^ 

ATATTTGTTTCTCTCATTCTTTGAATTTTTCTTAATATTCTAGATTATACATACATC 

AGATTTAGGAAACTTTCATAGAGTGTAATCTTTTCTTTCTGTAAAAATATAT^ 

TAGCAAA 

>G353 Amino Acid Sequence (domain in aa coordinates: 41-61, 84-104) 
MVAISEIKSTVDVTAANCLMLLSRVGQENVDGGDQKRVFTCKTCLKQFHSFQALGGHRAS

HKKPNNDALSSGLMKKVKTSSHPCPICGVEFPMGQALGGHMRRHRNESGAAGGALVTRAL
LPEPTVTTLKKSSSGKRVACLDLSLGMVDNLNLKLELGRTVY*
>G354 (27.. 533) 
CCTAGAAGTCACTAAGTCGATTCAAAATGGTTGCGAGAAGTGAGGAAATTGTGATAGTGG

AAGAAGATACGACTGCGAAATGTTTGATGTTGTTATCAAGAGTCGGAGAATGCGGCGGCG 

GCTGCGGGGGAGATGAACGTGTTTTCCGATGCAAGACTTGTCTTAAAGAGTTCTCATCGT 

TTCAAGCTTTGGGAGGTCATCGTGCAAGCCACAAGAAACTTATCAACAGTGACAATCCAT 

CACTTCTTGGATCCTTGTCCAACAAGAAAACTAAAACGTCTCATCCTTGTCCGATATGTG 

GAGTGAAGTTTCCGATGGGACAAGCTCTTGGTGGTCACATGAGGAGACATAGGAACGAGA 

AAGTCTCAGGCTCGTTGGTTACACGTTCTTTTCTACCGGAGACGACGACGGTGACGGCTT 

TGAAGAAATTTAGTAGTGGGAAGAGAGTGGCTTGTTTGGATTTGGACTTAGATTCGATGG 

AGAGTTTGGTCAATTGGAAGTTGGAGTTGGGAAGAACGATTTCTTGGAGTTAAGTTTTTC 

GGTTGTATACAGTTTCACATGATTTTGTAATCTTTGTTGATCCAATTATCGTACCGATCG 
ATGTGAATATTATTTTGATACAATAAAA 

>G354 Amino Acid Sequence (domain in aa coordinates: 4?-6? 88-109)

MVARSEEIVIVEEDTTAKCLMLLSRVGECGGGCGGDERVFRCKTCLKEFSSFQALGGHRA

SHKKLINSDNPSLLGSLSNKKTKTSHPCPICGVKFPMGQALGGHMRRHRNEKVSGSLVTR 

SFLPETTTVTALKKFSSGKRVACLDLDLDSMESLVNWKLELGRTISWS* 

>G638 (86..1861)

GAATTAAAAGGTTTAACCTTTACCTTTTTTTCCCTTCACTATCGATAATTGATCTTCTCT 
TTCGGCTGAATATAAATCTGAAAAAATGGATCAAGATCAGCATCCTCAGTACGGTATACC 
GGAGCTCCGGCAGCTCATGAAAGGCGGAGGAAGGACGACTACTACAACACCGTCTACTTC 
TTCTCATTTTCCCTCTGATTTCTTCGGTTTTAACCTTGCTCCGGTGCAGCCACCGCCACA 
CCGTCTTCATCAGTTCACTACTGATCAAGATATGGGTTTCTTGCCACGTGGCATACATGG 
ATTGGGTGGAGGTTCTTCAACGGCTGGAAATAACAGTAACTTAAACGCGAGTACTAGTGG 
TGGAGGAGTTGGGTTTAGTGGGTTTCTTGACGGTGGTGGTTTCGGCAGCGGAGTAGGAGG 
AGACGGTGGAGGAACTGGAAGGTGGCCGAGACAAGAAACCCTAACTCTGTTGGAAATTAG 
ATCTCGTCTTGATCATAAATTCAAAGAAGCTAATCATAAAGGACCTCTTTGGGATGAAGT 
TTCTAGGATTATGTCCGAGGAACATGGATACCAAAGGAGTGGGAAGAAATGCAGAGAGAA 
GTTTGAGAATCTGTACAAATACTATAGTAAGACTAAAGAAGGCGAAGeCGGAAGACAAGA 
CGGAAAACATCACAGATTTTTCCGCCAGCTCCAAGCGCTATACGGGGATTCTAATAACTT 
GGTTTCTTGTCCCAATCATAACACGCAGTTCATGAGCAGTGCTCTTCATGGTTTCCATAC 
TCAAAACCCTATGAACGTTGCTACAACAACGTCCAACATCCATAACGTTGATAGTGTTCA 

GACTTCCTCTTCGGAAGGGAATGATTCTAGTAGTAGAAGGAAAAAGAGGAGTTGGAAAGC 

GCTTGAGAAGTTGACAAAGGTTATTGAAGACAAAGAGGAACAACGGATGATGAAAG^ 

ggaatggaggaagattgaagctccaaggattgataaagagcatttgttSgggctS 
^ccatcgataaagccgctgtgttcatccccggaagagagga 

GATCCGAAACAATAGTGAGACACA^^^ 

attatgcaaggaaaagtgggaatggataagcaatggaatgaggaaagaaaaga^gc 
caacaagaaaagaaaggataattcgtcc^^ 

aaatccaatctacaataatcgagaaagtggatataatoataatgatccgcat^ 
cgaacaaggcaatgtaggttcttcaacatcaaacgcaaacgcaaacgcaScg^cc^ 

TCGAAATCCGAGCGGTGCAATGGCTGCTAGTACyu^CTGCTTCCCGTTCTTCATGGGA^A 
TOGaQATCaQAATTTCTQGGAGft^ 

AGTAATrrCTCTTAATGAAGAAGAAGAAGGTAATCATGTGGTTAACTA^ 
AG ™™ GTOGGGGOTA ^^ 

AATTATAT<^AATTAGGCTTTAACa^CGTACGATTATATATTATGTTTTCATGTSCTA 
TTCTGTAAGACTTTTTAATATCAATCTTTCTCTAAA ITTCATGTATTTA 
>G638 Amino Acid Sequence (domain in AA coordinates: 119-206)
^QDQHPQYGIPELRQLMKGGGRTTTTTPSTSSHFPSDFFGFNIAPVQPPPHRiHQF^ 

prSS 
TTSNIHNVDSVHGFHQSLSLSNNYNSSELELMTSSSEGNDSSSRRKKRSWKAKIKEFIDT

NMKRLIERQDWLEKLTKVIEDKEEQRMMKEEEWRKIEAARIDKEHLFWAKERARMEARD 

VAVIEALQYLTGKPLIKPLCSSPEERTNGNNEIRNNSETQNENGSDQTMTNWVCVKGSSS

CWGEQEILKLMEIRTSMDSTFQEILGGCSDEFLWEEIAAKLIQLGFDQRSALLCKEKWEW 

ISNGMRKEKKQINKKRKDNSSSCGVYYPRNEENPIYWNRESGYNDNDPHQINEQGNVGSS 

TSNANAJWANVTTGNPSGAMAASTNCFPFFMGDGDQNLWESYGLRLSICEENQ* 

>G869 (428..1402)

AGGAACAGTGAAAGGTTCGGTTTTTTGGGTTTCGATCTGATAATCAACAAGAAAAAAGGG 
TTTGATTTATGTCGGCTGGGTTTGAATCGACTGTGATTTTGTCTTTGATTCATATCTCTT 
CTCCGATTTCATCATCATCTTCCCCATCATCGTCGTCTTTGAAATCTTGTCTTCTCAACG 
CTCTTCACTTCTGCTGTAATAAGCAGAGGCTTGTTCTGGAGACTCCTTCTCTTTCCATGC 
GCTTAAGACCCAAAAGGACTTGTTCTAGTGTTGAAGTCTTTGGGGGTTTTCACATAAAGC 
AGCAAAAGTTTTCTTTTTTCATAGTTCGCTGAGAGTTTTGAGTTTTGATACCAAAAAAGT 
TTTGACCTTTTAGAGTGATTTTTTGTTCTTTCTGTTTTCTGGGTATTTTTGAGGAGTGGG 
TTTAACAATGGTTGCGATTAGAAAGGAACAGTCTTTGAGTGGTGTTAGTAGCGAGATTAA 
GAAGAGAGCTAAGAGAAACACTCTATCGTCCCTTCCTCAAGAAACCCAACCTTTGAGGAA 
AGTCCGTATTATTGTGAATGATCCTTATGCTACTGATGATTCCTCTAGTGATGAGGAAGA 
GCTTAAGGTTCCTAAGCCAAGGAAAATGAAACGTATCGTTCGTGAGATTAACTTTCCTTC 
TATGGAAGTTTCTGAACAGCCTTCTGAGAGTTCTTCTCAGGACAGTACTAAAACTGATGG 
CAAGATAGCTGTGTCAGCTTCTCCTGCTGTTCCTAGGAAGAAGCCTGTTGGTGTTAGGCA 
AAGGAAATGGGGGAAATGGGCTGCTGAGATTAGAGATCCTATTAAGAAAACTAGGACTTG 
GTTGGGTACTTTTGATACTCTTGAAGAAGCTGCTAAAGCTTATGATGCTAAGAAGCTTGA 
GTTTGATGCTATTGTTGCTGGAAATGTGTCCACTACTAAACGTGATGTTTCTTCATCTGA 
GACTAGCCAATGCTCTCGTTCTTCACCTGTTGTTCCTGTTGAGCAAGATGACACTTCTGC 
ATCAGCTCTCACTTGTGTCAACAACCCTGATGACGTCTCGACCGTTGCTCCAACTGCTCC 
AACTCCAAATGTTCCTGCTGGTGGAAACAAGGAAACGTTGTTCGATTTCGACTTTACTAA 
TCTACAGATCCCTGATTTTGGTTTCTTGGCAGAGGAGCAACAAGACCTAGACTTCGATTG 
TTTCCTCGCGGATGATCAGTTTGATGATTTCGGCTTGCTTGATGACATTCAAGGATTCGA 
AGATAACGGTCCAAGTGCGTTACCAGATTTCGACTTTGCGGATGTTGAAGATCTTCAGCT 
AGCTGACTCTAGTTTCGGTTTCCTTGATCAACTTGCTCCTATCAACATCTCTTGCCCATT 
AAAAAGTTTTGCAGCTTCATAGGATCTTGCTTAGTAATGTTAAGTGAGAAGAGTGTTTTG 
TTTTTTCGTTTATGCTTTAGTAATTTAAGACATACAAAAGTGTGTGTTCCGGATTGTAGT 
AAGATCTTAAGACATAAAGCCGGGTTTTGCAATTAGGAATCGAGTTTTT^ATGAAGTTTTA 
GTTTATGTTTG 

>G869 Amino Acid Sequence (domain in AA coordinates: 109-177) 

MVAIRKEQSLSGVSSEIKKRAKRNTLSSLPQETQPLRKVRIIVNDPYATDDSSSDEEELK 

VPKPRKMKRIVREINFPSMEVSEQPSESSSQDSTKTDGKIAVSASPAVPRKKPVGVRQRK 

WGKWAAEIRDPIKKTRTWLGTFDTLEEAAKAYDAKKLEFDAIVAGNVSTTKRDVSSSETS

QCSRSSPVVPVEQDDTSASALTCVNNPDDVSTVAPTAPTPNVPAGGNKETLFDFDFTNLQ

IPDFGFLAEEQQDLDFDCFLADDQFDDFGLLDDIQGFEDNGPSALPDFDFADVEDLQLAD

SSFGFLDQLAPINISCPLKSFAAS* 

>G1645 (25..1104)

CGTCGACCTCCCAACACTAACTCCATGTTTATAACGGAAAAACAAGTGTGGATGGATGAG
ATCGTCGCAAGAAGAGCTTCTTCTTCTTGGGACTTCCCTTTCAACGACATTAATATTCAT 
C^GCATCATCATCGTCACTGCAACACAAGTCT^ 

GGAGATGTAGCGGTTCACGAAGAAGAGAGTAATAATAATAACCCTAATTTCAGTAACAGC 
GAGAGTGGTAAGAAGGAGACAACAGATAGTGGTCAGTCTTGGTCCTCGTCGTCTTCAAAA 
CCATCGGTCTTGGGGAGAGGACATTGGAGACCAGCTGAAGATGTTAAACTCAAAGAGCTT
GTCTCCATTTACGGCCCACAAAACTGGAACCTCAT^^ 

GGGAAGAGCTGTAGACTACGATGGTTTAACCAATTGGACCCGAGGATAAACCGAAGAGCT 
TTCACAGAAGAAGAAGAGGAGAGGCTGATGCAAGCACATAGGCTTTATGGTAACAAATGG
GCAATGATTGCGAGGCTTTTCCCTGGTAGAACTGATAATTCAGTGAAGAACCATTGGCAT 
GTTGTCATGGCTCGTAAGTATAGAGAACACTCnTCTC 

AGTAATAATCCACTTAAACCTCACCTCACCAATAATCATCATCCTAACCCTAACCCTAAT 
TACCACTCTTTTATCTCCACTAATCATTACTTCGCTCAGCCTTO 

ACTCATCACCTGGTTAATAATGCCCCTATCACGAGTGACCATAACCAGCTTGTGTTGCCT 
TTCCATTGCTTTCAAGGTTATGAGAACAATGAACCTCCGATGGTTGTGAGTATGTTTGGC 
AACCAAATGATGGTCGGCGATAACGTTGGTGCCACGTCAGACGCGTTATGCAATATTCCG
CACATTGACCCTAGTAACCAAGAGAAACCGGAGCCAAATGATGCAATGCATTGGATCGGA 
ATGGACGCGGTAGATGAGGAGGTGTTCGAAAAGGCTAAGCAGCAACCACATTTTTTCGAT 
TTTCTTGGCTTGGGGACGGCGTGAATGTTGAACAAATTGGTGTTAATCAGATAACGACAG 

TGGC 

>G1645 Amino Acid Sequence (domain in AA coordinates: 90-210) 

MFITEKQVWMDEIVARRASSSWDFPFNDINIHQHHHRHCNTSHEFEILKSPLGDVAVHEE

ESNNNNPNFSNSESGKKETTDSGQSWSSSSSKPSVLGRGHWRPAEDVKLKELVSIYGPQN 

WNLIAEKLQGRSGKSCRLRWFNQLDPRINRRAFTEEEEERLMQAHRLYGNKWAMIARLFP 

GRTDNSVKNHWHVVMARKYREHSSAYRRRKLMSNNPLKPHLTNNHHPNPNPNYHSFISTN

HYFAQPFPEFNLTHHLVNNAPITSDHNQLVLPFHCFQGYENNEPPMVVSMFGNQMMVGDN

VGATSDALCNIPHIDPSNQEKPEPNDAMHWIGMDAVDEEVFEKAKQQPHFFDFLGLGTA*

>G1038 (240..1574)

GCTCGTTTTCAAATTAAAAACAGGGAGAAATTTGGAAATTCCAGTACGACGGGAGATAAA 

ACCTAACATACGCCATGGTGACCGTTATCTAAACTACGCCAAAATATTTGAAGTGTCGTC 

GTTTCATAATAAAACGCAAACAAAAACCCACTCCCACTTTCTCCTTTCCAAAAAAAGAAC 

TCTCGCCACTTTCTCTGCTCTTTTCTTTCTCTCTCTCTTTCTTGTTTTCGCCGGCGATCA 

TGGAGAAAAGCGGCTTCTCTCCCGTCGGTCTAAGGGTTCTTGTCGTAGACGATGATCCAA 

CTTGGCTCAAGATTCTCGAGAAAATGCTCAAGAAGTGTTCTTACGAAGTAACGACCTGTG 

GATTAGCTAGAGAGGCTTTGAGGTTGCTGAGGGAGCGTAAAGATGGATATGATATCGTGA 

TCAGCGATGTGAACATGCCTGACATGGATGGTTTCAAGCTTCTTGAGCATGTTGGTCTTG 

AATTAGACCTCCCTGTAATAATGATGTCGGTGGACGGCGAAACAAGCCGAGTGATGAAGG 

GAGTGCACACGGGAGCTTGTGATTACCTCTTGAAGCCGATAAGAATGAAGGAGTTAAAGA 

TTATATGGCAACATGTTCTGAGAAAGAAGCTTCAAGAAGTGAGAGATATCGAAGGCTGTG 

GATACGAAGGAGGAGCGGATTGGATCACTCGATACGATGAAGCACATTTTCTTGGAGGTG 

GTGAAGATGTTTCTTTTGGGAAAAAGAGAAAAGACTTTGACTTTGAGAAGAAGCTTCTTC 

AAGATGAGAGTGATCCATCATCTTCTTCTTCCAAGAAAGCTAGAGTTGTTTGGTCTTTTG 

AGCTTCATCATAAGTTTGTCAACGCCGTTAACCAAATCGGATGCGATCACAAAGCTGGTC 

CCAAGAAGATATTGGATCTCATGAATGTTCCATGGCTCACTAGAGAAAATGTTGCAAGCC 

ACCTTCAGAAATATAGACTTTACCTGAGCAGATTAGAGAAAGGAAAGGAGCTCAAGTGTT 

ATTCAGGTGGCGTGAAGAATGCGGATTCATCTCCAAAAGATGTCGAAGTGAATTCAGGCT 

ACCAAAGCCCTGGGAGGAGCAGCTATGTATTCTCTGGAGGAAATTCTCTGATCCAAAAAG 

CAACAGAGATTGATCCAAAGCCACTTGCTTCAGCTTCTTTGTCTGACCCCAACACCGATG 

TGATCATGCCTCCGAAAACAAAAAAGACGCGTATAGGATTTGATCCTCCCATTTCCTCCT 

CTGCGTTTGACTCTCTGCTTCCTTGGAATGATGTTCCAGAGGTCCTTGAATCGAAGCCGG 

TTCTGTATGAGAATAGCTTTCTCCAGCAACAACCATTGCCAAGTCAAAGTTCCTATGTTG 

CAATTTCTGCACCATCTCTCATGGAGGAGGAAATGAAGCCTCCTTATGAGACACCAGCAG 

GAGGCAGTAGTGTGAATGCAGATGAGTTTCTCATGCCACAAGACAAGATCCCTACTGTAA 

CCCTTC^GATTTGGATCCCTCTGCC^TGAAGCTGCAGGAGTTCAACACAGAAGGCGA 

CTGAAGAAGCTTGAACTGGGGAACTTCCAGAATCACATCATTCTGTTTCTTTAGACACTG 

ACTTAGACTTGACTTGGCTTCAAGGCGAGCGTTTCTTGCAAACACCGACTCCAGTTTCAA 

GATACAGTAGTAGCCCATCACTCCTATCTGAGCTCCC^GCCCACCTTAATTGGTATGGAA 

ATGAGCGGCTGCCTGACCCTGACGAGTATTCCTTCATGGTAGACCAAGGTTTATTCATAT 

CTTAACCTTGTTCCAATAACTTCTTTTCGTATATTGGTTGGTGTAATGCAGAAAGATTTT 

GTGGGTATACCTGAAAATAATCTTGCTTTCCCAAGAACCTTCCATGATCGGATGCATTGT 

ACAATAATCCACGAGTGTCGTAGGCTAATTACACCAAACAGGTTGATGACAGTGATAAGG 

CCACATGTTTCACACCGTCGCTTAAGATCTTTACTGTCACCTGGAAGGAAA 

>G1038 Amino Acid Sequence (domain in AA coordinates: 198-247) 

MEKSGFSPVGLRVLVVDDDPTWLKILEKMLKKCSYEVTTCGLAREALRLLRERKDGYDIV 

ISDVNMPDMDGFKLLEHVGLELDLPVIMMSVDGETSRVMKGVHTGACDYLLKPIRMKELK

IIWQHVLRKKLQEVRDIEGCGYEGGADWITRYDEAHFLGGGEDVSFGKKRKDFDFEKKLL
QDESDPSSSSSKKARVVWSFELHHKFVNAVNQIGCDHKAGPKKILDLMNVPWLTRENVAS

HLQKYRLYLSRLEKGKELKCTSGGVKNADSSPKDVEVNSGYQSPGRSSYVFSGGNSLIQK 
ATEIDPKPLASASLSDPNTDVIMPPKTKKTRIGFDPPISSSAFDSLLPWNDVPEVLESKP 
VLYENSFLQQQPLPSQSSYVAISAPSLMEEEMKPPYETPAGGSSVNADEFLMPQDKIPTV 

TLQDLDPSAMKLQEFNTEGDSEEA* 
>G1073 (62..874)
CCCCCCGACCTGCCTCTACAGAGACCTGAAGATTCCAGAACCCCACCTGATCAAAAATAA

CATGGAACTTAACAGATCTGAAGCAGACGAAGCAAAGGCCGAGACCACTCCCACCGGTGG 

AGCCACCAGCTCAGCCACAGCCTCTGGCTCTTCCTCCGGACGTCGTCCACGTGGTCGTCC 

TGCAGGTTCCAAAAACAAACCCAAACCTCCGACGATTATAACTAGAGATAGTCCTAACGT 

CCTTAGATCACACGTTCTTGAAGTCACCTCCGGTTCGGACATATCCGAGGCAGTCTCCAC 

CTACGCCACTCGTCGCGGCTGCGGCGTTTGCATTATAAGCGGCACGGGTGCGGTCACTAA 

CGTCACGATACGGCAACCTGCGGCTCCGGCTGGTGGAGGTGTGATTACCCTGCATGGTCG 

GTTTGACATTTTGTCTTTGACCGGTACTGCGCTTCCACCGCCTGCACCACCGGGAGCAGG 

AGGTTTGACGGTGTATCTAGCCGGAGGTCAAGGACAAGTTGTAGGAGGGAATGTGGCTGG 

TTCGTTAATTGCTTCGGGACCGGTAGTGTTGATGGCTGCTTCTTTTGCAAACGCAGTTTA 

TGATAGGTTACCGATTGAAGAGGAAGAAACCCCACCGCCGAGAACCACCGGGGTGCAGCA 

GCAGCAGCCGGAGGCGTCTCAGTCGTCGGAGGTTACGGGGAGTGGGGCCCAGGCGTGTGA 

GTCAAACCTCCAAGGTGGAAATGGTGGAGGAGGTGTTGCTTTCTACAATCTTGGAATGAA 

TATGAACAATTTTCAATTCTCCGGGGGAGATATTTACGGTATGAGCGGCGGTAGCGGAGG 

AGGTGGTGGCGGTGCGACTAGACCCGCGTTTTAGAGTTTTAGCGTTTTGGTGACACCTTT 

TGTTGCGTTTGCGTGTTTGACCTCAAACTACTAGGCTACTAGCTATAGCGGTTGCGAAAT 
GCGAATATTAGGTT 

>G1073 Amino Acid Sequence (domain in AA coordinates: 33-42 78-175) 
MELNRSEADEAKAETTPTGGATSSATASGSSSGRRPRGRPAGSKNKPKPPTIITRDSPNV
LRSHVLEVTSGSDISEAVSTYATRRGCGVCIISGTGAVTNVTIRQPAAPAGGGVITLHGR
FDILSLTGTALPPPAPPGAGGLTVYLAGGQGQVVGGNVAGSLIASGPVVLMAASFANAVY

DRLPIEEEETPPPRTTGVQQQQPEASQSSEVTGSGAQACESNLQGGNGGGGVAFYNLGMN 

MNNFQFSGGDIYGMSGGSGGGGGGATRPAF* 

>G1146 (129..3095)

cttctctagcgtcactcttcttcttcattggtcggtagaataaggccaaggaagggatca 

gttttaagttttgtttcattctttttgtagtggagaaaaagagtttttgaaaatcaaaac 
aacaaaaaatgccgattaggcaaatgaaagatagctctgagactcacttagttatcaaaa 
cccaacctttaaagcaccacaatccaaaaaccgttcaaaacggtaaaatccctcctcctt 
ctccttctccggtgacggtgactactccggcgacggttactcagagtcaagcttcttcac 
cttcaccaccgtcaaagaatcgtagccggaggagaaaccgtggtggaagaaaatctgatc 
aaggagatgtttgtatgagacctagctctcgtcctcgtaaaccgccaccgccaagtcaaa 
ccacttcctccgccgtctccgtcgccaccgccggtgagattgtcgctgtgaatcatcaga 
tgcagatgggtgttcgtaaaaactcaaactttgctccaagacctggatttggaacacttg
gaactaaatgcattgttaaagctaaccactttctcgctgatttgcctaccaaggatttga
atcagtatgatgttacaattactcctgaagtgtcatcaaagagtgttaacagagct— - 
ttgctgagttagttagactttacaaagagtctgatctcgggaggagacttccggcttacg 
atggccggaaaagtctttacactgctggagaacttccttttacttggaaggagttcagtg
ttaagattgttgatgaagatgacggtatcatcaatggccctaaaagggagagatcatata 
aggtggcaatcaagtttgttgcacgggcaaatatgcatcacttaggcgagtttctagctg
gtaaacgggcagattgtccgcaagaggcggtgcagattcttgatattgtactcagggagt 
tgtcggttaagaggttttgtcccgttggaagatctttcttttcgcctgatattaaaacac 
cgcagcgactcggtgaagggttagagtcatggtgtgggttttaccagagtattagaccaa 
ctcaaatgggtttatcactaaatatcgatatggcttcagctgcattcatcgagcctcttc 
cagtgatagagtttgtagcacagcttcttggaaaggatgtcttgtcgaagccattgtcgg
attctgatcgcgtcaagattaagaagggtcttagaggagtgaaagtagaggttactcaca 
gagcgaatgtaagaaggaaataccgtgttgcgggtttaacaactcaaccaacaagagagc
taatgtttccagtagatgagaactgtactatgaagtcagttattgagtatttccaagaga 
tgtatggattcacgatccagcacacgcatttgccatgtctccaagttggaaaccaaaaga
aggcaagctatttgccgatggaggcatgcaaaattgtcgagggacaacggtacacgaaaa 
ggttgaatgagaagcagattactgctctcttgaaagttacatgccaaagggccgagggac
agagaaacgatattttgcggactgtccaacacaacgcatatgatcaagatccatatgcaa 
aggagtttggcatgaacataagcgaaaagttagcttctgttgaagctcgtattcttccag
ctccatggcttaagtatcacgagaacgggaaagaaaaagattgtctcccgcaagttggtc
agtggaatatgatgaacaagaaaatgatcaacgggatgactgtgagcagatgggcctgtg 
ttaacttctcacgcagcgttcaagaaaacgttgctcgtggattttgtaatgaacttgatc
agatgtgtgaagtctcaggcatggagtttaatccagaacccgtgataccaatatatagtg
cgaggcccgatcaagtcgagaaagctctaaagcatgtttatcacacttcaatgaacaaaa 
ccaaaggcaaagagttagagcttctgctggcaatattacctgataacaacggttcacttt

atggtgatcttaagagaatctgtgaaaccgagcttggtttgatatctcaatgttgtctca 

caaaacatgtgttcaagattagcaaacagtatctggcagatgtatcccttaaaatcaacg 

taaagatgggaggaaggaacacagttctagtagacgccataagctgtagaattccactgg 

ttagcgatataccgacaatcatttttggcgcagacgtgactcacccagagaacggggaag 

agtcaagcccttcaatcgctgctgttgttgcttctcaagactggcctgaagtgacaaaat 

atgcgggtttagtttgtgctcaagctcacaggcaagaacttatacaagatttgtataaaa 

catggcaagatcctgttcgcggtactgttagtggcggtatgatcagggaccttcttatct 

catttagaaaagcaacagggcaaaaaccgcttcgaattatcttttatcgtgatggagtaa 

gcgaagggcaattctatcaagttttactctatgagttggatgcaattcgaaaggcttgtg 

catcgcttgaaccgaattatcagccaccggtgacattcatagttgtacagaagcgtcacc 

acactcgtttgtttgctaataatcaccgagacaaaaacagtactgaccgaagcggaaata 

tcttaccaggtactgtagttgacactaaaatatgtcatccaactgaattcgacttctacc 

tttgtagccatgcgggtattcagggaacaagcaggcctgcacattaccatgttctttggg 

acgagaacaatttcacagcagatggtattcaatctctgactaacaatctctgttatacct 

atgcgcggtgcactcggtcggtctctatagttcctccagcgtattatgctcatcttgcag 

catttcgagcacgtttctacctggaacctgagataatgcaagacaacggatcaccgggta 

aaaagaacacgaaaacaacaactgtcggagacgtaggtgtgaagcctttaccagccttga 

aggagaatgtgaagagagtaatgttctactgctaaaaatccaaacattccttaatcagtt 

ttaataagtagtttggttgtttgcttgtagttcggctttagatttaccaatgtttttctt 

atgtaaattttgtcggtttggtttaagcctttaggaattagtgtattagggtttttctaa 

agttgtactttagctgatgataacgttgatgcagtgactttgttaaaacctcctcttcta 
cagtagtgtttacgtcgttcctc 

>G1146 Amino Acid Sequence (domain in AA coordinates: 886-896) 

MPIRQMKDSSETHLVIKTQPLKHHNPKTVQNGKIPPPSPSPVTVTTPATVTQSQASSPSP 

PSKNRSRRRNRGGRKSDQGDVCMRPSSRPRKPPPPSQTTSSAVSVATAGEIVAVNHQMQM 

GVRKNSNFAPRPGFGTLGTKCIVKANHFLADLPTKDLNQYDVTITPEVSSKSVNRAIIAE

LVRLYKESDLGRRLPAYDGRKSLYTAGELPFTWKEFSVKIVDEDDGIINGPKRERSYKVA 

IKFVARANMHHLGEFLAGKRADCPQEAVQILDIVLRELSVKRFCPVGRSFFSPDIKTPQR 

LGEGLESWCGFYQSIRPTQMGLSLNIDMASAAFIEPLPVIEFVAQLLGKDVLSKPLSDSD 

RVKIKKGLRGVKVEVTHRANVRRKYRVAGLTTQPTRELMFPVDENCTMKSVIEYFQEMYG 

FTIQHTHLPCLQVGNQKKASYLPMEACKIVEGQRYTKRLNEKQITALLKVTCQRAEGQRN 

DILRTVQHNAYDQDPYAKEFGMNISEKLASVEARILPAPWLKYHENGKEKDCLPQVGQWN 

MMNKKMINGMTVSRWACVNFSRSVQENVARGFCNELGQMCEVSGMEFNPEPVIPIYSARP 

DQVEKALKHVYHTSMNKTKGKELELLLAILPDNNGSLYGDLKRICETELGLISQCCLTKH 

VFKISKQYLADVSLKINVKMGGRNTVLVDAISCRIPLVSDIPTIIFGADVTHPENGEESS

PSIAAVVASQDWPEVTKYAGLVCAQAHRQELIQDLYKTWQDPVRGTVSGGMIRDLLISFR

KATGQKPLRIIFYRDGVSEGQFYQVLLYELDAIRKACASLEPNYQPPVTFIVVQKRHHTR 

LFANNHRDKNSTDRSGNILPGTVVDTKICHPTEFDFYLCSHAGIQGTSRPAHYHVLWDEN

NFTADGIQSLTNNLCYTYARCTRSVSIVPPAYYAHLAAFRARFYLEPEIMQDNGSPGKKN
TKTTTVGDVGVKPLPALKENVKRVMFYC*
>G1267 (152..967)

AAGTAGAGAATAATAATCACATC^^GATTGTTTATAACCCTCCCCNTAATCACCTTCTTA 

NTNACCACCCTCTCCGGCTCTCAACAGAAC71ACAAC7VAAAAAACAGCTT 

TTCCGGCGAAATCGGACGGTCGAGATCAATCATGCATCGTAGAGCAGCAATTCAAGAATC 

GGATGACGAAGAAGATGAGACTTACAACGACGTCGTTCCTGAATCTCCTTCTTCTTGTGA 

AGACTCAAAGATCTCAAAACCAACTCCAAAGAAAAGGAGGAACGTAGAGAAGAGAGTTGT

CTCAGTTCCGATAGCTGACGTGGAAGGATCTAAGAGCAGAGGCGAAGTATATCCACCGTC 

CGATTCATGGGCCTGGAGAAAGTACGGACAAAAACCGATCAAAGGCTCGCCTTATCCCAG 

GGGATATTACAGATGTAGTAGCTCAAAAGGATGTCCGGCGAGGAAGCAGGTGGAGAGAAG 

CCGTGTGGACCCTTCTAAGCTTATGATTACTTACGCCTGCGACCACAATCACCCTTTCCC 
TTCCTCCTCCGCTAACACO^TCC^ 

GAAAGAGGAAGAATACGAAGAGGAGGAAGAAGAACTAACCGTCACCGCCGCAGAGGAACC 
ACCGGCGGGACTTGATCTAAGCCACGTAGACTCACCGTTGCTATTAGGCGGCTGCTACAG 
CGAAATCGGAGAGTTCGGGTGGTTCTACGACGCGTCGATCTCATCATCATCTGGTTCTTC 
GAATTTCCTCGACGTAACTCTAGAGAGAGGTTTTTCAGTAGGCCAAGAGGAAGATGAGTC 
TTTGTTCGGTGATCTCGGTGATTTACCTGATTGCGCCTCCGTGTTCCGCCGTGGGACTGT 
TGCGACGGAGGAGCAACATCGAAGATGTGATTTTGGCGCCATTCCTTTCTGTGATAGTTC

TAGATGAGTTTGTGTGTGTAGCCAAAACCAAAAGAAAAAAACACAATTTTTTTATTTTCC 

ACTGTAAAGGTGTATCAATGGTGGATTCATTTTTTTAAAAAAAAAAAAAAAA 

>G1267 Amino Acid Sequence (domain in AA coordinates: 70-127) 

MHRRAAIQESDDEEDETYNDVVPESPSSCEDSKISKPTPKKRRNVEKRVVSVPIADVEGS

KSRGEVYPPSDSWAWRKYGQKPIKGSPYPRGYYRCSSSKGCPARKQVERSRVDPSKLMIT 

YACDHNHPFPSSSANTKSHHRSSVVLKTAKKEEEYEEEEEELTVTAAEEPPAGLDLSHVD 

SPLLLGGCYSEIGEFGWFYDASISSSSGSSNFLDVTLERGFSVGQEEDESLFGDLGDLPD

CASVFRRGTVATEEQHRRCDFGAIPFCDSSR* 

>G1269 (88..951)

AACAATTCTCTCTCTCTTTATTCTTCTTCTTCAGCTTCAGATTTCAGATCTTAAATCTTC 

AAGTCTTCTTCTTCTTCTTCTGCAACCATGGCTATGCAGGAACGTTGTGAGAGTTTATGT 

TCTGATGAACTTATATCTTCCTCAGATGCCTTTTACCTCAAGACAAGAAAGCCTTATACC 

ATCACTAAACAAAGAGAGAAATGGACAGAAGCAGAGCATGAGAAGTTTGTAGAAGCATTG 

AAACTCTATGGCAGAGCTTGGAGACGAATCGAAGAACATGTTGGAACAAAAACTGCAGTT 

CAGATTCGAAGCCATGCGCAGAAGTTCTTTACTAAGGTTGCTCGCGATTTTGGTGTTAGC 

TCTGAGTCCATTGAGATCCCGCCTCCAAGGCCAAAGAGAAAGCCGATGCATCCTTACCCT 

AGAAAGCTTGTGATTCCTGATGCAAAAGAGATGGTATACGCTGAACTAACCGGATCCAAG 

CTGATTCAGGATGAAGATAACCGATCTCCAACATCGGTTTTATCAGCTCATGGCTCAGAT 

GGATTAGGTTCCATTGGTTCAAATTCACCTAACTCTTCTTCAGCTGAGTTATCATCTCAC 

ACAGAGGAATCATTGTCTCTAGAAGCAGAGACCAAACAGAGCCTTAAGCTCTTTGGAAAA 

ACTTTTGTAGTTGGTGATTACAACTCTTCAATGAGTTGTGATGATTCTGAAGATGGCAAG 

AAGAAGCTATACTCAGAAACACAGTCTCTTCAATGTTCTTCTTCTACTTCAGAAAACGCT 

GAAACAGAAGTGGTAGTGTCGGAGTTCAAAAGAAGTGAGAGATCAGCTTTCTCTCAGTTA 

AAATCGTCGGTGACTGAGATGAACAACATGAGAGGGTTCATGCCTTACAAAAAGAGAGTA 

AAGGTGGAAGAAAACATTGACAATGTAAAATTATCATATCCTTTGTGGTGAAGTGTTCGT 

TTGTGTCAAGTCAGTTGTGTAAACTCTTTTGATCTCAACATCAGATTATGTGTATAATGT 

CAGAGTATTAGGGAAAGTTTTTTTGGATTAGATTCGTAAGATCACTCCAAAGTTTCGTGT 

CTTTCCATATAACCAGTTAGAAATTGAGATCCTTGTACTTAAACATTTTTATTTGATCAA 
TCAAATCTTCTTGATGAAAAAAAAAA 

>G1269 Amino Acid Sequence (domain in AA coordinates: 27-83) 

MAMQERCESLCSDELISSSDAFYLKTRKPYTITKQREKWTEAEHEKFVEALKLYGRAWRR 

IEEHVGTKTAVQIRSHAQKFFTKVARDFGVSSESIEIPPPRPKRKPMHPYPRKLVIPDAK 

EMVYAELTGSKLIQDEDNRSPTSVLSAHGSDGLGSIGSNSPNSSSAELSSHTEESLSLEA 

ETKQSLKLFGKTFVVGDYNSSMSCDDSEDGKKKLYSETQSLQCSSSTSENAETEVVVSEF

KRSERSAFSQLKSSVTEMNNMRGFMPYKKRVKVEENIDNVKLSYPLW*

>G1452 (175..1296)

ATTTATTAAGCATCAATGAGAGAACTTGAGAGCTGGGTTTGAGTTCTGTCCAATAATACA 
TAACCACGTTATCATTTTTGTCCTTTACTATCTCATT^ 

TTCTTAGAGTCATTACTCTCTATAGGGCTCGAGCGGCCGCCCGGGCAGGTTTCTATGCAG 
ATGGTTCACACTTCCCGCTCCATTGCCCAGATTGGGTTCGGTGTTAAGTCGCAATTAGTA
CTCACTATAGGGCTCGAGCGGCCGCCCGGGCAGGTAAAAGATCAAACAATGTCTAAAGAA 
GCTGAGATGTCGATCGCGGTGTCGGCTTTGTTCCCTGGTTTTAGATTCTCTCCTACTGAT 
GTTGAACTTATCTCGTACTATCTTCGTCGTAAAATCGATGGTGATGAGAACTCTGTTGCT
GTGATTGCTGAGGTCGAGATTTACAAGTTCGAGCCGTGGGACTTGCCAGAGGAATCGAAA 
CTGAAATCGGAGAACGAGTGGTTTTACTTCTGCGCGAGGGGGAGGAAGTACCCGCACGGG 
TCACAAAGCCGGCGAGCCACACAGCTAGGATATTGGAAAGCGACCGGTAAAGAGCGGAGT 
GTTAAATCCGGGAACCAAGTTGTTGGAACCAAGAGAACGCTTGTATTTCATATCGGTC^ 
GCTCCTCGTGGCGAGAGAACGGAGTGGATTATGCATGAATACTGCATCCATGGAGCCCCA 
CAGGATGCATTAGTGGTGTGCCGGTTAAGAAAAAATGCTGATTTTCGGGCTAGTTCGACC 
CAAAAAATTGAGGATGGTGTTGTGCAAGACGATGGCTACGTTGGCCAAAGAGGTGGTTTG 
GACAAGGAGGACAAATCCTACTATGAATCTGAGCATCAGATACCAAATGGTGACATCGCA 
GAAT(^TCAAATGTTGTTGAGGATCAGGCCGATACCX3ATGATGATTGTTACGCCGAGATO 
CTGAACGATGATATAATAAAGCTCGACGAAGAAGCGTTGAAAGCTAGCCAAGCGTTTCGA 
CCAACTAATCCAACTCATCAAGA7UVCAATATCAAGCGAGTCATCGAGTAAGAGGTCAAAA 
TGTGGTATAAAAAAAGAATCAACGGAAACAATGAA^ 

AACGTTGCCGGAACCGACTCCAGCTGGAGATTCCCGAACCCGTTCAAAATCAAGAAAGAT
GATAGCCAGAGATTGATGAAGAATGTTCTGGCCACTACTGTTTTCTTGGCTATCTTATTT

TCTTTCTTTTGGACTGTATTAATAGCTAGGAACTAAAGCTAGTTACGACATACATATTAT 

TTATACATAAATAAATATAGTATTTTGTCTATGGCAAAAAAAAAAAAAAAA 

>G1452 Amino Acid Sequence (domain in AA coordinates: 30-177) 

MQMVHTSRSIAQIGFGVKSQLVLTIGLERPPGQVKDQTMSKEAEMSIAVSALFPGFRFSP

TDVELISYYLRRKIDGDENSVAVIAEVEIYKFEPWDLPEESKLKSENEWFYFCARGRKYP 

HGSQSRRATQLGYWKATGKERSVKSGNQVVGTKRTLVFHIGRAPRGERTEWIMHEYCIHG

APQDALVVCRLRKNADFRASSTQKIEDGVVQDDGYVGQRGGLDKEDKSYYESEHQIPNGD

IAESSNVVEDQADTDDDCYAEILNDDIIKLDEEALKASQAFRPTNPTHQETISSESSSKR 

SKCGIKKESTETMNCYALFRIIQJVAGTDSSWRFPNPFKIKKDDSQRLMKKTVLATTVFLAI 
LFSFFWTVLIARN*

>G1494 (114..1406)

TCGACAGAGTTGTGTTGGGCGTGGAACTTGGACTAGTTCCACATATCAGGTTATATAGAT 
CTTCTCTTTCAACTTCTGATTCGTCCAGAAGCTTTCCTAATCTGAGATCTGACATGGAAC 
ACCAAGGTTGGAGTTTTGAGGAGAATTATAGTTTGTCCACTAATAGAAGATCTATCAGGC 
CACAAGATGAACTAGTGGAGTTATTATGGCGAGATGGACAAGTGGTTCTGCAGAGCCAAA 
CTCATAGAGAACAAACCCAAACCCAGAAACAAGATCATCATGAAGAAGCCCTAAGATCCA 
GCACCTTTCTTGAAGATCAAGAAACTGTCTCTTGGATCCAATACCCTCCAGATGAAGACC 
CATTCGAACCCGACGACTTCTCCTCCCACTTCTTCTCAACCATGGATCCCCTCCAGAGAC 
CAACCTCAGAGACGGTTAAGCCTAAGTCCAGTCCTGAACCTCCTCAAGTCATGGTTAAGC 
CTAAGGCCTGTCCTGACCCTCCTCCTCAAGTCATGCCTCCTCCAAAATTTAGGTTAACAA 
ATTCATCATCGGGGATTAGGGAAACAGAAATGGAACAGTACTCGGTAACGACCGTTGGAC 
CTAGCCATTGCGGAAGCAACCCATCACAGAACGATCTCGATGTCTCAATGAGTCATGATC 
GAAGCAAAAACATAGAAGAAAAGCTTAATCCGAACGCAAGTTCCTCATCAGGTGGCTCCT 
CTGGTTGCAGCTTTGGCAAAGATATCAAAGAAATGGCTAGTGGAAGATGCATCACAACCG
ACCGTAAGAGAAAACGTATAAATCACACTGACGAATCTGTATCTCTATCAGATGCAATCG 
GTAACAAGTCGAACCAACGATCAGGATCAAACCGAAGGAGTCGAGCAGCTGAAGTTCATA 
ATCTCTCCGAAAGGAGGAGGAGAGATAGGATCAATGAGAGAATGAAGGCTTTGCAAGAAC 
TAATACCTCACTGCAGTAAAACTGATAAAGCTTCGATTTTAGACGAAGCCATAGATTATT 
TGAAATCACTTCAGTTACAGCTTCAAGTGATGTGGATGGGGAGTGGAATGGCGGCGGCGG 
CGGCTTCGGCTCCGATGATGTTCCCCGGAGTTCAACCTCAGCAGTTCATACGTCAGATAC 
AGAGCCCGGTACAGTTACCXCGATTTCCGGTTATGGATCAGTCTGCAATTCAGAACAATC 
CCGGTTTAGTTTGCCAAAACCCGGTACAAAACCAGATCATCTCCGACCGGTTTGCTAGAT 
ACATCGGTGGGTTCCCACACATGCAGGCCGCGACTCAGATGCAGCCGATGGAGATGTTGA 
GATTTAGTTCACCGGCGGGACAGCAAAGTCAACAACCGTCGTCTGTGCCGACGAAGACCA 
CCGACGGTTCTCGTTTGGACCACTAGGTTGGTGAGCCACTTTGC 

>G1494 Amino Acid Sequence (domain in aa coordinates: 261-311)

MEHQGWSFEENYSLSTNRRSIRPQDELVELLWRDGQVVLQSQTHREQTQTQKQDHHEEAL 

RSSTFLEDQETVSWIQYPPDEDPFEPDDFSSHFFSTMDPLQRPTSETVKPKSSPEPPQVM 

VKPKACPDPPPQVMPPPKFRLTNSSSGIRETEMEQYSVTTVGPSHCGSNPSQNDLDVSMS 

HDRSKNIEEKLNPNASSSSGGSSGCSFGKDIKEMASGRCITTDRKRKRINHTDESVSLSD 

AIGNKSNQRSGSNRRSRAAEVHNLSERRRRDRINERMKALQELIPHCSKTDKASILDEAI 

DYLKSLQLQLQVMWMGSGMAAAAASAPMMFPGVQPQQFIRQIQSPVQLPRFPVMDQSAIQ 

NNPGLVCQNPVQNQIISDRFARYIGGFPHMQAATQMQPMEMLRFSSPAGQQSQQPSSVPT 
KTTDGSRLDH*

>G1548 (1..2511)

ATGGCAATGTCTTGCAAGGATGGTAAGTTGGGATGTTTGGATAATGGGAAGTATGTGAGG 
TATACACCTGAACAAGTTGAAGCACTTGAGAGGCTTTATCATGACTGTCCTAAACCGAGT 
TCTATTCGCCGTCAGCAGTTGATCAGAGAGTGTCCTATTCTCTCTAACATTGAGCCTAAA 
CAGATC^GTGTGGTTTCAGAACCGAAGATGTAGAGAGAAACAAAGGAAAGAGGCTTCA 
CGGCTTCAAGCTGTGAATCGGAAGTTGACGGCAATGAACAAGCTCTTGATGGAGGAGAAT 
GAC^GGTTGCAGAAGCAAGTGTCACAGCTGGTCCATGAAAACAGCTACTTCCGTCAACAT 
ACTCCAAATCCTTCACTCCCAGCTAAAGACACAAGCTGTGAATCGGTGGTGACGAGTGGT 
CAGCACCAATTGGCATCTCAAAATCCTCAGAGAGATGCTAGTCCTGCAGGACTTTTGTCC 
ATTGCAGAAGAAACTTTAGCAGAGTTTCTTTCAAAGGCAACTGGAACCGCTGTTGAGTGG 
GTTCAGATGCCTGGAATGAAGCCTGGTCCGGATTCCATTGGAATCATCGCTATTTCTCAT 
GGTTGCACTGGTGTGGCAGCACGCGCCTGTGGCCTAGTGGGTCTTGAGCCTACAAGGGTT 



GCAGAGATTGTCAAGGATCGTCCTTCGTGGTTCCGCGAATGTCGAGCTGTTGAAGTTATG 

AACGTGTTGCCAACTGCCAATGGTGGAACCGTTGAGCTGCTTTATATGCAGCTCTATGCA 

CCAACTACATTGGCCCCACCACGCGATTTCTGGCTGTTACGTTACACCTCTGTTTTAGAA 

GATGGCAGCCTTGTGGTGTGCGAGAGATCTCTTAAGAGCACTCAAAATGGTCCTAGTATG 

CCACTGGTTCAGAATTTTGTGAGAGCAGAGATGCTTTCCAGTGGGTACTTGATACGGCCT 

TGTGATGGTGGTGGCTCAATCATACACATAGTGGATCATATGGATTTGGAGGCTTGTAGC 

GTGCCTGAGGTCTTGCGCCCGCTCTATGAGTCACCCAAAGTACTTGCACAGAAGACAACA 

ATGGCGGCACTGCGTCAGCTCAAGCAAATAGCTCAGGAGGTTACTCAGACTAATAGTAGT 

GTTAATGGGTGGGGACGGCGTCCTGCTGCCTTAAGAGCTCTCAGCCAGAGGCTAAGCAGA 

GGCTTCAATGAAGCTGTAAATGGTTTCACTGATGAAGGATGGTCAGTGATAGGAGATAGC 

ATGGATGATGTCACAATCACTGTAAACTCTTCTCCAGACAAGCTAATGGGTCTAAATCTT 

ACATTTGCCAATGGCTTTGCTCCTGTAAGCAATGTTGTTTTATGCGCAAAAGCATCAATG 

CTTTTACAGAATGTTCCTCCGGCGATCCTGCTTCGGTTTCTGAGGGAGCATAGGTCAGAA 

TGGGCTGACAACAACATTGATGCGTATCTAGCAGCAGCAGTTAAAGTAGGGCCTTGTAGT 

GCCCGAGTTGGAGGATTTGGAGGGCAGGTTATACTTCCACTTGCTCATACTATTGAGCAT 

GAAGAGTTTATGGAAGTCATCAAATTGGAAGGTCTTGGTCATTCCCCTGAAGATGCAATC 

GTTCCAAGAGATATCTTCCTTCTTCAACTTTGTAGCGGAATGGATGAAAATGCTGTAGGA

ACCTGTGCGGAACTTATATTTGCTCCAATCGATGCTTCGTTTGCGGATGATGCACCTCTG 

CTTCCTTCTGGTTTTCGTATTATCCCTCTTGATTCCGCAAAGGAAGTATCTAGCCCAAAC 

CGAACCTTGGATCTTGCTTCGGCACTGGAAATTGGTTCAGCTGGAACAAAAGCCTCAACT 

GATCAATCAGGAAACTCCACATGTGCAAGATCTGTGATGACAATAGCATTTGAGTTTGGT 

ATCGAGAGCCATATGCAAGAACATGTAGCATCCATGGCTAGGCAGTATGTTCGAGGTATC 

ATATCATCGGTGCAGAGAGTAGCATTGGCTCTTTCTCCTTCTCATATCAGCTCACAAGTT 

GGTCTACGCACTCCTTTGGGTACTCCTGAAGCCCAAACACTTGCTCGTTGGATTTGCCAG 

AGTTACAGGGGCTACATGGGTGTTGAGCTACTTAAATCAAACAGTGACGGCAATGAATCT 

ATTCTTAAGAATCTTTGGCATCACACTGATGCTATAATCTGCTGCTCAATGAAGGCCTTG 

CCCGTCTTCACATTTGCAAACCAGGCGGGACTTGACATGCTGGAGACTACATTAGTTGCT 

CTTCAAGACATCTCTTTAGAGAAGATATTTGATGACAATGGAAGAAAGACTCTTTGCTCT 

GAGTTCCCACAGATCATGCAACAGGGCTTCGCGTGCCTTCAAGGCGGGATATGTCTCTCA 

AGCATGGGGAGACCAGTTTCGTATGAGAGAGCAGTTGCTTGGAAAGTACTCAATGAAGAA 

GAAAATGCTCATTGCATCTGCTTTGTGTTCATCAATTGGTCCTTTGTGTGA 

>G1548 Amino Acid Sequence (domain in AA coordinates: 17-77) 

MAMSCKDGKLGCLDNGKYVRYTPEQVEALERLYHDCPKPSSIRRQQLIRECPILSNIEPK 

QI KVWFQKTRRCREKQRKEASRLQAViraKLTAIW QLVHENS YFRQH 

TPNPSLPAKDTSCESVVTSGQHQLASQNPQRDASPAGLLSIAEETLAEFLSKATGTAVEW

VQMPGMKPGPDSIGIIAISHGCTGVAARACGLVGLEPTRVAEIVKDRPSWFRECRAVEVM 

NVLPTANGGTVELLYMQLYAPTTLAPPRDFWLLRYTSVLEDGSLVVCERSLKSTQNGPSM

PLVQNFVRAEMLSSGYLIRPCDGGGSIIHIVDHMDLEACSVPEVLRPLYESPKVLAQKTT

MAALRQLKQIAQEVTQTNSSVNGWGRRPAALRALSQRLSRGFNEAVNGFTDEGWSVIGDS

MDDVTITVNSSPDKLMGLNLTFANGFAPVSNVVLCAKASMLLQNVPPAILLRFLREHRSE

WADNNIDAYLAAAVKVGPCSARVGGFGGQVILPLAHTIEHEEFMEVIKLEGLGHSPEDAI
VPRDIFLLQLCSGMDENAVGTCAELIFAPIDASFADDAPLLPSGFRIIPLDSAKEVSSPN
RTLDLASALEIGSAGTKASTDQSGNSTCARSVMTIAFEFGIESHMQEHVASMARQYVRGI

ISSVQRVALALSPSHISSQVGLRTPLGTPEAQTLARWICQSYRGYMGVELLKSNSDGNES
ILKNLWHHTDAIICCSMKALPVFTFANQAGLDMLETTLVALQDISLEKIFDDNGRKTLCS
EFPQIMQQGFACLQGGICLSSMGRPVSYERAVAWKVLNEEENAHCICFVFINWSFV* 

>G1574 (1..1962) 

ATGGATGATACAATGGACATGAGTTCAGGTAGTGATGAAGAAGTACAAGAAGAGAAGACC

ACTGTTAACGAGAGGGTCATCTATCAGGCTGCATTACAAGATCTGAAGCAACCCAAGACC 

GAAAAGGATCTACCTCCTGGTGTTCTTACAGTTCCTCTTATGAGGCATCAGAAAATTGCA 

TTGAACTGGATGCGTAAGAAAGAAAAAAGAAGCAGGCACTGTTTGGGAGGGATATTAGCA 

GATGATCAGGGACTTGGTAAAACGATCTCGACGATCTCTCTTATCCTGTTACAAAAGTTG 

AAGTCA(^TCAAAGCAGAGAAAGCGAAAAGGTCAAAACTCTGGTGGTACATTGATTGTT

TGTCCAGCAAGTGTTGTAAAACAATGGGCAAGAGAAGTT^ 

CACAAACTCTCTGTTTTAGTCCACC^TGGA^ 

GGAATATATGATGTGGTCATGACAACTTACGCCATTC 

CCTATGCTGAATCGTTATGATAGTATGAGAGGCAGAGAAAGCCTTGACGGATCGAGTTTG 
ATTCAGCCTCACGTTGGTGCACTAGGAAGAGTTAGGTGGTTGAGAGTAGTATTAGATGAA
GCTCATACAATTAAAAACCATAGAACCCTAATTGCAAAAGCTTGTTTTAGCCTTAGAGCC 
AAAAGGAGATGGTGTTTGACTGGAACGCCGATTAAGAACAAAGTAGACGATCTTTATAGC
TATTTCAGATTTCTTAGATATCATCCATATGCCATGTGCAATTCATTTCACCAAAGAATC 
AAAGCTCCAATTGATAAAAAGCCTCTTCATGGTTACAAGAAGCTTCAAGCTATTCTAAGG 
GGTATAATGTTGCGCCGCACCAAAGAATGGTCTTTCTACAGGAAGCTTGAATTGAATTCA 
CGTTGGAAGTTTGAGGAATATGCTGCTGATGGGACTTTGCATGAACACATGGCTTATCTT 
TTGGTGATGCTTTTGCGACTACGCCAAGCTTGTAACCATCCACAACTTGTTAACGGATAT 
AGTCACTCAGATACTACAAGAAAAATGTCAGATGGAGTTCGAGTAGCCCCTAGAGAGAAT 
CTAATCATGTTCCTCGATCTCTTGAAATTATCCTCAACCACCTGCTCTGTTTGTAGTGAT 
CCACCAAAAGACCCTGTTGTTACTTTGTGTGGCCATGTGTTTTGTTATGAGTGTGTGTCT 
GTAAACATTAACGGGGATAACAATACGTGCCCTGCACTTAATTGCCACAGCCAGCTTAAA 
CATGATGTTGTTTTCACTGAATCTGCAGTTAGAAGTTGCATCAACGATTATGATGATCCT 
GAAGATAAAAATGCTTTAGTTGCATCAAGGCGAGTTTATTTCATCGAAAATCCGAGCTGT 
GATAGAGATTCTTCAGTCGCTTGCAGAGCAAGGCAGTCCAGACACTCCACCAATAAAGAC 
AATAGTATCAGTGGACTGAATCTCATTTTTACGTTTCTCAAAGACAAATGTAATGATTAT 
GAAACAGGTGCGATGTTGATGTCTCTTAAAGGTGGAAACCTTGGATTGAATATGGTAGCT 
GCAAGTCATGTCATTCTACTGGACCTATGGTGGAATCCAACAACAGAGGATCAAGCTATT 
GATCGAGCTCATCGTATCGGACAAACTCGAGCTGTTACGGTCACTCGTATTGCCATCAAA
AATACCGTTGAGGAACGAATTTTGACTCTTCATGAACGTAAAAGGAACATTGTTGCATCT 
GCATTGGGTGAAAAAAACTGGCAAAAGTTCTGCGATTCAACTAACACTAGAAGATCTGGA 
ATATCTGTTTTTTGGTGTGTAGAATATCCCAGAGTTTTTATTGATAAGAGGAATAAAACC 
TTTAGCTATTTAATAAGTCACAAGTGTGAATGTAATGAATAA 

>G1574 Amino Acid Sequence (domain in AA coordinates: 28-350) 

MDDTMDMSSGSDEEVQEEKTTVNERVIYQAALQDLKQPKTEKDLPPGVLTVPLMRHQKIA 

LNWMRKKEKRSRHCLGGILADDQGLGKTISTISLILLQKLKSQSKQRKRKGQNSGGTLIV

CPASWKQWAREVKECTSDEHKLSVLVHHGSHRTI<I)PTEIAiyDVVMTTYAIVTNEVPQN 

PMLNRYDSMRGRESLDGSSLIQPHVGALGRVRWLRVVLDEAHTIKNHRTLIAKACFSLRA

KRRWCLTGTPIKNKVDDLYSYFRFLRYHPYAMCNSFHQRIKAPIDKKPLHGYKKLQAILR

GIMLRRTKEWSFYRKLELNSRWKFEEYAADGTLHEHMAYLLVMLLRLRQACNHPQLVNGY 
SHSDTTRKMSDGVRVAPRENLIMFLDLLKLSSTTCSVCSDPPKDPVVTLCGHVFCYECVS 
VNINGDNNTCPALNCHSQLKHDVVFTESAVRSCINDYDDPEDKNALVASRRVYFIENPSC
DRDSSVACRARQSRHSTNKDNSISGLNLIFTFLKDKCNDYETGAMLMSLKAGNLGLNMVA
ASHVILLDLWWNPTTEDQAIDRAHRIGQTRAVTVTRIAIKNTVEERILTLHERKRNIVAS

ALGEKNWQKFCDSTNTRRSRISVFWCVEYPRVFIDKRNKTFSYLISHKCECNE* 
>G1586 (1..807) 

ATGAATCAAGAAGGTGCTTCACATAGCCCATCCTCCACTTCCACCGAACCAGTCCGGGCA 

CGTTGGTCACCTAAACCGGAGCAAATCTTGATACTCGAATCCATCTTCAACAGTGGTACT 

GTTAACCCACCAAAAGATGAAACGGTGAGGATAAGAAAGATGCTTGAGAAATTCGGTGCT 

GTGGGAGACGCAAACGTCTTCTACTGGTTTCAAAACCGACGGTCAAGATCTCGCCGGAGA 

CACCGGCAGCTTTTAGCAGCCACCACCGCAGCCGCCACCTCCATAGGAGCTGAAGACCAC 

CAGCACATGACGGCCATGAGCATGCATCAATATCCTTGCAGCAACAACGA^ 

GGGTTTGGAAGTTGTAGCAACTTATCAGCTAATTACTTCCTTAATGGATCGTCGTCATCT 

CAAATCCCTTCCTTTTTCCTCGGCCTCTCTTCTTCAAGTGGTGGGTGTGAGAACAACAAT

GGTATGGAGAATCTCTTCAA7y\TGTATGGCCATGAATCTGATCATAATCATCAGCAGCAG 

CATCATAGCTCAAATGCTGCATCAGTTTTAAACCCATCTGATCAAAACTCC^^ 

TACGAACAAGAAGGGTTTATGACGGTGTTTATAAACGGAGTTCCTATGGAAGTAACAAAA 

GGAGCAATAGACATGAAAACAATGTTCGGTGATGATTCGGTGTTACTTCATTCCTCTGGT

CTTCCTCTTCCCACTGATGAGTTTGGTTTCTTGATGCATTCTTTACAA^ 

TATTTCCTGGTACCGAGACAGACATGA 

>G1586 Amino Acid Sequence (domain in AA coordinates: 21-81) 

MNQEGASHSPSSTSTEPVRARWSPKPEQILILESIFNSGTVNPPKDETVRIRKMLEKFGA 

VGDANVFYWFQNRRSRSRRRHRQLLAATTAAATSIGAEDHQHMTAMSMHQYPCSNNEIDL

GFGSCSNLSANYFLNGSSSSQIPSFFLGLSSSSGGCENNNGMENLFKMYGHESDHNHQQQ

HHSSNAASVLNPSDQNSNSQYEQEGFMTVFINGVPMEVTKGAIDMKTMFGDDSVLLHSSG

LPLPTDEFGFLMHSLQHGQTYFLVPRQT* 

>G1786 (1..1170) 
ATGATCGTGTACGGTGGGGGAGCATCCGAGGACGGTGAAGGTGGAGGGGTGGTTCTGAAG
AAAGGGCCATGGACGGTGGCCGAGGACGAGACACTGGCGGCTTACGTACGGGAATACGGT 
GAAGGGAACTGGAATTCTGTTCAGAAGAAGACATGGCTGGCTAGGTGTGGCAAGAGCTGC
CGCCTCCGCTGGGCTAACCACTTACGACCTAATCTCAGGAAAGGCTCCTTCACCCCCGAG 
GAAGAACGTCTCATCATACAACTCCACTCTCAGCTAGGCAACAAATGGGCTCGCATGGCT 
GCTCAGTTACCAGGCAGAACAGATAACGAGATCAAGAACTACTGGAACACGAGGTTGAAA 
CGCTTCCAACGCCAAGGCCTCCCTCTCTACCCTCCAGAATATTCCCAAAACAATCATCAA 
CAACAAATGTATCCTCAACAGCCCTCCTCACCTCTCCCGTCCCAAACACCTGCTTCTTCC 
TTTACCTTTCCTCTCCTCCAACCGCCTTCTCTGTGTCCCAAACGTTGTTATAACACTGCC 
TTCTCTCCCAAGGCCTCATATATTTCTTCTCCAACCAATTTCCTTGTCTCGTCTCCGACC 
TTTCTTCACACCCATTCCTCTCTTTCCTCCTATCAGTCTACCAATCCGGTTTACTCCATG 
AAACATGAGCTCTCTTCAAACCAAATTCCATACTCTGCCTCTTTAGGAGTCTATCAAGTA 
AGCAAGTTCTCAGACAATGGGGATTGTAACCAAAACCTGAACACCGGTTTGCATACAAAT 
ACCTGTCAGCTGTTAGAGGATCTTATGGAGGAGGCCGAGGCTCTAGCTGATAGCTTTCGT 
GCTCCTAAGCGGAGACAAATCATGGCTGCGCTTGAGGACAACAACAACAACAACAACTTT 
TTCTCGGGAGGTTTCGGACATCGTGTTTCTTCCAACAGTCTATGTTCCTTGCAAGGTTTA 
ACACCAAAGGAAGATGAGTCTCTCCAGATGAACACAATGCAAGATGAGGACATAACAAAG 
CTTCTTGACTGGGGAAGTGAAAGTGAAGAAATCTCAAACGGGCAATCCTCTGTGAT/y\CA 
ACAGAGAACAACCTTGTCCTTGACGATCACCAGTTCGCTTTTCTGTTTCCAGTTGATGAT 
GACACCAACAACTTGCCAGGGATCTGCTAG 

>G1786 Amino Acid Sequence (domain in AA coordinates: TBD) 
MIVYGGGASEDGEGGGVVLKKGPWTVAEDETLAAYVREYGEGNWNSVQKKTWLARCGKSC

RLRWANHLRPNLRKGSFTPEEERLIIQLHSQLGNKWARMAAQLPGRTDNEIKNYWNTRLK

RFQRQGLPLYPPEYSQNNHQQQMYPQQPSSPLPSQTPASSFTFPLLQPPSLCPKRCYNTA 

FSPKASYISSPTNFLVSSPTFLHTHSSLSSYQSTNPVYSMKHELSSNQIPYSASLGVYQV 

SKFSDNGDCNQNLNTGLHTNTCQLLEDLMEEAEALADSFRAPKRRQIMAALEDNNNNNNF

FSGGFGHRVSSNSLCSLQGLTPKEDESLQMNTMQDEDITKLLDWGSESEEISNGQSSVIT 

TENNLVLDDHQFAFLFPVDDDTNNLPGIC* 

>G1792 (77..496)

AATCCATAGATCTCTTATTAAATAACAGTGCTGACCAAGCTCTTACAAAGCAAACCAATC 
TAGAACACCAAAGTTAATGGAGAGCTCAAACAGGAGCAGCAACAACCAATCACAAGATGA 
CAAGCAAGCTCGTTTCCGGGGAGTTCGAAGAAGGCCTTGGGGAAAGTTTGCAGCAGAGAT
TCGAGACCCGTCGAGAAACGGTGCCCGTCTTTGGCTCGGGACATTTGAGACCGCTGAGGA 
GGCAGCAAGGGCTTATGACCGAGCAGCCTTTAACCTTAGGGGTCATCTCGCTATACTCAA 
CTTCCCTAATGAGTATTATCCACGTATGGACGACTACTCGCTTCGCCCTCCTTATGCTTC 
TTCTTCTTCGTCGTCGTCATCGGGTTCAACTTCTACTAATGTGAGTCGACAAAACCAAAG 
AGAAGTTTTCGAGTTTGAGTATTTGGACGATAAGGTTCTTGAAGAACTTCTTGATTCAGA 
AGAAAGGAAGAGATAATCACGATTAGTTTTGTTTTGATATTTTATGTGGCACTGTTGTGG 
CTACCTACGTGCATTATGTGCATGTATAGGTCGCTTGATTAGTACTTTATAACATGCATG 
CCACGACCATAAATTGTAAGAGAAGACGTACTTTGCGTTTTCATGAAATATGAATGTTAG 
ATGGTTTGAGTACAAAAAAAAAAAAAAAAAAAAAAA 

>G1792 Amino Acid Sequence (domain in aa coordinates: 17-85) 

MESSNRSSNNQSQDDKQARFRGVRRRPWGKFAAEIRDPSRNGARLWLGTFETAEEAARAY 

DRAAFNLRGHLAILNFPNEYYPRMDDYSLRPPYASSSSSSSSGSTSTNVSRQNQREVFEF 

EYLDDKVLEELLDSEERKR* 

>G1865 (48..899)

AAGAAGAGGACATGAAGCACAGAGATTCTGCAGACTGCAGGTGACCAATGGACACTTTAT 
CAATAAAAACATACCTACTACTCTCTTACACT 

TTAATCTCTCTTTCTTCTTCATCTCTCTTTCTCTTTCTCTCTTCATGGCTACAAGGATTC 
CATTCACAGAATCACAATGGGAAGAACTTGAAAACCAAGCTCTTGTGTTCAAGTACTTAG 
CTGCAAATATGCCTGTTCCACCTCATCTTCTCTTCCTCATCAAAAGACCCTTTCTCT 
CTTCTTCTTCITCTTCATCTTCTTCTC 

TTGGGTGGAATGTGTATGAGATGGGAATGGGAAGAAAGATAGATGCAGAGCCAGGAAGAT 
GTAGAAGAACTGATGGCAAGAAATGGAGATGCTCTAAAGAAGCTTACCCTGACTCTAAGT 
ACTGTGAGAGACATATGCATAGAGGCAAGAACCGTTCTTCCTCAAGAAAGCCTCCTCCTA 
CTCAATTCACTCC7Uy\TCTCTTTCTCGACTCTTCTTCCAG7VAGAAGAAGAAGTGGATA^ 
TGGATGATTTCTTCTCCATAGAACCTTCCGGGTCAATC^^ 
TGGAAGATAATGATGATGGCTCATGTAGAGGCATCAACAACGAGGAGAAGCAGCCGGATC

GACATTGCTTCATCCTTGGTACTGACTTGAGGACACGTGAGAGGCCATTGATGTTAGAGG 

AGAAGCTGAAACAAAGAGATCATGATAATGAAGAAGAGCAAGGAAGCAAGAGGTTTTATA 

GGTTTCTTGATGAATGGCCTTCTTCTAAATCTTCTGTTTCTACTTCACTCTTCATTTGAT 

CATCTTTTGTTCTTATAACCTTGTATTTCTTGTTAAGATGGTAATGCAAATT 

>G1865 Amino Acid Sequence (domain in AA coordinates: 124-149)

MDTLSIKTYLLLSYTFNFPIQIPIPNLSFFFISLSLSLFMATRIPFTESQWEELENQALV 

FKYLAANMPVPPHLLFLIKRPFLFSSSSSSSSSSSFFSPTLSPHFGWNVYEMGMGRKIDA 

EPGRCRRTDGKKWRCSKEAYPDSKYCERHMHRGKNRSSSRKPPPTQFTPNLFLDSSSRRR 

RSGYMDDFFSIEPSGSIKSCSGSAMEDNDDGSCRGINNEEKQPDRHCFILGTDLRTRERP 

LMLEEKLKQRDHDNEEEQGSKRFYRFLDEWPSSKSSVSTSLFI* 

>G1886 (43..909)

AGGAAACATAAGTAATCGTTGCTTCGATCCTTTGTTACATGGATGGATCCTGAACAGGAA 

ATCTCAAACGAGACTTTGGAAACTATATTGGTAAGTTCAACAAAAGGAAGCAATAATAAC 

AATAAGAAAATGGAAGAAGAAATGAAGAAGAAAGTATCAAGAGGAGAATTAGGAGGTGAA 

GCTCAAAATTGTCCAAGATGTGAATCTCCAAACACAAAGTTTTGTTACTACAACAACTAT 

AGTCTCTCACAACCTCGTTACTTCTGCAAATCTTGTCGGAGATATTGGACTAAAGGCGGT 

ACTCTTCGTAACGTTCCCGTCGGTGGTGGTTGCCGTCGAAACAAACGATCCTCTTCCTCA 

GCTTTCTCCAAGAACAACAACAATAAGTCTATTAATTTCCATACTGATCCACTTCAGAAC 

CCTTTAATTACGGGAATGCCACCATCATCTTTTGGTTATGATCACTCCATTGATCTCAAC 

CTCGCTTTCGCTACTCTCCAAAAGCATCATTTATCCTCTCAAGCTACTACGCCTTCTTTT 

GGGTTTGGAGGTGATCTTTCTATTTATGGAAACTCAACGAATGATGTAGGGATCTTCGGA 

GGGCAAAACGGTACTTATAACAATAGTTTGTGTTATGGGTTTATGTCCGGAAATGGTAAT 

AATAATCAAAATGAAATCAAGATGGCTTCTACATTGGGGATGTCTTTGGAAGGAAACGAG 

AGAAAGCAAGAGAATGTGAACAATAACAATAATAACTCAGAGAATCCTAGCAAGGTGTTC 

TGGGGGTTTCCATGGCAGATGACCGGAGATTCCGCCGGAGTTGTACCGGAGATTGATCCC 

GGAAGGGAAAGCTGGAATGGGATGGTTTCATCTTGGAATAATGGTTTACTCAACACTCCT 
TTGGTCTAGCAGATCATTAA 

>G1886 Amino Acid Sequence (domain in aa coordinates: 17-59) 
MDPEQEISNETLETILVSSTKGSNNNNKKMEEEMKKKVSRGELGGEAQNCPRCESPNTKF
CYYNNYSLSQPRYFCKSCRRYWTKGGTLRNVPVGGGCRRNKRSSSSAFSKNNNNKSINFH
TDPLQNPLITGMPPSSFGYDHSIDLNLAFATLQKHHLSSQATTPSFGFGGDLSIYGNSTN
DVGIFGGQNGTYNNSLCYGFMSGNGNNNQNEIKMASTLGMSLEGNERKQENVNNNNNNSE
NPSKVFWGFPWQMTGDSAGVVPEIDPGRESWNGMVSSWNNGLLNTPLV*
>G1933 (33..1418)

AATTCA^TTAAAGTAATTTATCTTTCAGAAAATGGCGGTTGAAGACGATGTATCTTTGA 
TAAGAACGACGACGTTAGTGGCACCAACAAGACCCACGATTACAGTTCCTCATAGACCTC 
CGGCGATCGAAACGGCGGCGTATTTCTTTGGCGGTGGAGATGGGCTTAGTCTAAGCCCAG 
GGCCACTTTCTTTTGTCTCTTCTTTGTTTGTTGATAACTTCCCTGACGTCTTGACGCCGG
ATAACCAACGGACGACGTCGTTTACTCAGCTTCTTAACGGAACTATGTCGGTGTCTCCTG
GTGGCGGAGGACGTTCAACGGCGGGGATGTTCGCCGGAGGAGGTCCGATGTTTACAATCC 
CTTCTGGTTTCAGCCCTTCTAGTCTTCTCACCTCGCCCATGTTCTTTCCCCCGCAGTCGT 
^™^ CGGCmATO ^ C ^ CGGCAGCAGT ™ CMCCGC ^ C ^CAACGACCAG 
ACACGTTTCCTCACCATATGCCACCATCGACATCCGTCGCCGTCCATGGTCGTCAATCTT 

ATAATAACCGGTCGTATAACGTTGTGAACGTTGATAAACCGGCGGATGACGGTTATAACT 
^ A ^ AAGTACGGACAA ^ GCCTATCAAAGGGTGTG ^ 

gtacacatgttaactotccggtgaagaagaaagtcgaacggtc^tcggatggacagatS 

SSSp^ acaaaggtcaacatgatcacg ^ 

™ GG Jji AGAG , A ^T GA ^ 

cggcttcaaagataagaagaatagacggtgtotcgacgactcaccggacggtgaccgagc 

CGGCAGCAACCGCAGCTGCGGTGGGGCCGTCTGACCACCATCGTATGAGATCAATGTCGG 
GGAACAATATGCAACAACATATGAGTTTCGGTAACAATAATAACACAGGCCAATCTCCGG
TTCTTTTGAGGTTGAAAGAAGAGAAAATCACAATTTGACTTTTAAGAACCAAAGATTTCG 
AGATTGATATT 

>G1933 Amino Acid Sequence (conserved domain in AA coordinates: 205-263, 344-404)

MAVEDDVSLIRTTTLVAPTRPTITVPHRPPAIETAAYFFGGGDGLSLSPGPLSFVSSLFV 

DNFPDVLTPDNQRTTSFTQLLNGTMSVSPGGGGRSTAGMFAGGGPMFTIPSGFSPSSLLT 

SPMFFPPQSSAHTGFIQPRQQSQPQPQRPDTFPHHMPPSTSVAVHGRQSLDVSQVDQRAR 

NHYKNPGNNNNNRSYNVTOVDKPADDGTO 

VERSSDGQITQIIYKGQHDHERPQNRRGGGGRDSTEVGGAGQMMESSDDSGYRKDHDDDD 
DDDEDDEDLPASKIRRIDGVSTTHRTVTEPKIIVQTKSEVDLLDDGYRWRKYGQKVVKGN
PHPRSYYKCTTPNCTVRKHVERASTDAKAVITTYEGKHNHDVPAARNGTAAATAAAVGPS
DHHRMRSMSGNNMQQHMSFGNNNNTGQSPVLLRLKEEKITI*
>G2059 (58..1089)

TTAAGAACAGGCTTCATTCTCTGGACAAACACTCAAAAAACAAACAAAAAAAGGAACATG 
GAAGATCAGTTTCCTAAAATAGAAACTAGCTTCATGCACGACAAGCTCTTGTCTTCTGGA 
ATCTACGGGTTCTTGAGTTCTTCGACGCCGCCACAACTTCTCGGTGTTCCAATATTTTTG 
GAAGGTATGAAATCTCCTCTTCTTCCTGCTTCTTCGACTCCGAGCTACTTTGTGTCGCCT 
CATGATCATGAGCTCACATCTTCTATTCATCCATCTCCGGTAGCTTCTGTTCCTTGGAAC 
TTTCTAGAATCTTTTCCTCAGTCTCAACATCCTGATCATCATCCTTCTAAACCTCCAAAC
CTTACTTTGTTCCTTAAAGAACCAAAGCTACTAGAACTTTCTCAATCCGAAAGCAACATG
AGCCCTTACCATAAATACATCCCAAACTCCTTTTATCAATCAGACCAAAACAGAAACGAA
TGGGTAGAGATCAATAAAACTCTAACCAACTATCCCTCGAAAGGTTTTGGAAACTATTGG
CTAAGTACCACCAAGACTCAACCCATGAAGTCAAAAACAAGAAAGGTTGTTCAGACGACG 
ACCCCAACAAAACTGTATAGAGGAGTGAGACAAAGACACTGGGGCAAATGGGTCGCAGAG 
ATTAGGCTTCCAAGGAACAGAACCCGTGTTTGGCTCGGCACTTTTGAAACCGCTGAGCAA 
GCAGCAATGGCTTACGATACAGCAGCTTATATCCTTCGTGGCGAATTCGCACACCTCAAC 
TTTCCTGATCTTAAACACCAGCTCAAGTCCGGTTCTTTGCGATGCATGATCGCCTCACTT 
TTGGAGTCCAAGATTCAACAGATCTCATCTTCCCAAGTAAGTAACTCTCCTTCTCCTCCT 
CCTCCAAAAGTGGGAACACCGGAGCAAAAGAATCATCACATGAAGATGGAGTCAGGAGAA 
GACGTGATGATGAAGAAACAGAAAAGCCATAAGGAAGTGATGGAAGGAGATGGTGTACAA 
TTGAGTAGGATGCCTTCTTTGGATATGGATCTCATTTGGGATGCTCTCTCATTTCCTCAT 
TCTTCTTGACTTCAAATTAATATTTGTCAAACTTATTTTACTTACTTCTACCC1TTTTTA 
TATCAAAAGTTTCCACCAAAGAAAGAAATTCATATTATGATGCCAAGATTGGTTTGCATT 
TGGGGTTGAACACATTGTAATTCTTCTTACGACCACATAATCAAGTGGTTCTCCTTTTTT 
TGTCTGCTAA 

>G2059 Amino Acid Sequence (conserved domain in AA coordinates: 184-254)

MEDQFPKIETSFMHDKLLSSGIYGFLSSSTPPQLLGVPIFLEGMKSPLLPASSTPSYFVS 

PHDHELTSSIHPSPVASVPWNFLESFPQSQHPDHHPSKPPNLTLFLKEPKLLELSQSESN

MSPYHKYIPNSFYQSDQNRNEWVEINKTLTNYPSKGFGNYWLSTTKTQPMKSKTRKVVQT

TTPTKLYRGVRQRHWGKWVAEIRLPRNRTRVWLGTFETAEQAAMAYDTAAYILRGEFAHL

NFPDLKHQLKSGSLRCMIASLLESKIQQISSSQVSNSPSPPPPKVGTPEQKNHHMKMESG

EDVMMKKQKSHKEVMEGDGVQLSRMPSLDMDLIWDALSFPHSS*

>G2105 (42..1487)

CTCTCTGACTTGAACTCTTCTCTTCTAC 

ATCCACAGTACGGTATAGAACAACCATCTTCTCAATTCTCCTCTGATCTCTTCGGCnTC^ 

ACCTCGTTTCAGCGCCGGACCAGCACCATCGTCTTCATTTCACCGACC^ 

TATTGCCACGTGGAATACAAGGGCTTACGGTGGCTGGAAACAACAGTAACACTATTACAA 

CGATCCAGAGTGGTGGCTGTGTTGGTGGGTTTAGTGGCTTTACGGACGGCGGAGGAACAG 

GGAGGTGGCCGAGGCAAGAGACGTTGATGTTGTTGGAGGTCAGATCTCGTCTTGATCACA 

AGTTCAAAGAAGCTAATCAAAAGGGTCCTCTCTGGGATGAAGTTTCTAGGATTATGTCGG 

AGGAACATGGATACACTAGGAGTGGCAAGAAGTGTAGAGAGAAGTTCGAGAATCTCTACA

AGTACTATAAAAAAACAAAAGAAGGCAAATCCGGTCGGCGACAAGATGGTAAAAACTATA 

GATTTTTCCGGCAGCTTGAAGCGATATACGGCGAATCCAAAGACTCGGTTTCTTGCTATA

ACAA(^CGCAGTTCAT7^TGACCAATGCTCTTCATAGTAATTTCCGCGCTTCTAACATTC 

ATAACATCGTCCCTCATCATCAGAATCCCTTGATGACCAATACCAATACTCAAAGTCAAA 

GCCTTAGCATTTCTAACAATTTCAACTCCTCCTCCGATTTGGATCTAAC 

AAGGAAACGAAACTACTAAAAGAGAGGGGATGCATTGGAAGGAAAAGATCAAGGAATTCA 
TTGGTGTTCATATGGAGAGGTTGATAGAGAAGCAAGATTTTTGGCTTGAGAAGTTGATGA
AGATTGTGGAAGACAAAGAACATCAAAGGATGCTGAGAGAAGAGGAATGGAGAAGGATTG 
AAGCGGAAAGGATCGATAAGGAACGTTCGTTTTGGACAAAAGAGAGGGAGAGGATTGAAG 
CTCGGGATGTTGCGGTGATTAATGCCTTGCAGTACTTGACGGGAAGGGCATTGATAAGGC 
CGGATTCTTCGTCTCCTACAGAGAGGATTAATGGGAATGGAAGCGATAAAATGATGGCTG 
ATAATGAATTTGCTGATGAAGGAAATAAGGGCAAGATGGATAAAAAACAAATGAATAAGA
AAAGGAAGGAGAAATGGTCAAGCCACGGAGGGAATCATCCAAGAACCAAAGAGAATATGA 
TGATATACAACAATCAAGAAACTAAGATTAATGATTTTTGTCGAGATGATGACCAATGCC 
ATCATGAAGGTTACTCACCTTCAAACTCCAAGAACGCAGGAACTCCGAGCTGCAGCAATG 
CCATGGCAGCTAGTACAAAGTGCTTTCCATTGCTTGAAGGAGAAGGAGATCAGAACTTGT 
GGGAGGGTTATGGTTTGAAGCAAAGGAAAGAAAATAATCATCAGTAAGCTACATTTTTCA 
TTCTCAAAATGAAGAATAAGAGAACTTAGAAACGAT 

>G2105 Amino Acid Sequence (domain in AA coordinates: 100-153) 

MEDHQNHPQYGIEQPSSQFSSDLFGFNLVSAPDQHHRLHFTDHEISLLPRGIQGLTVAGN 

NSNTITTIQSGGCVGGFSGFTDGGGTGRWPRQETLMLLEVRSRLDHKFKEANQKGPLWDE

VSRIMSEEHGYTRSGKKCREKFENLYKYYKKTKEGKSGRRQDGKNYRFFRQLEAIYGESK

DSVSCYimTQFIMTNALHSNFRASNIHNIVPHHQNPLMTNTNTQSQSLSISNNFNSSSDL 

DLTSSSEGNETTKREGMHWKEKIKEFIGVHMERLIEKQDFWLEKLMKIVEDKEHQRMLRE 

EEWRRIEAERIDKERSFWTKERERIEARDVAVINALQYLTGRALIRPDSSSPTERINGNG 

SDKMMADNEFADEGNKGKMDKKQMNKKRKEKWSSHGGNHPRTKENMMIYNNQETKINDFC

RDDDQCHHEGYSPSNSKNAGTPSCSNAMAASTKCFPLLEGEGDQNLWEGYGLKQRKENNH 
Q* 

>G2117 (49..465)

ATACTTGTCAACAAAAATTTTCTTAAAGAACGCATAACTGTTTTTTTCATGGCTGGTTCT 
GTCTATAACCTTCCAAGTCAAAACCCTAATCCACAGTCTTTATTCCAAATCTTTGTTGAT 
CGAGTACCACTTTCAAACTTGCCTGCCACGTCAGACGACTCTAGCCGGACTGCAGAAGAT 
AATGAGAGGAAGCGGAGAAGGAAGGTATCGAACCGCGAGTCAGCTCGGAGATCGCGTATG 
CGGAAACAGCGTCACATGGAAGAACTGTGGTCCATGCTTGTTCAACTCATCAATAAGAAC 
AAATCTCTAGTCGATGAGCTAAGCCAAGCCAGGGAATGTTACGAGAAGGTTATAGAAGAG
AACATGAAACTTCGAGAGGAAAACTCCAAGTCGAGGAAGATGATTGGTGAGATCGGGCTT 
AATAGGTTTCTTAGCGTAGAGGCCGATCAGATCTGGACCTTCTAATCGTCTCGTAAGCTT 
GTTGGTTTTTTGTTGTTTATTTAAAG 

>G2117 Amino Acid Sequence (conserved domain in AA coordinates: 
MAGSVYNLPSQNPNPQSLFQIFVDRVPLSNLPATSDDSSRTAEDNERKRRRKVSNRESAR 

RSRMRKQRHMEELWSMLVQLINKNKSLVDELSQARECYEKVIEENMKLREENSKSRKMIG
EIGLNRFLSVEADQIWTF* 
>G2124 (87..923)

GAACAGCAAAACCCTAGATTTCCTGTTCAAGCTCAAGACCGTACAAAACTTTGGAACTCA 
TATATAAAGATCTCGAGAATAGCATTATGAATATCGTCTCTTGGAAAGATGCAAACGACG 
AAGTTGCAGGCGGCGCTACGACAAGACGTGAAAGAGAAGTAAAAGAGGATCAAGAAGAAA 
CCGAAGTCAGAGCCACCAGTGGCAAAACCGTAATTAAAAAGCAGCCTACATCGATCTCTT 
CTTCTTCTTCTTCGTGGATGAAATCCAAGGATCCGAGGATTGTTAGGGTTTCACGCGCCT 
TTGGAGGCAAAGACCGTCACAGCAAAGTGTGTACGTTACGTGGACTACGTGACAGACGCG 
TGAGATTATCAGTCCCAACGGCTATTCAGCTCTACGATCTTCAAGAACGGCTCGGTGTTG 
ACCAGCCTAGCAAAGCCGTTGACTGGTTGCTTGATGCAGCTAAAGAGGAGATCGACGAGC 
TACCTCCGTTACCTATCTCGCCGGAAAATTTCAGCATCTTCAACCATCATCAGTCCTTCT 
TGAATCTTGGTCAACGGCCCGGTCAAGATCCGACCCAACTCGGGTTTAAAATCAATGGAT 
GTGTACAAAAGTCTACTACTACTAGCCGCGAAGAAAACGATAGAGAGAAAGGAGAAAACG 
ATGTCGTTTACACAAACAATC^TCATGTTGGGT^ 

ATCATCATCATCATCACCAACATTTGAGTTTACAGGCAGATTATCATAGTCATCAACTAC 
ATAGTCTTGTCCCATTTCCATCACAAATTTTC 

CTACAACTATACAATCTTTGTTTCCATCATCATCGTCAGCTGGTTCAGGGACTATGGAGA 
CATTAGATCCGAGGCAAATGTAGCAACAATGGTGGTAGAGACATTGATAATCGGATGTCG 
TCGGTCCAATTCAACCGAACTAATAGCACTACAACGGCTAACATGTCGAGGCATCTAGGC 
TCGGAGCGTTGTACAAGTAGAGGAAGTGATCACCATATGTGAAGTTAGATTATTGAAACG 

ATATAATTGTTGTTTGATGTGTTCAGAAATAAGGGGACAC 

>G2124 Amino Acid Sequence (domain in AA coordinates: 75-132) 
LLDAAKEEIDELPPLPISPENFSIFNHHQSFLNLGQRPGQDPTQLGFKINGCVQKSTT

>G2140 (148.. 125^) 



ACTCTCTTAACTTTCGTTTCrrCTCCTACOT 

'GAAGAC 

G ^ GAAGACAG ^^ 



CACATATATATATACATATATAGAGAGAGAGAAGAGGACAAAGAGTTGAAAGATGAAf' P C 



CACCAACTCATCTCTCATCATCACCATCATCATCATGATCCTTCTCAATCTGAAAC1TTG 

ggagcatccggtaacgttggatctggtttcactatcttctctcaaga^ccg^ctcS 

ATATGGTCTCTACCTCCACCTACCTCGATCCAACCACCATTTGAT^OTTTCCTCCTCCT 

tc^cttctccagcatctttc^^ 

GGA ™ ACAG ™ TGGGT ^^^ 

gaacaacitcggatcttgtcggaagctttaggtccggtagtacaagccgggS 

™™ G TCAAA ^ GTCATM ^^ 

catctcgctaagctccgtagcatattacccaacaccaccaaaacggataaagcgtcg™ 

ALAAATCTTGTCCCAACGGAAAGCGATGAGTTAACGGTAGCTTTCACGGAGGArraanaa 

accggagatggcagatttgtaattaaagcgtcgctttgctgtcSgS^ 

TTGCCTGACATGATTAAAACATTGAAAGCTATGCGTCTCAAAATOCTC^GGCGGAGATA 

accaccgttgggggacgagtcaagaacgtt^gtttgtta^SgSS 

GAGGAAGTGGAGGAAGAGTACTGTATAGGGACGATT^^ 
G ^ GAGC ^ TGTAGAGGAATCATCTTCTTCTGG ^G OT 
™ ACAACACTATCACTATCGTCG ^^AACAAC^^^ 
^CTTAAATCGCTTTTTITTC^^ 

ggttatggaaatgaatgttgtacgtcacgttatactatagatatatctgS 

t^™ GAAGTATOTGTAT ^ 



RSRAHHQGLQFGYEGF 
SIMDAKALAASKSHSE 

£fZ^ tg ^ 



QQQQYNQR* 
>G2144 (102..1241)

ATCTCGOTCACGAATTT(^MCflACTAGTGACGACTlTACCTCOCMGftfircCCTGTO 



S^^^^^^^^cSkAOAAOAOAGAACJATAA 



^^acattok^^tx^aacSa^^^^ 

^CGGAAA' 
'CGACAGi 
AGCGATj 
TTTCGAi 
GCTGCCJ 
AGAACG/ 
AGGCTG1 

TATCAATGAGACTTGCTGCGGTAAACCCCAGAMCGACT 



rCGACJ 

:agcg; 

^TTTCC 
U3CTGC 
-AGAAC 
CTTCAGAAAACGGTTCTTTAATGGATGGGAGCTTCAATGCCGCACCAATGCAGCTTGCTT
GGCCTCAGCAAGCCATTGAGACCGAACAGTCCTTTCATCACCGGCAACTGCAACAACCAC 
CAACACAACAATGGCCTTTTGACGGCTTGAACCAGCCGGTATGGGGAAGAGAAGAGGATC 
AAGCTCATGGCAATGATAACAGCAATTTGATGGCAGTTTCTGAAAATGTAATGGTGGCTT 
CTGCTAATTTGCACCCAAATCAGGTCAAAATGGAGCTGTAAGTTGGGAAAACGGTAGAGA 
TCATGAATGTGTATATACATCGTATAAGCTCGTTTCTCTCTATATAAATATAATCATAAA 
TATAGATATCTGTTAAGAAGGTATCAGTCATTTGATTCAGAGAGACAACACTGGTATGAT 
TGTTTCTTATTCTTGTACCAGATTTCGACAATGTAGAATTTAGTAGGATATGATCATTTT 
GATCTCGTTATATATA 

>G2144 Amino Acid Sequence (domain in AA coordinates: 203-283)

MDLTGGFGARSGGVGPCREPIGLESLHLGDEFRQLVTTLPPENPGGSFTALLELPPTQAV 

ELLHFTDSSSSQQAAVTGIGGEIPPPLHSFGGTLAFPSNSVLMERAARFSVIATEQQNGN 

ISGETPTSSVPSNSSANLDRVKTEPAETDSSQRLISDSAIENQIPCPNQNNRNGKRKDFE 

KKGKSSTKKNKSSEENEKLPYVHVRARRGQATDSHSLAERARREKINARMKLLQELVPGC

DKIQGTALVLDEIINHVQSLQRQVEMLSMRLAAVNPRIDFNLDTILASENGSLMDGSFNA 

APMQLAWPQQAIETEQSFHHRQLQQPPTQQWPFDGLNQPVWGREEDQAHGNDNSNLMAVS 

ENVMVASANLHPNQVKMEL*

>G2431 (47..1057)

CCCTTTCGTTTTTATTTAAATTTCTTGGGTCGTTTCTTAAATTTGTATGTGTTTATTAAT 
GGAGATCAACAATAATGCCAACAATACTAATACTACTATTGATAATCACAAGGCAAAGAT
GAGCCTTGTGTTGTCAACGGATGCTAAGCCAAGGTTGAAATGGACTTGTGATCTTCATCA 
CAAATTCATCGAAGCCGTTAATCAACTTGGAGGACCTAACAAAGCAACACCTAAGGGTTT 
GATGAAGGTTATGGAGATTCCTGGGCTTACCTTATACCATCTCAAGAGCCATTTACAGAA 
ATATCGGTTAGGGAAGAGCATGAAGTTCGATGATAACAAGCTAGAAGTTTCCTCTGCATC 
AGAGAATCAAGAAGTTGAGAGTAAAAACGATTCAAGAGATCTCCGAGGCTGCAGTGTCAC 
CGAAGAAAACAGCAATCCAGCTAAAGAAGGGCTACAAATCACAGAGGCTTTACAAATGCA
GATGGAAGTTCAGAAGAAACTTCATGAACAAATCGAAGTTCAGAGGCATTTGCAGGTGAA 
GATTGAGGCACAAGGAAAGTATCTACAGTCCGTTTTAATGAAAGCTCAACAAACTCTCGC 
TGGCTACTCATCTTCAAATCTCGGCATGGATTTTGCGAGGACCGAGCTCTCTAGATTAGC 
TTCAATGGTGAACAGAGGCTGTCCAAGCACTTCGTTCTCAGAGCTAACGCAAGTAGAAGA 
AGAAGAAGAAGGTTTCTTGTGGTACAAGAAACCAGAAAACAGAGGAATTAGTCAGCTGAG
ATGTTCAGTAGAGAGCTCGTTGACATCTTCAGAGACCTCAGAGACAAAACTGGATACTGA 
CAATAACCTTAATAAATCGATTGAACTTCCGTTGATGGAGATCAACTCGGAAGTGATGAA 
GGGGAAGAAGAGAAGCATAAACGACGTCGTTTGCGTGGAGCAGCCTCTAATGAAGAGAGC 
TTTTGGAGTTGATGATGATGAGCATTTGAAGTTGAGTTTGAATACTTACAAGAAAGACAT 
GGAGGCGTGTACGAACATAGGACTAGGGTTTAATTAAAAAAAAT^ACATTTTACTAAAGTT 
ATATAAAAATGTTTTAAAAGAATCCA 

>G2431 Amino Acid Sequence (conserved domain in AA coordinates: 38-88)

MCLLMEINNNANNTNTTIDNHKAKMSLVLSTDAKPRLKWTCDLHHKFIEAVNQLGGPNKA

TPKGLMKVMEIPGLTLYHLKSHLQKYRLGKSMKFDDNKLEVSSASENQEVESKNDSRDLR

GCSVTEENSNPAKEGLQITEALQMQMEVQKKLHEQIEVQRHLQVKIEAQGKYLQSVLMKA

QQTLAGYSSSNLGMDFARTELSRLASMVNRGCPSTSFSELTQVEEEEEGFLWYKKPENRG

ISQLRCSVESSLTSSETSETKLDTDNNLNKSIELPLMEINSEVMKGKKRSINDVVCVEQP

LMKRAFGVDDDEHLKLSLNTYKKDMEACTNIGLGFN*

>G2465 (86..1150)

CAATATTCTTTCTCCATTGAGATTAAGC^ 

GGTTCTTAGTCCCTTTTGAATAATAATGATGGTGGAGATGGATTACGCTAAGAAAATGCA 
GAAATGTCATGAATACGTTGAAGCACTTGAAGAAGAACAGAAGAAAATCCAAGTCTTTCA 
ACGCGAGCTTCCTTTATGTTTAGAGCTTGTCACTCAAGCGATCGAAGCTTGTCGGAAGGA 
GTTATCTGGTACGACGACAACTA(^TCAGAACAGTGTTCAGAACAGACCACAAGTGTTTG 
TGGTGGTCCTGTCTTTGAAGAGTTTATTCCTATCAAGAAAATTAGTTCCTTGTGTGAAGA 
AGTACAAGAAGAAGAAGAAGAAGATGGTGAACATGAATCTTCTCCAGAACTTGTGAATAA 
TAAGAAATCAGATTGGCTTAGATCTGTTCAGCTATGGAATCATTCACCGGATCTAAATCC
AAAAGAGGAGCGTGTAGCTAAGAAAGCGAAAGTGGTGGAGGTGAAACCAAAAAGCGGTGC 
GTTTCAGCCGTTTCAAAAGCGCGTTTTGGAGACTGATTTGCAACCGGCGGTGAAAGTAGC 
TAGTTCGATGCCAGCGACGACGACGAGTTCTACGACGGAAACTTGTGGTGGTAAAAGTGA 
TTTGATTAAAGCTGGAGATGAGGAAAGACGGATAGAGCAGCAGCAATCGCAGTCGCATAC 
GCATAGAAAACAAAGGCGGTGCTGGTCGCCGGAATTACACCGTCGATTCCTAAACGCGCT
TCAGCAGCTTGGAGGATCTCATGTTGCTACACCAAAGCAAATCAGGGATCACATGAAGGT 
TGATGGATTAACAAACGACGAAGTTAAAAGCCATTTACAGAAATATAGACTTCACACAAG 
AAGGCCAGCAGCAACATCCGTGGCGGCACAAAGTACCGGGAATCAGCAACAACCACAATT 
TGTGGTGGTTGGAGGCATATGGGTACCATCGTCACAAGATTTTCCACCACCGTCCGATGT 
AGCCAACAAGGGTGGTGTATATGCTCCGGTTGCGGTGGCGCAATCTCCAAAACGTTCGTT 
GGAGAGAAGTTGCAACTCGCCGGCGGCATCTTCCTCTACAAATACAAATACTTCTACTCC
TGTGTCATAATCTGATAGTCATACTATAATCATCTCCTGATGTTGATTTTGGTGTAGGTT 
TGAAAATGTTTATGTGAATGTAA 

>G2465 Amino Acid Sequence (conserved domain in AA coordinates: 219-269)

MMVEMDYAKKMQKCHEYVEALEEEQKKIQVFQRELPLCLELVTQAIEACRKELSGTTTTT

SEQCSEQTTSVCGGPVFEEFIPIKKISSLCEEVQEEEEEDGEHESSPELVNNKKSDWLRS

VQLWNHSPDLNPKEERVAKKAKVVEVKPKSGAFQPFQKRVLETDLQPAVKVASSMPATTT

SSTTETCGGKSDLIKAGDEERRIEQQQSQSHTHRKQRRCWSPELHRRFLNALQQLGGSHV 

ATPKQIRDHMKVDGLTNDEVKSHLQKYRLHTRRPAATSVAAQSTGNQQQPQFVVVGGIWV

PSSQDFPPPSDVANKGGVYAPVAVAQSPKRSLERSCNSPAASSSTNTNTSTPVS*

>G2583 (38..607)

CAAATCAGAAAATATAGAGTTTGAAGGAAACTAAAAGATGGTACATTCGAGGAAGTTCCG 

AGGTGTCCGCCAGCGACAATGGGGTTCTTGGGTCTCTGAGATTCGCCATCCTCTATTGAA 

GAGAAGAGTGTGGCTTGGAACTTTCGAAACGGCAGAAGCGGCTGCAAGAGCATACGACCA 

AGCGGCTCTTCTAATGAACGGCCAAAACGCTAAGACCAATTTCCCTGTCGTAAAATCAGA 

GGAAGGCTCCGATCACGTTAAAGATGTTAACTCTCCGTTGATGTCACCAAAGTCATTATC

TGAGCTTTTGAACGCTAAGCTAAGGAAGAGCTGCAAAGACCTAACGCCTTCTTTGACGTG 

TCTCCGTCTTGATACTGACAGTTCCCACATTGGAGTTTGGCAGAAACGGGCCGGGTCGAA 

AACAAGTCCGACTTGGGTCATGCGCCTCGAACTTGGGAACGTAGTCAACGAAAGTGCGGT 

TGACTTAGGGTTGACTACGATGAACAAACAAAACGTTGAGAAAGAAGAAGAAGAAGAAGA 

AGCTATTATTAGTGATGAGGATCAGTTAGCTATGGAGATGATCGAGGAGTTGCTGAATTG 

GAGTTGACTTTTGACTTTAACTTGTTGCAAGTCCACAAGGGGTAAGGGTTTTC 

>G2583 Amino Acid Sequence (domain in AA coordinates: 4-71)

MVHSRKFRGVRQRQWGSWVSEIRHPLLKRRVWLGTFETAEAAARAYDQAALLMNGQNAKT

NFPVVKSEEGSDHVKDVNSPLMSPKSLSELLNAKLRKSCKDLTPSLTCLRLDTDSSHIGV

WQKRAGSKTSPTWVMRLELGNVVNESAVDLGLTTMNKQNVEKEEEEEEAIISDEDQLAME

MIEELLNWS* 

>G2724 (1..651) 

ATGGAAATAGAAATAAGGAGAGGTCCATGGACTGTGGAAGAAGACATGAAGCTCGTCAGT 
TACATTTCTCTTCACGGTGAAGGAAGATGGAACTCCCTCTCTCGTTCTGCTGGACTGAAT 
AGAACGGGGAAAAGTTGCAGATTGCGGTGGCTAAATTATCTCCGGCCGGATATCCGCCGT 
GGAGACATATCCCITCAAGAACAATTTATCATCCTTGAACTCCATTCTCGTTGGGGAAAT 
CGGTGGTCAAAGATTGCTCAACATTTACCGGGAAGT^CAGATAACGAGATAAAGAATTAT 
TGGAGAACACGTGTTCAAAAGCATGCAAAACTTCTAAAATGTGACGTGAACAGCAAGCAA 
TTCAAAGACACCATCAAACATCTCTGGATGCCTCGTCTCATCGAGAGAATCGCCGCCACT 
CAAAGTGTCCAATTTACCTCTAACCACTACTCGCCTGAGAACTCCAGCGTCGCCACCGCC
ACGTCATCAACGTCGTCGTCTGAGGCTGTGAGATCGAGTTTCTACGGTGGTGATCAGGTG 
GAATTTGGAACGTTGGATCATATGACAAATGGTGGTTATTGGTTCAACGGCGGAGATACG 
TTTGAAACTTTGTGTAGTTTTGACGAGCTCAACAAGTC 

>G2724 Amino Acid Sequence (conserved domain in AA coordinates: 7-113)

MEIEIRRGPWTVEEDMKLVSYISLHGEGRWNSLSRSAGLNRTGKSCRLRWLNYLRPDIRR

GDISLQEQFIILELHSRWGNRWSKIAQHLPGRTDNEIKNYWRTRVQKHAKLLKCDVNSKQ

FKDTIKHLWMPRLIERIAATQSVQFTSNHYSPENSSVATATSSTSSSEAVRSSFYGGDQV 

EFGTLDHMTNGGYWFNGGDTFETLCSFDELNKWLIQ* 

>G377 (1..396) 

ATGGGTCTCTCGCATTTTCCAACAGCGTCAGAAGGAGTACTACCACTTCTGGTGATGAAC
ACGGTTGTTTCAATCACTCTGTTGAAGAACATGGTGAGGTCTGTTTTTCAAATTGTTGCA
TCCGAGACTGAATCTTCCATGGAGATAGACGACGAGCCTGAAGATGATTTTGTTACTAGA
AGAATCTCGATAACACAGTTCAAGTCTCTATGTGAGAACATAGAAGAGGAAGAAGAAGAG
AAAGGTGTGGAGTGTTGTGTGTGCCTTTGTGGGTTTAAAGAGGAAGAGGAAGTGAGTGAG
TTGGTTTCTTGCAAGCATTTCTTCCACAGAGCTTGTCTAGACAACTGGTTTGGTAATAAC
CACACCACATGCCCTCTTTGCAGGTCCATTCTCTAG

>G377 Amino Acid Sequence (domain in AA coordinates: 85-128)

MGLSHFPTASEGVLPLLVMNTVVSITLLKNMVRSVFQIVASETESSMEIDDEPEDDFVTR

RISITQFKSLCENIEEEEEEKGVECCVCLCGFKEEEEVSELVSCKHFFHRACLDNWFGNN 
HTTCPLCRSIL* 

>G428 (97..1032)

TTACTTTTGTGTTTCTTCATATTCTTCAGAAGCAAGCACAAGGCTAGGGATCGAAGAAGC 

GGCGATCACTGATCGTATCTCACTACGATCACATTAATGGATAGAATGTGTGGTTTCCGC 

TCGACGGAAGACTATTCGGAGAAAGCGACGTTGATGATGCCGTCCGATTATCAGTCTTTG 

ATTTGTTCAACCACCGGAGACAATCAAAGACTGTTTGGATCCGACGAACTCGCTACCGCT 

TTGTCCTCGGAGTTGCTTCCGCGTATTCGAAAAGCTGAGGATAATTTCTCTCTTAGTGTC 

ATCAAATCCAAAATCGCTTCTCATCCTTTGTATCCTCGCTTACTCCAAACCTACATCGAT 

TGCCAAAAGGTGGGAGCGCCTATGGAAATAGCGTGTATATTGGAAGAGATTCAGCGAGAG 

AACCATGTGTACAAGAGAGATGTTGCTCCATTATCTTGCTTTGGAGCTGATCCTGAGCTT 

GATGAATTCATGGAAACCTACTGTGATATATTGGTTAAATACAAAACCGATCTTGCGAGG 

CCGTTCGACGAGGCTACAACTTTCATAAACAAGATTGAAATGCAGCTTCAGAACTTGTGC 

ACTGGTCCAGCGTCTGCTACAGCTCTTTCAGATGATGGTGCGGTTTCATCTGACGAGGAA 

CTGAGAGAAGATGATGACATAGCAGCGGATGACAGCCAACAAAGAAGCAATGACCGCGAT 

CTGAAGGACCAGCTACTACGCAAATTTGGTAGCCATATCAGTTCATTGAAACTCGAGTTC 

TCTAAAAAGAAGAAGAAAGGGAAGCTACCAAGAGAAGCAAGACAAGCGTTGCTCGATTGG 

TGGAATGTTCATAATAAATGGCCTTACCCTACTGAAGGCGACAAAATAGCTCTGGCTGAA 

GAAACAGGTTTGGATCAAAAACAAATCAACAATTGGTTTATAAACCAAAGGAAACGCCAT 

TGGAAGCCTTCGGAGAACATGCCGTTTGATATGATGGACGATTCTAATGAAACATTCTTT 

ACCGAGGAATGAAAAGAGAGACATGGGATTGTGCATTGTATAATTTTTACACTGTTTTCC 

CAAGAAAAGAAAACAGTAAAAAGCTTTTGGTAAATGGGACATCATCGCGAATGAATGGAA 

CCAGTTAGCCAAAACGGTCAAGGGCGTGGCGTAACGAGACATTGTATTGGAAATAGTGGC 

AATATTATGTC^CTAATCTTCGAATGGTCCAAAATGATAGATTTCTTATTTGTATXGAAC 

CTTACTTAGATAGCTGATGTGTCAACTAAATAATTTATTTTCATCCTTATACTACTTGTA 

TCAATGTCTCTAATTGATCAATTGTTGCTTGCTATTCAAAAAAAAAAAAAAAAAAAAAA 

>G428 Amino Acid Sequence (domain in AA coordinates: 229-292) 

MDRMCGFRSTEDYSEKATLMMPSDYQSLICSTTGDNQRLFGSDELATALSSELLPRIRKA 

EDNFSLSVIKSKIASHPLYPRLLQTYIDCQKVGAPMEIACILEEIQRENHVYKRDVAPLS 

CFGADPELDEFMETYCDILVKYKTDLARPFDEATTFINKIEMQLQNLCTGPASATALSDD 

GAVSSDEELREDDDIAADDSQQRSNDRDLKDQLLRKFGSHISSLKLEFSKKKKKGKLPRE 

ARQALLDWWNVHNKWPYPTEGDKIALAEETGLDQKQINNWFINQRKRHWKPSENMPFDMM

DDSNETFFTEE* 

>G447 (241..3501)

CTTTTTAAGAGCTTAAAAATTTGCTTTGAAGCTTCAAATATTCTTATGAACTAAAAAGAA 
GAAAAAAGCTTTTGTTTCTTTTC 

ACTATTTAGTTTCTCTCGTGCTCTTCTCTTGAGCAAATACAGATTCGTTAATTTTGCTGA 

AGAAGAAGAACTCTGTTTCTTCCCTGCACCAAACCAATTTTTTCGTTCTTTCTATAAACG 

ATGAAAGCTCCATCAAATGGATTTCTTCCAAGTTCCAACGAAGGAGAGAAGAAGCCAATC 
AATTCTCAACTATGGC^CGCTTGTG 

CTTGTGGTTTACTTCCCTCAAGGACACAGCGAGCAAGTTGCAGCATCGATGCAGAAGCAA 

ACAGATTTTATACCAAATTACCCAAATCTTCCTTCTAAGCTGATTTGCTTGCTTCACAGT 

GTTACATTACATGCTGATACCGAAACAGATGAAGTCTATGCACAAATGACTCTTCAACCT 

GTGAATAAGTATGATAGAGAAGCATTGCTAGCTTCTGATATGGGCTTGAAGCTAAACAGA 
CAACCTACTGAGTTTTTTTC 

TTCTCTGTACCGCGTCGTGCAGCTGAGAAAATATTCCCTCCTCTTGATTTCTCGATGCAA 

CCGCCTGCGCAAGAGATTGTAGCTAAAGATTTACATGATACTACATGGACTTTCAGACAT 

ATCTATCGAGGCCT^CCAAAAAGACACTTGCTTACCACAGGTTGGAGCGTTT^ 

A(^^GAGACTATTTGCGGGTGATTCAGTTTTGTTTGTAAGAGATGAGAAATCACAGCTG 

ATGTTGGGTATAAGACGTGCAAATAGACAAACTCCGACTCTTTCCTCATCGGTCATATCC 

AGCGACAGTATGCACATTGGGATACTTGCAGCTGCAGCTCATGCTAATGCCAATAGTAGC

CCTTTTACCATCTTCTTCAATCCAAGGGCAAGTCCTTCAGAGTTTGTAGTTCCTTTAGCC 

AAATACAACAAAGCCTTATACGCTCAAGTATCTCTAGGAATGAGATTCCGGATGATGTTT 

GAGACTGAGGATTGTGGGGTTCGTAGATATATGGGTACAGTCACAGGTATTAGTGATCTT 
GACCCTGTAAGATGGAAAGGCTCACAATGGCGTAATCTTCAGGTAGGATGGGATGAATCA

ACAGCTGGAGATAGGCCAAGCCGAGTATCCATATGGGAAATCGAACCCGTCATAACTCCT 

TTTTACATATGTCCTCCTCCATTTTTCAGACCTAAGTACCCGAGGCAACCCGGGATGCCA 

GATGATGAGTTAGACATGGAAAATGCTTTCAAAAGAGCAATGCCTTGGATGGGAGAAGAC 

TTTGGGATGAAGGACGCACAGAGTTCGATGTTCCCTGGTTTAAGTCTAGTTCAATGGATG 

AGTATGCAGCAAAACAATCCATTGTCAGGTTCTGCTACTCCTCAGCTCCCGTCCGCGCTC 

TCATCTTTTAACCTACCAAACAATTTTGCTTCCAACGACCCTTCCAAGCTGTTGAACTTC 

CAATCCCCAAACCTCTCTTCCGCAAATTCCCAATTCAACAAACCGAACACGGTTAACCAT 

ATCAGCCAACAGATGCAAGCACAACCAGCCATGGTGAAATCTCAACAACAACAACAACAA 

CAACAACAACAACACCAACACCAACAACAACAACTGCAACAACAACAACAACTACAGATG 

TCACAGCAACAGGTGCAGCAACAAGGGATTTATAACAATGGTACGATTGCTGTTGCTAAC 

CAAGTCTCTTGTCAAAGTCCAAACCAACCTACTGGATTCTCTCAGTCTCAGCTTCAGCAG 

CAGTCAATGCTCCCTACTGGTGCTAAAATGACACACCAGAACATAAATTCTATGGGGAAT 

AAAGGCTTGTCTCAAATGACATCGTTTGCGCAAGAAATGCAGTTTCAGCAGCAACTGGAA

ATGCATAACAGTAGCCAGTTATTAAGAAACCAGCAAGAACAGTCCTCTCTCCATTCATTA

CAACAAAATCTGTCCCAAAATCCTCAGCAACTCCAAATGCAACAACAATCATCAAAACCA

AGTCCTTCACAACAGCTTCAGTTGCAGCTACTGCAGAAGCTACAGCAGCAGCAACAGCAG 

CAGTCGATTCCTCCAGTAAGCTCATCCTTACAGCCACAATTATCAGCGTTGCAGCAGACA 

CAAAGCCATCAATTGCAACAACTTCTGTCGTCTCAAAATCAACAGCCCTTGGCACATGGT

AATAACAGCTTCCCAGCTTCAACTTTCATGCAGCCTCCACAGATTCAGGTGAGTCCTCAG

CAGCAAGGACAGATGAGTAACAAAAATCTTGTAGCCGCTGGAAGATCACATTCTGGCCAC 

ACAGATGGAGAAGCTCCTTCTTGTTCAACCTCACCTTCCGCCAATAACACGGGACATGAT 

AATGTTTCACCGACAAATTTCCTGAGCAGAAATCAACAGCAAGGACAAGCTGCATCTGTA 

TCTGCATCTGATTCAGTCTTTGAGCGCGCAAGCAATCCGGTCCAAGAGCTTTATACAAAA 

ACTGAGAGCCGGATCAGTCAAGGCATGATGAATATGAAGAGTGCTGGTGAACATTTCAGA 

TTTAAAAGCGCGGTAACAGATCAAATCGATGTATCCACAGCGGGAACGACGTACTGTCCT 

GATGTTGTTGGCCCTGTACAGCAGCAACAAACTTTCCCACTACCATCATTTGGTTTTGAT 

GGAGACTGCCAATCTCATCATCCAAGAAACAACTTAGCTTTCCCTGGTAATCTCGAAGCC 

GTAACTTCTGATCCACTCTATTCTCAAAAGGACTTTCAAAACTTGGTTCCCAACTATGGC 

AACACACGAAGAGACATTGAGACGGAGCTGTCCAGTGCTGCAATCAGTTCTCAGTCATTT 

GGTATTCCCAGCATTCCCTTTAAGCCCGGATGTTCAAATGAGGTTGGCGGCATCAATGAT 

TCAGGAATCATGAATGGTGGAGGACTGTGGCCCAATCAGACTCAACGAATGCGAACATAT 

ACAAAGGTTCAAAAACGAGGGTCAGTAGGTAGATCAATAGATGTTACCCGTTATAGCGGC

TATGATGAACTTAGGCATGACTTAGCGAGAATGTTTGGCATCGAAGGACAGCTCGAAGAT 

CCGCTAACCTCTGATTGGAAACTCGTCTACACCGATCACGAAAACGATATTTTACTAGTT 

GGTGATGATCCTTGGGAAGAGTTTGTGAACTGCGTGCAGAACATAAAGATACTATCATCA 

GTAGAAGTTCAGCAAATGAGCTTAGACGGAGATCTTGCAGCTATCCCAACCACAAACCAA

GCCTGCAGCGAAACAGACAGCGGAAATGCTTGGAAAGTACACTATGAAGACACTTCTGCT

GCAGCTTCTTTCAACAGATAGAAATAAAAAGATGCAAATATACCAAGTCAACTTACATTA 

TCATTCGAGGCCATCGCAAAGTACATGTTTTTTTTTC 

ACTGAGAAGAAGAAGATACTGCACGGTATATAAACATTTTTATAGGACAGTGATTTGATT 

TTTCATTCTAACTTGATGTTGTTGTACTTTCTTGTTTCCATATTTGTATAA 

TGCTTGACAAGTCTATGAGGAGCTVTATC^ 

TAACTAAACAATTACCTTCATTAATCATGAATCCTTTGGTCGTTTAAAA 

>G447 Amino Acid Sequence (conserved domain in AA coordinates : 22-356) 

MKAPSNGFLPSSNEGEKKPINSQLWHACAGPLVSLPPVGSLWYFPQGHSEQVAASMQKQ 

TDFIPWYPNLPSKLICLLHSVTLHADTETDEVYAQMTLQPVNKYDREALLASDMGLKLNR

QPTEFFCKTLTASD^THGGFSVPRRAAEKIFPPIJDFSMQPPAQ 

IYRGQPKRHLLTTGWSVFVSTKRLFAGDSVLFVTO 

SDSMHIGILAAAAHANANSSPFTIFFNPRASPSEFVTOLAKTO 

ETEDCGVRRYMGTVTGISDLDPVRWKGSQWRNLQVGWDESTAGDRPSRVSIWEIEPVITP 

FYICPPPFFRPKYPRQPGMPDDELDMENAFKRAMPWMGEDFGMKDAQSSMFPGLSLVQWM 

SMQQNNPLSGSATPQLPSALSSFNLPNNFASNDPSKLLNFQSPNLSSAN^ 

ISQQMQAQPAMVKSQQQQQQQQQQHQHQQQQLQQQQQLQMSQQQVQQQGIYNNGTIAVAN

QVSCQSPNQPTGFSQSQLQQQSMLPTGAKMTHQNINSMGNKGLSQMTSFAQEMQFQQQLE 

MHNSSQLLRNQQEQSSLHSLQQNLSQNPQQLQMQQQSSKPSPSQQLQLQLLQKLQQQQQQ 

QSIPPVSSSLQPQLSALQQTQSHQLQQLLSSQNQQPLAHGNNSFPASTFMQPPQIQVSPQ
QQGQMSNKNLVAAGRSHSGHTDGEAPSCSTSPSANNTGHDNVSPTNFLSRNQQQGQAASV
SASDSVFERASNPVQELYTKTESRISQGMMNMKSAGEHFRFKSAVTDQIDVSTAGTTYCP
DVVGPVQQQQTFPLPSFGFDGDCQSHHPRNNLAFPGNLEAVTSDPLYSQKDFQNLVPNYG
NTRRDIETELSSAAISSQSFGIPSIPFKPGCSNEVGGINDSGIMNGGGLWPNQTQRMRTY
TKVQKRGSVGRSIDVTRYSGYDELRHDLARMFGIEGQLEDPLTSDWKLVYTDHENDILLV
GDDPWEEFVNCVQNIKILSSVEVQQMSLDGDLAAIPTTNQACSETDSGNAWKVHYEDTSA

AASFNR* 

>G464 (41.. 760) 

CTCTGCTGGTATCATTGGAGTCTAGGGTTTTGTTATTGACATGCGTGGTGTGTCAGAATT 
GGAGGTGGGGAAGAGTAATCTTCCGGCGGAGAGTGAGCTGGAATTGGGATTAGGGCTCAG 
CCTCGGTGGTGGCGCGTGGAAAGAGCGTGGGAGGATTCTTACTGCTAAGGATTTTCCTTC 
CGTTGGGTCTAAACGCTCTGCTGAATCTTCCTCTCACCAAGGAGCTTCTCCTCCTCGTTC 
AAGTCAAGTGGTAGGATGGCCACCAATTGGGTTACACAGGATGAACAGTTTGGTTAATAA 
CCAAGCTATGAAGGCAGCAAGAGCGGAAGAAGGAGACGGGGAGAAGAAAGTTGTGAAGAA 
TGATGAGCTCAAAGATGTGTCAATGAAGGTGAATCCGAAAGTTCAGGGCTTAGGGTTTGT 
TAAGGTGAATATGGATGGAGTTGGTATAGGCAGAAAAGTGGATATGAGAGCTCATTCGTC 
TTACGAAAACTTGGCTCAGACGCTTGAGGAAATGTTCTTTGGAATGACAGGTACTACTTG 
TCGAGAAAAGGTTAAACCTTTAAGGCTTTTAGATGGATCATCAGACTTTGTACTCACTTA 
TGAAGATAAGGAAGGGGATTGGATGCTTGTTGGAGATGTTCCATGGAGAATGTTTATCAA 
CTCGGTGAAAAGGCTTCGGATCATGGGAACCTCAGAAGCTAGTGGACTAGCTCCAAGACG 
TCAAGAGCAGAAGGATAGACAAAGAAACAACCCTGTTTAGCTTCCCTTCCAAAGCTGGCA 
TTGTTTATGTATTGTTTGAGGTTTGCAATTTACTCGATACTTTTTGAAGAAAGTATTTTG 
GAGAATATGGATAAAAGCATGCAGAAGCTTAGATATGATTTGAATCCGGTTTTCGGATAT 
GGTTTTGCTTAGGTCATTCAATTCGTAGTTTTCCAGTTTGTTTCTTCTTTGGCTGTGTAC 
CAATTATCTATGTTCTGTGAGAGAAAGCTCTT 

>G464 Amino Acid Sequence (domain in AA coordinates: 20-28, 71-82, 126-142, 
224) 

MRGVSELEVGKSNLPAESELELGLGLSLGGGAWKERGRILTAKDFPSVGSKRSAESSSHQ 

GASPPRSSQVVGWPPIGLHRMNSLVNNQAMKAARAEEGDGEKKVVKNDELKDVSMKVNPK

VQGLGFVKVNMDGVGIGRKVDMRAHSSYENLAQTLEEMFFGMTGTTCREKVKPLRLLDGS

SDFVLTYEDKEGDWMLVGDVPWRMFINSVKRLRIMGTSEASGLAPRRQEQKDRQRNNPV* 
>G557 (192..698)

CAGAGATCTGACGGCGGTAGCAGAGTAATCTATTCCTTCCCAAAATGTCTCGCAATTAGA 
TTCTTTCCAAGTTCTTCTGTAAATCCCAAGTCCCGCTCTTTTCCTCTTTATCCTTTTCAC 
CAGCTTCGCTACTAAGACAACAAATCTTTCCCTCTCTCTCTCGCCTGATCGATCTTCAAA 
GAGTAAGAAAAATGCAGGAACAAGCGACTAGCTCTTTAGCTGCAAGCTCTTTACCATCAA 
GCAGCGAGAGGTCATCAAGCTCTGCTCCACATTTGGAGATCAAAGAAGGAATTGAAAGCG 
ATGAGGAGATACGGCGAGTGCCGGAGTTTGGAGGAGAAGCTGTCGGAAAAGAAACTTCCG 
GTAGAGAATCTGGATCGGCGACCGGTCAGGAGCGGACACAGGCGACTGTCGGAGAAAGTC 
AAAGGAAGCGAGGGAGGACACCGGCGGAGAAAGAGAACAAGCGGCTGAAGAGGTTGTTGA 
GGAACAGAGTTTCAGCTCAGCAAGCAAGAGAGAGGAAAAAGGCTTACTTGAGCGAGTTGG 
AAAACAGAGTGAAAGACTTGGAGAACAAAAACTCTGAACTTGAAGAGCGACTCTCTACTC 
TTCAGAACGAGAACCAGATGCTTAGACATATTCTGAAGAACACAACAGGAAACAAGAGAG 
GAGGTGGTGGTGGTTCTAATGCTGATGCAAGCCTTTGATCTC 

TATTTTTGTGGATAAAATTTACAGAGAATTGTATCAATAATTATCATGTTAAAATTATAT 
GGGATGTGAGAGCTAATATTGCAATTGTAGACCAAGTTCTCTTAA7^AAAAAAAAAAAAAA 

AA 

>G557 Amino Acid Sequence (domain in AA coordinates: 90-150) 

MQEQATSSLAASSLPSSSERSSSSAPHLEIKEGIESDEEIRRVPEFGGEAVGKETSGRES 

GSATGQERTQATVGESQRKRGRTPAEKENKRLKRLLRNRVSAQQARERKKAYLSELENRV

KDLENKNSELEERLSTLQNENQMLRHILKNTTGNKRGGGGGSNADASL* 

>G577 (44.. 2155)

GATGCACGGCCTC

CAAGCTCGACAGTGAGGATGCTGTCCGTCGCTGCAAGGAGCGGCGCCGTCTTATGAAGGA 

CGCCGTCTACGCTCGTCACCATCTCGCCGCCGCTCACTCTGACTACTGCCGCTCCCTTCG 

TCTCACTGGCTCTGCCCTCTCCTCCTTCGCCGCCGGCGAGCCCCTCTCCGTCTCCGAGAA 

TACTCCCGCTGTTTTTCTCCGCCCTTCCTCCAGTCAGGACGCGCCACGTGTCCCTTCTTC
CCATTCCCCAGAACCCCCTCCTCCGCCCATCCGCAGCAAGCCTAAGCCTACTAGGCCTAG

GAGGCTTCCACACATTCTCTCCGACTCCTCTCCTTCTTCCTCTCCTGCCACCAGTTTCTA 

TCCCACTGCTCACCAGAACTCTACTTACTCTCGCTCTCCATCTCAAGCTTCCTCTGTCTG 

GAACTGGGAGAATTTCTACCCTCCCTCTCCCCCCGACTCCGAGTACTTCGAACGCAAAGC 

TCGCCAGAACCACAAGCACCGTCCTCCTTCCGACTACGACGCCGAAACTGAAAGATCCGA 

CCACGATTACTGCCACTCACGGAGAGATGCCGCCGAGGAAGTTCACTGCAGCGAGTGGGG 

CGACGACCACGACCGTTTCACTGCCACCTCTTCGTCCGACGGAGATGGGGAGGTCGAAAC 

TCACGTTTCCAGATCCGGTATTGAAGAAGAGCCTGTGAAACAACCACATCAAGACCCAAA

TGGCAAAGAGCACTCTGACCATGTTACCACTTCTTCCGACTGCTACAAGACCAAATTGGT 

GGTAAGGCACAAGAATTTGAAGGAGATCCTTGACGCCGTTCAAGACTACTTCGACAAGGC 

TGCCTCCGCTGGGGACCAGGTCTCCGCCATGCTTGAGATCGGCCGGGCTGAGCTCGACCG 

CAGCTTCAGCAAGCTGAGGAAGACGGTGTATCATTCAAGCAGTGTGTTCAGCAACTTGAG 

CGCAAGCTGGACCTCAAAACCCCCATTGGCAGTCAAATACAAGCTCGATGCATCTACCCT 

GAATGATGAACAAGGCGGCCTCAAGAGCCTCTGCTCCACTCTAGACCGACTCCTCGCTTG 

GGAAAAGAAGCTTTATGAGGATGTCAAGGCAAGAGAAGGAGTTAAGATTGAGCACGAGAA 

GAAGCTGTCTGCGCTGCAGAGTCAGGAGTATAAGGGAGGTGATGAATCCAAGCTAGACAA 

GACTAAAACTTCCATAACCAGACTGCAATCACTCATCATTGTTTCTTCAGAAGCTGTTTT 

AACCACGTCTAATGCCATTCTCCGCCTCCGGGACACTGACCTTGTCCCTCAGCTTGTTGA 

ACTCTGCCACGGATTAATGTACATGTGGAAGTCAATGCACGAGTATCACGAAATCCAGAA 

CAACATCGTGCAACAAGTCCGTGGCCTGATCAACCAAACAGAGAGAGGTGAGTCAACATC 

AGAGGTACACCGGCAGGTGACGCGGGACCTAGAGTCAGCTGTGTCCTTGTGGCATTCGAG 

CTTCTGTCGCATCATTAAATTCCAGAGGGAGTTCATATGCTCTCTCCACGCATGGTTCAA 

GCTGAGCCTGGTTCCCCTGAGCAACGGAGACCCAAAGAAACAGCGGCCAGACTCATTTGC

CTTGTGCGAGGAGTGGAAGCAGAGCCTGGAACGGGTGCCTGACACAGTGGCGTCAGAAGC 

CATAAAGAGCTTTGTAAACGTGGTACATGTGATATCAATAAAGCAGGCGGAAGAGGTGAA 

GATGAAGAAACGCACGGAGAGTGCAGGAAAGGAGCTGGAGAAGAAAGCATCCTCACTGAG 

GAGCATAGAGAGGAAGTACTACCAGGCATACTCGACGGTTGGGATAGGCCCTGGACCGGA 

GGTGTTGGACTCACGGGACCCGCTATCTGAGAAGAAATGTGAGCTGGCGGCATGTCAGAG 

GCAGGTGGAGGATGAGGTAATGAGGCACGTGAAGGCTGTGGAGGTGACACGAGCTATGAC 

TCTCAACAATCTACAAACCGGCCTGCCCAATGTATTCCAGGCCTTGACCAGCTTCTCATC 

TCTCTTCACTGAATCTCTCCAGACTGTCTGTTCTCGTTCCTACTCCATCAACTGATTATG 

TCCAAGTTTCTCATTTATTTTTAAGCTCTCATTACGTTGGTATCATGTAAATTTGAGGAT 

TGATTAAATTGAGTCTTGTGGTTTTGTGAGGACTCACAATCTTTCTCATTTAAAAAAAAA 
AAAAAAAAAA 

>G577 Amino Acid Sequence (domain in AA coordinates: TBD) 

MGCTASKLDSEDAVRRCKERRRLMKDAVYARHHLAAAHSDYCRSLRLTGSALSSFAAGEP

LSVSENTPAVFLRPSSSQDAPRVPSSHSPEPPPPPIRSKPKPTRPRRLPHILSDSSPSSS 

PATSFYPTAHQNSTYSRSPSQASSVWNWENFYPPSPPDSEYFERKARQNHKHRPPSDYDA 

ETERSDHDYCHSRRDAAEEVHCSEWGDDHDRFTATSSSDGDGEVETHVSRSGIEEEPVKQ 

PHQDPNGKEHSDHVTTSSDCYKTKLVVRHKNLKEILDAVQDYFDKAASAGDQVSAMLEIG 

RAELDRSFSKLRKTVYHSSSVFSNLSASWTSKPPLAVKYKLDASTLNDEQGGLKSLCSTL

DRLLAWEKKLYEDVKAREGVKIEHEKKLSALQSQEYKGGDESKLDKTKTSITRLQSLIIV
SSEAVLTTSNAILRLRDTDLVPQLVELCHGLMYMWKSMHEYHEIQNNIVQQVRGLINQTE

RGESTSEVHRQVTRDLESAVSLWHSSFCRIIKFQREFICSLHAWFKLSLVPLSNGDPKKQ
RPDSFALCEEWKQSLERVPDTVASEAIKSFVNVVHVISIKQAEEVKMKKRTESAGKELEK
KASSLRSIERKYYQAYSTVGIGPGPEVLDSRDPLSEKKCELAACQRQVEDEVMRHVKAVE
VTRAMTLNNLQTGLPNVFQALTSFSSLFTESLQTVCSRSYSIN* 
>G674 (1..786)

ATGGTGTTTAAATCAGAAAAATCAAACCGGGAAATGAAATCAAAGGAGAAGCAAAGGAAG 

GGATTATGGTCACCCGAGGAAGATGAGAAGCTTAGGAGTCATGTCCTCAAATATGGCCAT 

GGATGCTGGAGTACTATTCCTCTTCAAGCTGGATTGCAGAGGAATGGGAAGAGTTGTAGA 

TTAAGGTGGGTTAATTATTTAAGACCTGGACTTAAGAAGTCTTTATTCACTAAAC^ 

GAAACTATACTTCTTTCACTTCATTCCA^ 

TTCTTACCAGGAAGAACCGACAACGAGATCAAAAACTATTGGCATTCTAATCTAAAGAAG 
GGTGTAACTTTGAAACAACATGAAACCACAAAAAAACAT 

TCACTTGAGGCCTTGCAGAGTTCAACTGAAAGATCTTCTTCATCTATCAATGTCGGAGAA 
ACGTCTAATGCTCAAACCTCAAGCTTTTCGCCAAATCTCGTGTTCTCGGAATGGTTAGAT
CATAGTTTGCTTATGGATCAGTCACCTCAAAAGTCTAGCTATGTTCAAAATCTTGTTTTA
CCGGAAGAGAGAGGATTCATTGGACCATGTGGCCCTCGTTATTTGGGAAACGACTCTTTG
CCTGATTTCGTGCCAAATTCAGAATTTTTGTTGGATGATGAGATATCATCTGAGATCGAG
TTCTGTACTTCATTTTCAGACAACTTTTTGTTCGATGGTCTCATCAACGAGCTACGACCA 
ATGTAA 

>G674 Amino Acid Sequence (domain in AA coordinates: 20-120) 

MVFKSEKSNREMKSKEKQRKGLWSPEEDEKLRSHVLKYGHGCWSTIPLQAGLQRNGKSCR 

LRWVNYLRPGLKKSLFTKQEETILLSLHSMLGNKWSQISKFLPGRTDNEIKNYWHSNLKK

GVTLKQHETTKKHQTPLITNSLEALQSSTERSSSSINVGETSNAQTSSFSPNLVFSEWLD 

HSLLMDQSPQKSSYVQNLVLPEERGFIGPCGPRYLGNDSLPDFVPNSEFLLDDEISSEIE 

FCTSFSDNFLFDGLINELRPM* 

>G736 (1..513) 

ATGGCGACTCAAGATTCTCAAGGGATTAAACTCTTTGGCAAAACTATTGCATTTAACACT 
CGAACAATAAAAAATGAAGAAGAGACACACCCGCCGGAGCAAGAAGCCACAATAGCCGTT
AGATCATCATCATCATCGGATCTGACGGCCGAGAAGCGTCCGGATAAGATCATAGCATGT 
CCAAGATGCAAGAGCATGGAGACAAAGTTCTGTTACTTCAACAACTACAACGGTAATCAG 
CCTCGACACTTTTGTAAAGGCTGCCACCGTTACTGGACCGCCGGTGGTGCACTCCGGAAC 
GTTCCCGTCGGCGCCGGTCGTCGGAAGTCCAAACCACCTGGTCGTGTCGTGGTTGGTATG 
CTTGGAGATGGAAATGGTGTTCGCCAAGTCGAGCTTATAAATGGCTTGCTCGTTGAGGAG 
TGGCAGCATGCCGCAGCCGCAGCTCACGGTAGTTTCCGGCATGATTTTCCCATGAAGCGG 
CTCCGGTGTTACTCCGACGGTCAATCGTGCTGA 

>G736 Amino Acid Sequence (domain in AA coordinates: 54-111) 

MATQDSQGIKLFGKTIAFNTRTIKNEEETHPPEQEATIAVRSSSSSDLTAEKRPDKIIAC 

PRCKSMETKFCYFNNYNGNQPRHFCKGCHRYWTAGGALRNVPVGAGRRKSKPPGRVVVGM

LGDGNGVRQVELINGLLVEEWQHAAAAAHGSFRHDFPMKRLRCYSDGQSC* 
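
Each polynucleotide entry in this listing is headed by a gene identifier and the coordinates of its coding region, for example ">G736 (1..513)", and is followed by the encoded amino acid sequence. As an illustrative sketch only (it is not part of the original listing), the Python below shows how a reader might check a polypeptide entry against its polynucleotide entry by translating the stated coordinates with the standard genetic code. The function names parse_header and translate are hypothetical, the coordinates are assumed to be 1-based and inclusive, and the example uses only the first 60 nucleotides of the G736 entry.

```python
import re
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def parse_header(header):
    """Return (gene_id, start, end) from a header such as '>G736 (1..513)'.

    Assumption: coordinates are 1-based and inclusive.
    """
    match = re.match(r">\s*(G\d+)\s*\(\s*(\d+)\s*\.\.\s*(\d+)\s*\)", header)
    if match is None:
        raise ValueError("unrecognized header: %r" % header)
    return match.group(1), int(match.group(2)), int(match.group(3))


def translate(dna, start, end):
    """Translate dna[start..end] (1-based, inclusive) codon by codon.

    Whitespace is ignored; codons containing characters other than
    A, C, G, T (for example OCR damage) are reported as 'X'.
    """
    cds = re.sub(r"\s+", "", dna).upper()[start - 1:end]
    return "".join(
        CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, len(cds) - 2, 3)
    )


if __name__ == "__main__":
    gene, start, end = parse_header(">G736 (1..513)")
    first_line = "ATGGCGACTCAAGATTCTCAAGGGATTAAACTCTTTGGCAAAACTATTGCATTTAACACT"
    # Translate only the first 60 nucleotides of the coding region.
    print(gene, translate(first_line, start, min(end, len(first_line))))
```

Comparing the translated residues with the listed amino acid sequence is one way to spot transcription damage in either entry; a stop codon is reported as "*", matching the convention used in the listing.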

>G903 (96.. 1496) 

CCCGGGTCGACCCACGCGTCCGCTCTCTCTCTCTGAACTATACAAAAACCTACTTTTAAT 
TTCTCTTCCAAGAAGTCAAGAACCCAGAAGAAGACATGACAAGTGAAGTTCTTCAAACAA 
TCTCAAGTGGATCAGGTTTTGCTCAGCCACAGAGCTCATCAACCCTGGATCATGATGAAT 
CTCTCATCAATCCTCCTCTTGTTAAGAAAAAGAGAAATCTCCCTGGAAATCCTGATCCGG
AAGCTGAAGTGATAGCTTTATCCCCCACGACCTTGATGGCTACGAACCGGTTCCTATGTG 
AGGTATGTGGCAAAGGTTTCCAAAGAGACCAAAACTTACAGCTTCATCGGCGAGGACATA 
ATCTTCCATGGAAGTTGAAGCAGAGGACAAGCAAAGAAGTGAGAAAACGTGTCTACGTTT 
GCCCCGAGAAGACATGTGTCCACCATCACTCCTCTAGAGCTCTAGGCGATCTCACTGGAA 
TCAAAAAGCATTTTTGCCGGAAACACGGGGAGAAGAAGTGGACGTGCGAGAAATGTGCTA 
AGAGATACGCAGTCCAATCTGATTGGAAAGCTCATTCCAAGACTTGTGGTACTAGAGAGT 
ACCGTTGCGATTGTGGCACCATTTTCTCAAGGCGAGACAGCTTTATCACTCATAGAGCTT 
TCTGCGATGCCTTAGCGGAAGAAACCGCTAAGATAAACGCAGTGTCTCATCTCAACGGTT 
TAGCCGCGGCTGGAGCCCCAGGATCAGTTAATCTCAACTATCAATATCTCATGGGAACAT 
TCATCCC^CCGCTTCAACCATTTGTACCACAACCGCAAACAAATCCAAACCATCATCATC 
AACATTTTCAGCCACCAACTTCTTCGTCGCTCTCTCTATGGATGGGACAAGATATCGCGC
CGCCTCAACCGCAACCGGACTACGATTGGGTTTTTGGAAACGCTAAGGCAGCGTCTGCTT 
GCATTGATAATAATAATACTCACGATGAGCAGATTACGC^ 

CC^CTACCACTACTCTCTCTGCCCCTTCTTTATTC^G<^GCGACCAACCACAAAACGC^ 
ACGCAAATTCAAACGTGAATATGTCCGCGACAGCTTTACTACAGAAAGCTGCTGAAATTG 
GCGCTACTTCTACAACAACCGCAGCGACCAATGACCGATCAACGT 

CGCTTAAATCCACGGATCAAACCACCAGTTATGACAGTGGCGAAAAGTTTTTTGCTTTGT 

TCGGGTCTAACAAGAAGATTGGGTTAATGAGTCGTAGTCATGATCATCAAGAGATCGAGA 

ACGCTAGAAATGACGTTACGGTTGCGTCTGCCTTGGATGAATTACAGAATTACCCTTGGA 

AACGTAGAAGAGTTGATGGTGGAGGTGAAGTGGGTGGAGGAGGGCAAACTCGGGATTTCC 

TCGGGGTTGGTGTACAAACGTTGTGCCATCCATCGTCTATCAATGGATGGATTTGAAAGA 

GTTTAAAAATTTCGGGGTTAATGCATAAATTACGTAAAAGT^ 

TTCCACCATTTTCTAAGATAACATATGTATATC^ 

TTCAATATTCTAAAACTTATGATATATGTATAATGAATGTGTTTATCTTCAAA 
>G903 Amino Acid Sequence (domain in AA coordinates: 68-92) 
MTSEVLQTISSGSGFAQPQSSSTLDHDESLINPPLVKKKRNLPGNPDPEAEVIALSPTTL 
MATNRFLCEVCGKGFQRDQNLQLHRRGHNLPWKLKQRTSKEVRKRVYVCPEKTCVHHHSS
RALGDLTGIKKHFCRKHGEKKWTCEKCAKRYAVQSDWKAHSKTCGTREYRCDCGTIFSRR

DSFITHRAFCDALAEETAKINAVSHLNGLAAAGAPGSVNLNYQYLMGTFIPPLQPFVPQP

QTNPNHHHQHFQPPTSSSLSLWMGQDIAPPQPQPDYDWVFGNAKAASACIDNNNTHDEQI 

TQNANASLTTTTTLSAPSLFSSDQPQNANANSNVNMSATALLQKAAEIGATSTTTAATND 

PSTFLQSFPLKSTDQTTSYDSGEKFFALFGSNNNIGLMSRSHDHQEIENARNDVTVASAL 

DELQNYPWKRRRVDGGGEVGGGGQTRDFLGVGVQTLCHPSSINGWI* 
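
The amino acid headers in this listing give the location of each domain as residue coordinates, for example "domain in AA coordinates: 68-92" for G903. As a hedged sketch (again not part of the original listing), the Python below shows one way to read those coordinates from a header and slice the corresponding residues out of a sequence. The names parse_domain_ranges and extract_domain are hypothetical, and the coordinates are assumed to be 1-based and inclusive; headers marked "TBD" simply yield no ranges.

```python
import re


def parse_domain_ranges(header):
    """Return a list of (start, end) pairs found after the word 'coordinates'.

    Only 'N-M' ranges are recognized; a header with 'TBD' returns [].
    """
    _, _, tail = header.partition("coordinates")
    return [(int(a), int(b)) for a, b in re.findall(r"(\d+)\s*-\s*(\d+)", tail)]


def extract_domain(protein, start, end):
    """Return residues start..end (1-based, inclusive), ignoring whitespace."""
    residues = re.sub(r"\s+", "", protein)
    return residues[start - 1:end]


if __name__ == "__main__":
    header = ">G903 Amino Acid Sequence (domain in AA coordinates: 68-92)"
    # First 120 residues of the G903 entry, as listed above.
    g903_start = (
        "MTSEVLQTISSGSGFAQPQSSSTLDHDESLINPPLVKKKRNLPGNPDPEAEVIALSPTTL"
        "MATNRFLCEVCGKGFQRDQNLQLHRRGHNLPWKLKQRTSKEVRKRVYVCPEKTCVHHHSS"
    )
    for start, end in parse_domain_ranges(header):
        print(start, end, extract_domain(g903_start, start, end))
```

Slicing by the stated coordinates is only as reliable as the transcribed sequence; residues lost or duplicated upstream of a domain shift every later coordinate.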

>G917 (32.. 679) 

TTAGGGTTTTAGAAAGATAGATCGATTGAAGATGAGGAAAGGTAAGAGAGTGATAAAAAA 
GATAGAGGAGAAAATAAAGAGACAAGTGACATTCGCAAAGAGAAAGAAGAGTCTAATCAA 
GAAGGCATATGAACTCTCTGTTCTCTGCGATGTCCACCTTGGTCTCATCATCTTCTCTCA 
CTCCAACAGGCTCTACGATTTCTGCTCCAACTCTACCAGCATGGAGAATCTCATCATGAG 
ATACCAAAAGGAAAAAGAAGGTCAAACCACTGCAGAACACAGTTTCCACTCGGATCAGTG
TTCAGATTGCGTGAAGACGAAGGAATCAATGATGAGAGAGATAGAGAATCTTAAGCTGAA 
TCTTCAATTGTACGACGGACATGGCTTGAATCTCTTGACCTACGACGAGCTCCTTTCTTT 
TGAGCTCCATCTCGAATCTTCTCTACAACATGCTCGAGCTCGCAAGTCTGAGTTCATGCA 
TCAGCAGCAGCAGCAACAAACAGATCAAAAGCTTAAGGGAAAAGAAAAGGGTCAAGGAAG 
CTCTTGGGAGCAGCTGATGTGGCAAGCAGAGAGACAGATGATGACGTGTCAAAGACAAAA 
AGATCCTGCGCCGGCGAATGAAGGAGGAGTTCCTTTTTTACGGTGGGGAACAACCCACCG 
ACGTTCTTCACCTCCTTAAGCTACCACAACCAGGCCCAAATACAGGCCCATAACTTCTCT 
CTATCTATAAAAAACAACTGATAGTAAAAAGTATTGACCCGGTTTGGTTCGGTTATGTTG 
ATACCAGACTATTAATTAACTTCGGTTAGACGTATTTACGACTTGATGCTATCTAGACCT 
TTTTGCCCTTCAAAAAAA 

>G917 Amino Acid Sequence (conserved domain in AA coordinates: 2-57)

MRKGKRVIKKIEEKIKRQVTFAKRKKSLIKKAYELSVLCDVHLGLIIFSHSNRLYDFCSN

STSMENLIMRYQKEKEGQTTAEHSFHSDQCSDCVKTKESMMREIENLKLNLQLYDGHGLN

LLTYDELLSFELHLESSLQHARARKSEFMHQQQQQQTDQKLKGKEKGQGSSWEQLMWQAE 

RQMMTCQRQKDPAPANEGGVPFLRWGTTHRRSSPP* 

>G921 (116.. 1024) 

CCAAGATCGACTCTTACTTCGAATCTCTCTCAACTTTCTTCCTCAGCTTACGGGAACTTC 
CACACATATACATCCACAAGAACCCATATCGAAGATTCATCCTACATATATTTACATGGA 
TCAGTACTCATCCTCTTTGGTCGATACTTCATTAGATCTCACTATTGGCGTTACTCGTAT 
GCGAGTTGAAGAAGATCCACCGACAAGTGCTTTGGTGGAAGAATTAAACCGAGTTAGTGC 
TGAGAACAAGAAGCTCTCGGAGATGCTAACTTTGATGTGTGACAACTACAACGTCTTGAG 
GAAGCAACTTATGGAATATGTTAACAAGAGCAACATAACCGAGAGGGATCAAATCAGCCC 
TCCCAAGAAACGCAAATCCCCGGCGAGAGAGGACGCATTCAGCTGCGCGGTTATTGGCGG 
AGTGTCGGAGAGTAGCTCAACGGATCAAGATGAGTATTTGTGTAAGAAGCAGAGAGAAGA 
GACTGTCGTGAAGGAGAAAGTCTCAAGGGTCTATTACAAGACCGAAGCTTCTGACACTAC 
CCTCGTTGTGAAAGATGGGTATCAATGGAGGAAATATGGACAGAAAGTGACTAGAGACAA 
TCCATCTCCAAGAGCTTACTTCAAATGTGCTTGTGCTCCAAGCTGTTCTGTCAAAAAGAA 
GGTTCAGAGAAGTGTGGAGGATCAGTCCGTGTTAGTTGCAACTTATGAGGGTGAACACAA 
CCATCCAATGCCATCGCAGATCGATTCAAACAATGGCTTAAACCGCCACATCTCTCATGG 
TGGTTCAGCTTCAACACCCGTTGCAGCAAACAGAAGAAGTAGCTTGACTGTGCCGGTGAC 
TACCGTAGATATGATTGAATCGAAGAAAGTGACGAGCCCAACGTCAAGAATCGATTTTCC 
CCAAGTTCAGAAACTTTTGGTGGAGCAAATGGCTTCTTCCTTAACCAAAGATCCTAACTT 
TACAGCAGCTTTAGC&GCAGCTGTTACCGGAAAATTGT^ 

ATAGTTTAGCTTCAAATTCCGTTAGAGTTTTTAGATTTGAATTTGTCATG^ 

AGAGAGTAGATTATAATCCNTTGTGATACTGAAAAAAAAAAAAAAAAAAA 

>G921 Amino Acid Sequence (domain in AA coordinates: 146-203) 

MDQYSSSLVDTSLDLTIGVTRMRVEEDPPTSALVEELNRVSAENKKLSEMLTLMCDNYNV

LRKQLMEYVNKSNITERDQISPPKKRKSPAREDAFSCAVIGGVSESSSTDQDEYLCKKQR

EETVVKEKVSRVYYKTEASDTTLVVKDGYQWRKYGQKVTRDNPSPRAYFKCACAPSCSVK

KKVQRSVEDQSVLVATYEGEHNHPMPSQIDSNNGLNRHISHGGSASTPVAANRRSSLTVP

VTTTOMIESKKVTSPTSRIDFPQVQKLLra^ 

EK* 

>G922 (1..1449)

ATGGTGGCTATGTTTCAAGAAGATAATGGAAGATC 

GTCTTCTCAACTATGTCACTCAACAGACCGACTCTCCTCGCTTCTTCATCTCCG
TGTCTCAAAGATCTCAAACCAGAGGAGCGTGGTCTCTACTTAATCCACCTCTTGCTAACT
TGTGCCAACCACGTGGCTTCAGGTAGCCTCCAAAACGCTAACGCAGCGCTCGAGCAGCTC 
TCTCACCTCGCTTCTCCTGACGGCGACACGATGCAGCGAATCGCTGCTTACTTCACCGAA 
GCGCTTGCTAACAGAATCCTTAAGTCCTGGCCTGGTCTTTACAAGGCTCTTAACGCAACT 
CAGACAAGAACTAACAATGTCTCTGAGGAGATTCATGTTAGAAGACTCTTCTTTGAGATG 
TTCCCGATACTCAAAGTCTCTTACTTGCTCACTAATCGAGCTATACTCGAGGCTATGGAA 
GGAGAGAAGATGGTTCATGTGATTGATCTCGATGCTTCTGAGCCAGCTCAATGGCTTGCT 
TTGCTTCAAGCTTTTAACTCTAGGCCTGAAGGTCCACCTCATTTGAGAATCACTGGTGTT 
CATCACCAGAAGGAAGTGCTTGAACAAATGGCTCATAGACTCATTGAGGAAGCAGAGAAA
CTCGATATCCCGTTTCAGTTTAATCCCGTTGTGAGTAGGTTAGACTGTTTAAATGTAGAA 
CAGTTGCGGGTTAAAACAGGAGAGGCCTTAGCCGTTAGCTCGGTTCTTCAATTGCATACC 
TTCTTGGCCTCTGATGATGATCTCATGAGAAAGAACTGCGCTTTACGGTTTCAGAACAAC 
CCTAGTGGAGTTGACTTGCAGAGAGTTCTAATGATGAGCCATGGCTCTGCAGCTGAGGCA 
CGTGAGAATGATATGAGTAACAACAATGGGTATAGCCCTAGCGGTGACTCGGCCTCATCT 
TTGCCTTTACCAAGTTCAGGAAGGACTGATAGCTTCCTCAATGCTATTTGGGGTTTGTCT 
CCAAAGGTCATGGTGGTCACTGAGCAAGACTCAGACCACAACGGCTCCACACTAATGGAG 
AGGCTATTAGAATCACTTTACACCTACGCAGCATTGTTTGATTGCTTGGAAACAAAAGTT 
CCAAGAACGTCTCAAGATAGGATCAAAGTGGAGAAGATGCTCTTCGGGGAGGAGATCAAG 
AACATCATATCCTGCGAGGGATTTGAGAGAAGAGAAAGACACGAGAAGCTTGAGAAATGG 
AGCCAGAGGATCGATTTGGCTGGTTTTGGGAATGTTCCTCTTAGCTATTATGCGATGTTG 
CAGGCTAGGAGATTGCTTCAAGGGTGCGGTTTTGATGGGTATAGAATCAAGGAAGAGAGC 
GGGTGCGCAGTAATTTGCTGGCAAGATCGACCTCTATACTCGGTATCAGCTTGGAGATGC 

AGGAAGTGA 

>G922 Amino Acid Sequence (conserved domain in AA coordinates : 225-242) 

MVAMFQEDNGTSSVASSPLQVFSTMSLNRPTLLASSSPFHCLKDLKPEERGLYLIHLLLT

CANHVASGSLQNANAALEQLSHLASPDGDTMQRIAAYFTEALANRILKSWPGLYKALNAT

QTRTNNVSEEIHVRRLFFEMFPILKVSYLLTNRAILEAMEGEKMVHVIDLDASEPAQWLA

LLQAFNSRPEGPPHLRITGVHHQKEVLEQMAHRLIEEAEKLDIPFQFNPVVSRLDCLNVE

QLRVKTGEALAVSSVLQLHTFLASDDDLMRKNCALRFQNNPSGVDLQRVLMMSHGSAAEA

RENDMSNNNGYSPSGDSASSLPLPSSGRTDSFLNAIWGLSPKVMVVTEQDSDHNGSTLME 
RLLESLYTYAALFDCLETKVPRTSQDRIKVEKMLFGEEIKNIISCEGFERRERHEKLEKW 
SQRIDLAGFGNVPLSYYAMLQARRLLQGCGFDGYRIKEESGCAVICWQDRPLYSVSAWRC 

RK* 

>G932 (206.. 1213) 

CCACGCGTCCGACCACTTGTACCTCTTTGTCTTAAGTACTCTTTAACCCTACAATTTCCT 
AAGCTCTCAAGCCACAAAAAACCACAAACCGTTCTTCACCAATATATATATCTGATCATC 
ATCAAAGTCCTTCTCTCTGCTCATACCACAAACCGTTCCATTCTTCCCCTAATCACAAAG 
TGATATTTACATAGAGAAGATAGAGATGGGAAGACCACCATGCTGTGACAAGATTGGAGT 
GAAGAAAGGACCATGGACACCAGAGGAAGATATCATCTTGGTTTCTTACATCCAAGAACA 
TGGTCCTGGAAACTGGAGATCTGTGCCTACTCACACAGGTTTGAGGAGATGTAGCAAAAG 
CTGTAGATTGAGGTGGACTAATTATCTTCGACCTGGGATCAAGCGTGGAAATTTCACCGA 
GCATGAAGAGAAGATGATTCTCCATCTTCAAGCTCTTTTGGGAAACAGGTGGGCAGCTAT 
AGCATCATATCTTCCAGAAAGGACAGACAATGATATAAAGAACTATTGGAACACTCATTT
GAAGAAAAAGCTC^GAAGATGAATGATTCTTGTGATAGTACTATCAACAATGGCCTTGA 
TAATAAAGACTTCTCCATATCAAACAAAAACACT 

TAAAGGTCAATGGGAGAGAAGGCTTCAGACAGATATCAACATGGCTAAACAAGCTCTTTG 
TGATGCCTTGTCTATTGACAAACCACAAAACCCAACTAATTO 

TTATGGTCCATCATCTTCTTCGTCCTCTACCACCACCACCACCACCACCACCACCACGAG
AAACACTAATCCATACCCATCTGGGGTCTATGCTTCAAGTGCTGAGAACATTGCTCGTTT 
GCTrrCAGAATTTTATGAAAGACACACCAAAGACCTCGGTGCCCTTGCCGGTTGCAGCCAC 
CGAGATGGCTATCACCACGGCAGCTTCGAGCCCTAGCACAACCGAAGGAGACGGAGAAGG 
GATTGACCATTCTTTGTTCAGCTTCAACTCCATAGATGAAGCTGAAGAGAAGCCTAAACT 
AATAGACCATGACATTAATGGTCTAATTACACAAGGCTCTCTTTCTTTGTTCGAGAAATG 
GCTCTTTGATGAGCAAAGCCACGATATGATCATCAATAACATGTCACTAGAGGGTCAGGA 
AGTGTTGTTCTAGAAAGCATTAAAGTTTGACGATTTGCTTGAGGAACCACGAGGCTTAGT 
TATAAACAATTTGTATAATTAAGTACTCTTTAGTTTC 

TATTGCAGTAATTAGGGATTTTAGTCTTTAGTAGTAACTCTTAAGT^TAACACATTTTT
CTCTATCTTTTTAGTAGTAACTCTTTATTTTTTCCTTAAATCTTTGTCGACGTGGAGATG
ATATCTTCTATGTAGTAGAAACTCAAAAGTGTACATCATCTTTATTAATGTAACGTCTTT 
TTAAAAAAAAAAAAAAAAA 

>G932 Amino Acid Sequence (domain in AA coordinates: 12-118) 
MGRPPCCDKIGVKKGPWTPEEDIILVSYIQEHGPGNWRSVPTHTGLRRCSKSCRLRWTNY 
LRPGIICRGNFTEHEEKMILHLQALLGNRWAAIASYL^ 

DSCDSTINNGLDNKDFSISNKNTTSHQSSNSSKGQWERRLQTDINMAKQALCDALSIDKP 

PKTSVPLPVAATEMAITTAASSPSTTEGDGEGIDHSLFSFNSIDEAEEKPKLIDHDINGL 

ITQGSLSLFEKWLFDEQSHDMIINNMSLEGQEVLF* 

>G599 (152.. 1579) 

TCGACAGAACAGCTTCGTTGTCACTTGTCATTCTATAAATCGCATCCCCATTGACAACCT 
TTCACTTCCATCAAAACTCTCTCTCTATATCTCTCTCTCTATATATCTCTCTCTATATCT 
CTCTCTCTCTTCACTCTCTCTTTCTTTCAAAATGGAAAAACTCATGGTTCCGACATGGAG 
ACCCGACCCGGTTTACCGTCCACCGGAAACACCACTCGAACCGATGGAGTTTTTAGCTCG 
TTCATGGAGCGTCTCTGCTCTCGAAGTCTCCAAGGCTCTAACACCACCCAACCCTCAGAT 
TCTCCTCTCCAAAACCGAAGAAGAAGAAGAAGAAGAACCCATCTCCTCTGTCGTAGACGG 
CGACGGCGACACGGAAGACACCGGACTTGTCACCGGAAACCCATTCTCCTTCGCTTGTTC 
AGAAACTTCTCAAATGGTCATGGATCGTATCTTGTCTCACTCTCAAGAAGTATCACCAAG 
AACATCTGGTCGGCTATCTCACAGTAGTGGTCCACTTAATGGTTCTTTGACCGACAGTCC 
TCCTGTGTCTCCTCCCGAATCCGACGACATTAAGCAATTTTGCAGAGCGAACAAAAATTC 
ATTGAACAGTGTAAATTCTCAGTTCCGTTCAACGGCGGCAACTCCGGGACCTATAACCGC 
TACAGCTACACAGTCCAAGACGGTGGGACGGTGGCTTAAGGACCGGAGAGAGAAAAAGAA 
AGAGGAGACTCGGGCTCATAACGCTCAGATTCACGCTGCTGTCTCTGTCGCCGGCGTTGC 
TGCAGCTGTTGCTGCTATTGCAGCAGCCACCGCTGCGTCTTCTAGCTGTGGTAAGGATGA 
GCAGATGGCTAAAACTGACATGGCCGTTGCTTCTGCTGCGACCCTTGTGGCTGCTCAGTG 
TGTGGAAGCTGCTGAAGTTATGGGAGCTGAGAGAGAGTATTTGGCTTCTGTTGTTAGCTC 
CGCCGTCAATGTTCGTTCTGCCGGAGATATTATGACTCTCACCGCCGGAGCAGCTACAGC 
TTTAAGAGGAGTGCAAACATTGAAGGCAAGGGCAATGAAGGAAGTGTGGAACATAGCATC 
AGTGATACCAATGGATAAAGGACTCACTTCTACAGGAGGAAGCAGCAATAATGTTAATGG 
TAGCAATGGAAGCTCAAGCAGTAGTCACAGTGGTGAACTTGTACAACAGGAGAATTTC^ 
GGGAACTTGTAGTAGAGAATGGCTCGCTAGAGGTTGTGAACTCCTCAAACGCACTCGCAA 
AGGTGATCTCCACTGGAAGATAGTATCTGTTTACATCAACAAAATGAATCAGGTTATGTT 
GAAGATGAAGAGCAGGCATGTTGGAGGAACCTTCACCAAGAAGAAAAAGAACATTGTGCT
TGATGTGATCAAGAATGTCCCGGCCTGGCCTGGACGACATTTGCTAGAGGGAGGAGATGA
TCTAAGATACTTCGGTTTGAAGACGGTTATGCGAGGTGATGTTGAATTCGAGGTCAAGAG 
CCAAAGGGAATATGAAATGTGGACACAAGGTGTCTCAAGGCTTCTTGTTCTTGCTGCTGA 
GAGGAAGTTTAGGATGTGAATAAACGTTCAATGGCTGCTTGG1TTAAGTGTGAGTTTTTT 
TTTAACTTATGTGGTCAAATTTCATTAGTAGGGGTTCT^ 

TTGGGTATAGGATAAAATGGACCTACCAGTCAAGGTGAGGAAGCATTTGGGTAAACAAAA 

CTTAGTGGGGGTGATCTGTAATATCTATGTTCTTAGTTTTTTTTTGGTTGTTGGTGGTCT 

TTTTGTATAAAAAAACAAAGTTGAAGTAATAGATATATAGTATGTTTTAATTTTAAA 

>G599 Amino Acid Sequence (domain in AA coordinates: 187-219, 264-300) 

MEKLMVPTWRPDPVYRPPETPLEPMEFLARSWSVSALEVSKALTPPNPQILLSKTEEEEE

EEPISSVVDGDGDTEDTGLVTGNPFSFACSETSQMVMDRILSHSQEVSPRTSGRLSHSSG

PLNGSLTDSPPVSPPESDDIKQFCRANKNSLNSVNSQFRSTAATPGPITATATQSKTVGR

WLKDRREKKKEETRAHNAQIHAAVSVAGVAAAVAAIAAATAASSSCGKDEQMAKTDMAVA

SAATLVAAQCVEAAEVMGAEREYLASVVSSAVNVRSAGDIMTLTAGAATALRGVQTLKAR
AMKEVWNIASVIPMDKGLTSTGGSSNNVNGSNGSSSSSHSGELVQQENFLGTCSREWLAR
GCELLKRTRKGDLHWKIVSVYINKMNQVMLKMKSRHVGGTFTKKKKNIVLDVIKNVPAWP
GRHLLEGGDDLRYFGLKTVMRGDVEFEVKSQREYEMWTQGVSRLLVLAAERKFRM*
>G804 (114.. 1139) 

ATACTCCAAGAATTTATAGGTTATAAGTAAAAATTCAGTACAAGTTTGTTTGTTT 

TTCCATTTTCTTGTGTGTTTTTTTCCCCATAATTTATAAATTOT 

CCCACAACAACAACC^GAGCAACAACAACACCACTGGT^ 

TGGGACCAATCTCCGGTTCAGTCTCATTAACCACCACTGCTCCAAACTCCACTACCACCA
CCGTCACCGCCGGTAAAACACCCGCAAAACGACCGTCCAAGGACCGTCACATCAAAGTAG
ACGGACGTGGCCGGAGGATACGTATGCCGGCTATCTGCGCAGCACGTGTCTTCCAACTAA
CACGTGAGTTACAACACAAATCGGACGGCGAGACTATAGAGTGGCTGCTCCAACAAGCGG 
AGCCAGCTATCATCGCAGCCACCGGAACTGGAACCATACCGGCGAATATCTCTACTTTGA 
ACATCTCTCTTCGAAGCAGTGGCTCTACTCTTTCAGCTCCACTGTCTAAATCTTTCCACA 
TGGGAAGAGCGGCTCAAAACGCTGCCGTTTTTGGGTTCCAGCAACAGCTTTATCATCCTC 
ATCATATCACGACAGATTCTTCTTCTTCTTCTCTTCCCAAAACATTCCGTGAAGAAGATC 
TTTTTAAAGATCCTAATTTTCTAGATCAAGAACCCGGTTCAAGATCACCTAAACCGGGAT 
CCGAAGCTCCTGATCAAGATCCGGGTTCGACCCGGTCAAGAACACAAAATATGATACCGC 
CGATGTGGGCACTAGCGCCAACGCCAGCCTCCACAAACGGAGGTAGTGCTTTTTGGATGT 
TACCAGTCGGAGGAGGAGGAGGTCCGGCTAACGTTCAGGATCCATCACAGCACATGTGGG 
CGTTTAATCCGGGTCATTACCCGGGTCGAATCGGGTCGGTTCAGCTAGGGTCTATGTTAG 
TGGGAGGTCAACAGTTAGGGTTAGGTGTTGCAGAAAATAACAATTTGGGGCTATTTTCCG
GCGGAGGAGGAGACGGTGGTCGGGTTGGTCTCGGAATGAGTCTTGAGCAAAAGCCTCAAC 
ATCAAGTGAGTGATCATGCTACTAGAGACCAAAATCCTACTATAGATGGTTCTCCTTGAA 
AGACTTCATGATTTCTTTGGTTTTTAAAAAGTGTGAATGTGTGATTTATTGCAACTTTTG 
TTGAGGACTCCAATGTTAATATGGGTTTTAGGGTTGGCTTTTCGGGATTGCCAAATTGTT 
ATT 

>G804 Amino Acid Sequence (domain in AA coordinates: 54-117) 

MESHNNNQSNl^TTGSAHLVPSMGPISGSVSLTTTAPNSTTTTVTAAKTP 

KVDGRGRRIRMPAICAARVFQLTRELQHKSDGETIEWLLQQAEPAIIAATGTGTIPANIS

TLNISLRSSGSTLSAPLSKSFHMGRAAQNAAVFGFQQQLYHPHHITTDSSSSSLPKTFRE 

EDLFKDPNFLDQEPGSRSPKPGSEAPDQDPGSTRSRTQNMIPPMWALAPTPASTNGGSAF 

WMLPVGGGGGPANVQDPSQHMWAFNPGHYPGRIGSVQLGSMLVGGQQLGLGVAENNNLGL

FSGGGGDGGRVGLGMSLEQKPQHQVSDHATRDQNPTIDGSP* 

>G1062 (297.. 1781) 

CAAAAAAAAAGTTTCAATTTTTGAAAGCTCTGAGAAATGAAATCTATCATTCTCTCTCTC 
TATCTCTATCTTCCTTTTCAGATTTCGCTTCTTCAATTCATGAAATCCTCGTGATTCTAC 

GTTTTCAAACTTTTGCAGAATTGTCTTCAAGCTTCCAAATTTCAGTTAAAGGTCTCAACT
TTGCAGAATTTTCCTCTAAAGGTTCAGACTTTGGGGTAAAGGTGTCAACTTTGGCG
GTCTTGACGGAAACAATGGTGGAGGGGTTTGGTTAAACGGTGGTGGTGGAGAAAGGGAAG
AGAACGAGGAAGGTTCATGGGGAAGGAATCAAGAAGATGGTTCTTCTCAGTTTAAGCCTA 
TGCTTGAAGGTGATTGGTTTAGTAGTAACCAACCACATCCACAAGATCTTCAGATGTTAC 
AGAATCAGCCAGATTTCAGATACTTTGGTGGTTTTCCTTTTAACCCTAATGATAATCTTC 
TTCTTCAACACTCTATTGATTCTTCTTCTTCTTGTTCTCCTTCTCAAGCTTTTAGTCTTG 
ACCCTTCTCAGCAAAATCAGTTCTTGTCAACTAACAAGAACAAGGGTTGTCTTCTCAATG
TTCCTTCTTCTGCAAACCCTTTTGATAATGCTTTTGAGTTTGGCTCTGAATCTGGTTTTC 
TTAACCAAATCCATGCTCCTATTTCGATGGGGTTTGGTTCTTTGACACAATTGGGGAACA 
GGGATTTGAGTTCTGTTCCTGATTTCTTGTCTGCTCGGTCACTTCTTGCGCCGGAAAGCA 
AC^^CT^CAACACAATGTTGTGTGGTGGTTTCACAGCTCCGTTGGAGTTGGAAGGTTTTG 
GTAGTCCTGCTAATGGTGGTTTTGTTGGGAACAGAGCXSAAAGTTCTGAAGCCrrTAGAGG 
TGTTAGCATCGTCTGGTGCACAGCCTACTCTGTTCCAGAAACGTGCAGCTATGCGTCAGA 
GCTCTGGAAGC^AAATGGGAAATTCGGAGAGTTCGGGAATGAGG 

GAGATATGGATGAGACTGGGATTGAGGTTTCTGGGTTGAACTATGAGTCTGATGAGATAA 

ATGAGAGCGGTAAAGCGGCTGAGAGTGTTCAGATTGGAGGAGGAGGAAAGGGTAAGAAGA 

AAGGTATGCCTGCTAAGAATCTGATGGCTGAGAGGAGAAGGAGGAAGAAGCTTAATGATA 

GGCTTTATATGCTTAGATCAGTTGTCCCCAAGATCAGCAAAATGGATAGAGCATCAATAC 

TTGGAGATGCAATT«ATTATCTGAAGGAACTTCTAG?U\AGGATCAATGATCTT 

AAC1TGAGTCAACTCCTCCTGGATCTTTGCCTCGAACTTCATCAAGCTTCCATCCGOT 

CACCTACACCGCAAAGTCTTTCTTGTCGTGTCAAGGAAGAGTTGTGTCCCTCTTCTTTAC 

CAAGTCCTAAAGGCCAGCAAGCTAGAGTTGAGGTTAGATTAAGGGAAGGAAGAGCAGTGA 

ACATTCATATGTTCTGTGGTCGTAGACCGGGTCTGTTGCTCGCTACCATGAAAGCTTTGG 

ATAATCTTGGATTGGATGTTCAGCAAGCTGTGATCAGCTGTTTTAATGGGTTTGCCTTGG 

ATGTTTTCCGCGCTGAGCAATGCCAAGAAGGACAAGAGATACTGCCTGATCAAATCAAAG 

CAGTGCTTTTCGATACAGCAGGGTATGCTGGTATGATCTGATCTGATCCTGACTTCGAGT 

CCATTAAGCATCTGTTGAAGCAGAGCTAGAAGAACTAAGTCCCTTTAAATCTGCAATTTT 

CTTCTCAACTTTTTTTCTTATGTCATAACTTCAATCTAAGCATGTAATGCAATTGCAAAT
GAGAGTTGTTTTTAAATTAAGCTTTTGAGAACTTGAGGTTGTTGTTGTTGGATACATAAC

TTCAACCTTTTATTAGCAATGTTAACTTCCATTTATGTTTCATCTT 

>G1062 Amino Acid Sequence (domain in AA coordinates: 308-359) 

MGLDGNNGGGVWLNGGGGEREENEEGSWGRNQEDGSSQFKPMLEGDWFSSNQPHPQDLQM 
LQNQPDFRYFGGFPFNPNDNLLLQHSIDSSSSCSPSQAFSLDPSQQNQFLSTNNNKGCLL 
NVPSSANPFDNAFEFGSESGFLNQIHAPISMGFGSLTQLGNRDLSSVPDFLSARSLLAPE 
S1TOINNTMLCGGFTAPLELEGFGSPANGGFVGNRAKVLKPLEVLASSGAQPTLFQKRAAMR 
QSSGSKMGNSESSGMRRFSDDGDMDETGIEVSGLNYESDEINESGKAAESVQIGGGGKGK 
KKGMPAKNLMAERRRRKKLNDRLYMLRSVVPKISKMDRASILGDAIDYLKELLQRINDLH
NELESTPPGSLPPTSSSFHPLTPTPQTLSCRVKEELCPSSLPSPKGQQARVEVRLREGRA 

SvlfdtagyagmS 

>G1322 (213.. 833) 

AAAGTTATTGATAGTTTCTGTTACTTATTAATTTTTAAGGTTATGTGTATTATTACCAAT 

TGGAGGACTATATAGTCGCAAGTCTCAACCCTATAAAAGAAAACATTCGTCGATCATCTT 

CCCGCCTCGAGTATCTCTCTCTCTCTCTCTCTTCTCTGTTTTCTTTATTGATTGCATAGA 

CAAAAATACACACATACACAACAGAAAGAAAGATGGAGACGACGATGAAGAAGAAAGGGA 

GAGTGAAAGCGACAATAACGTCACAGAAAGAAGAAGAAGGAACAGTGAGAAAAGGACCTT 

GGACTATGGAAGAAGATTTCATCCTCTTTAATTACATCCTTAATCATGGTGAAGGTCTTT 

GGAACTCTGTCGCCAAAGCCTCTGGTCTAAAACGTACTGGAAAAAGTTGTCGGCTCCGGT 

GGCTGAACTATCTCCGACCAGATGTGCGGCGAGGGAACATAACCGAAGAAGAACAGCTTT 

TGATCATTCAGCTTCATGCTAAGCTTGGAAACAGGTGGTCGAAGATTGCGAAGCATCTTC 

CGGGAAGAACGGACAACGAGATAAAGAACTTCTGGAGGACAAAGATTCAGAGACACATGA 

AAGTGTCATCGGAAAATATGATGAATCATCAACATCATTGTTCGGGAAACTCACAGAGCT 

CGGGGATGACGACGCAAGGCAGCTCCGGCAAAGCCATAGACACGGCTGAGAGCTTCTCTC 

AGGCGAAGACGACGACGTTTAATGTGGTGGAACAACAGTCAAACGAGAATTACTGGAACG 

TTGAAGATCTGTGGCCCGTCCACTTGCTTAATGGTGACCACCATGTGATTTAAGATATAT 

ATATAGACCTCCTATACATTTATATGCCCCAGCTGGGTTTTTTTGTATGGTACGTTATTT 

GGTTTTTCTATTGCTGAAATGTCGTTGCATTTAATTTACATACGAAAAGTGCATTAAATC 

ATTAAATCTTCAATACATATGGAGGTGGTGTTTGAGTAAAAAAAAAAAAAA 

>G1322 Amino Acid Sequence (domain in AA coordinates: 26-130)

METTMKKKGRVKATITSQKEEEGTVRKGPWTMEEDFILFNYILNHGEGLWNSVAKASGLK

RTGKSCRLRWLNYLRPDVRRGNITEEEQLLIIQLHAKLGNRWSKIAKHLPGRTDNEIKNF

WRTKIQRHMKVSSENMMNHQHHCSGNSQSSGMTTQGSSGKAIDTAESFSQAKTTTFNVVE
QQSNENYWNVEDLWPVHLLNGDHHVI*
>G1331 (1..786) 

GAATACGTCCGTGTTCACGGTGAACMTCGTTGGAACTCTGTCTCTAAACTCGCAGGATTG 
^™of C ^ GCTGCAGACTMGATGGGTG ^ TTACCTT AGACCAGACCTCAAG 
A ^ GATCACTCCACATG ^ G ^GTATAATACTT(^GCTACACGCTAAGTGGGGA 

^^f CTCCTAAAAAGGCAACAOTCAAGGA ^^ 

GAACAGCAGTTGTTTCAGTTCGACCAACTCGGTATGAAAAAGATCATCTCTTTACTCGAA 
GAAAACAATAGCAGTAGCAGTAGCGATGGCGGTGGTGATGTGTTCTATTATCCTGATCAA 
™ CACATOCATCAAAACCCmGGCTATAACTC TAATTCATTAGAGGAGCAG 
G °^ A J^ CTCCTG ™ 

™ A ^ ^ AACATGGATCWGTAAATCGACATGGT GGGAACTTGGGTGTTGTC 

tSS ACTGCTCCTTGTCGCCCAAGG ^ 

>G1331 Amino Acid Sequence (conserved domain in AA coordinates: 8-109)

^E^KGPOTAEEDRLLIEYWVHGEGRWNSVSKIAGLKRNGKSCRLRWNYLRPDLK 

RGQI^HEESIILEIJ^GNRWSTIARSLPGRTDNEIKNYWTHFKK^ 




AATAACGPRKPYFHNLVIPFC 
>G1521 (1..891)
ATGCCTCCATTACCGTCCTCCACGGCGCCTTCGTCTTCGAGACATCTTCGATCGCCGGAA

AGTATCGCGAAATTTGCAGGGAGAGCAATATTTCCTGCTTTACAGGGGAAATCGTGTCCG

ATATGCCTCGAAAATCTAACCGAGCGAAGATCCGCCGCCGTGATCACGGTGTGCAAGCAC 

GGATACTGCCTTGCTTGTATTCGGAAGTGGAGCAGCTTCAAGAGGAATTGTCCTCTTTGT 

AACACTCGTTTTGATTCCTGGTTTATCGTTAGTGATTTTGCTTCTAGAAAATACCATAAG 

GAGCAATTACCAATTCTTCGTGATCGTGAGACTTTAACTTATCATCGGAATAATCCTTCC 

GATCGCCGGAGGATAATTCAAAGGTCGAGGGATGTTTTGGAAAACTCTAGCTCAAGATCA 

AGGCCATTGCCATGGCGGAGATCATTTGGACGACCAGGTTCAGTTCCTGATTCTGTTATC 

TTCCAGCGAAAGCTTCAGTGGCGAGCTAGCATATACACTAAGCAATTACGAGCTGTTCGA 

TTACATTCAAGGCGCTTGGAACTAAGTTTGGCGGTGAATGATTACACCAAAGCAAAGATA

ACTGAAAGAATTGAGCCATGGATTAGAAGAGAGCTTCAGGCAGTCCTTGGAGATCCTGAT 

CCCTCAGTTATTGTTCATTTTGCGTCAGCTCTTTTCATCAAAAGGCTTGAGAGAGAGAAT

AATCGACT^ACCGGGCAGACCGGGATGTTGGTGGAAGATGAAGTCTCCTCTCTTCGAAAA 

TTCTTGTCTGATAAGGTGGATATATTTTGGCATGAACTAAGATGTTTTGCGGAGAGTATA 

CTCACGATGGAGACTTATGATGCAGTGGTTGAATACAATGAGGTGGAGTAA 

>G1521 Amino Acid Sequence (domain in AA coordinates: 39-80) 

MPPLPSSTAPSSSRHLRSPESIAKFAGRAIFPALQGKSCPICLENLTERRSAAVITVCKH 

GYCLACIRKWSSFKRNCPLCNTRFDSWFIVSDFASRKYH^ 

DRRRIIQRSRDVLENSSSRSRPLPWRRSFGRPGSVPDSVIFQRKLQWRASIYTKQLRAVR 
LHSRRLELSLAVNDYTKAKITERIEPWIRRELQAVLGDPDPSVIVHFASALFIKRLEREN
NRQTGQTGMLVEDEVSSLRKFLSDKVDIFWHELRCFAESILTMETYDAVVEYNEVE* 
>G183 (1..1458) 

ATGAGTGATTTTGATGAAAACTTCATCGAAATGACGTCGTATTGGGCTCCACCATCCAGT 

CCTAGCCCAAGAACGATATTGGCAATGCTGGAGCAAACCGACAATGGTCTGAATCCAATC 

AGTGAGATCTTCCCTCAAGAAAGCTTGCCAAGAGATCATACTGATCAATCTGGACAAAGA 

TCTGGTCTTCGTGAGAGACTGGCTGCAAGAGTAGGATTCAATCTTCCAACACTCAATACA 

GAAGAAAACATGAGTCCTTTGGATGCATTTTTCAGGAGCTCGAATGTTCCTAATTCTCCT 

GTCGTTGGAATCTCTCCAGGATTCAGTCCATCAGCACTATTGCATACTCCCAATATGGTC 

AGTGATTCTTCCCAGATTATCCCTCCGTCTTCAGCCACCAATTACGGACCTCTAGAGATG 

GTGGAAACTTCCGGTGAAGACAATGCAGCGATGATGATGTTCAACAACGATCTTCCTTAT 

CAGCCGTACAATGTTGATCTGCCTTCTCTAGAAGTCTTTGATGATATTGCAACGGAAGAG 

TCCTTTTATATCCCATCTTATGAACCTCATGTTGACCCAATTGGAACTCCTTTAGTCACA

TCCTTTGAATCTGAACTCGTTGACGATGCCCATACCGACATCATCTCCATTGAGGACAGT 

GAGAGCGAGGATGGAAACAAAGATGATGACGACGAGGACTTCCAATACGAAGACGAAGAC 

GAAGACCAATACGACCAAGATCAAGATGTAGATGAAGATGAAGAGGAAGAAAAAGATGAA 

GACAATGTTGCATTAGATGATCCTCAACCTCCACCTCCAAAGAGAAGGAGATATGAGGTA 

TCAAACATGATTGGAGCCACAAGAACAAGCAAGACACAAAGGATCATACTTCAGATGGAA 

AGCGACGAAGACAATCCTAACGATGGTTATCGCTGGAGAAAATACGGTCAGAAAGTCGTC 

AAAGGAAATCCTAATCCGAGGAGTTACTTCAAGTGCACAAACATCGAGTGCAG

AAACATGTGGAGAGAGGAGCAGACAATATCAAGTTGGTTGTGACTACATACGATGGGATA 

CAC^CCATCCTTCACCACCTGC^CGTAGAAGCAATTCCAGTTCAAGGAACCGGTCTGC 

GGGGCAACAATACCTCAAAATCAGAATGATCGAACCAGTCGGTTAGGTAGGGCTCCTCCT 

ACTCCTACTCCTCCTACTCCTCCTCCTTCGTCTTACACACCTGAGGAGATGAGGCCTTTC 

TCTTCGTTGGCTACAGAAATTGATCTGACAGAGGTTTATATGACCGGAATCTCTATGCTG 

CCGAATATACCGGTTTACGAGAATTCGGGTTTTATGTACCAGAATGATGAACCGACGATG 

AATGCGATGCCGGATGGTTCAGATGTGTACGATGGGATCATGGAACGCCTGTATTTTAAG 

TTTGGTGTCGACATGTAG 

>G183 Amino Acid Sequence (domain in AA coordinates: TBD)

MSDFDENFIEMTSYWAPPSSPSPRTILAMLEQTDNGLNPISEIFPQESLPRDHTDQSGQR

SGLRERLAARVGFNLPTLNTEENMSPLDAFFRSSNVPNSPVVGISPGFSPSALLHTPNMV

SDSSQIIPPSSATNYGPLEMVETSGEDNAAMMMFNNDLPYQPYNVDLPSLEVFDDIATEE

SFYIPSYEPHVDPIGTPLVTSFESELVDDAHTDIISIEDSESEDGNKDDDDEDFQYEDED 

EDQYDQDQDVDEDEEEEKDEDNVALDDPQPPPPKRRRYEVSNMIGATRTSKTQRIILQME 

SDEDNPITOGYRWRKYGQKVVKGNPNPRSYFKCTNIECRV^ 

HNHPSPPARRSNSSSRNRSAGATIPQNQNDRTSRLGRAPPTPTPPTPPPSSYTPEEMRPF 

SSLATEIDLTEVYOTGISMLPNIPVYENSGFMYQM^ 

FGVDM*
>G2555 (177.. 956)

CTGTTTTTGTATCCGTGTAAATTAATCACACGGTAGTTTTTGATGAAAAGACAACAATCG 
GAGAACAATCTGGTCTGCTGCTAAAATTTAATAAATTGTTTTGTCTAATTGTCTCCACCC 
ATAAAAAAGCGCGAATTCAATTCACCGACTAAAGACATTCTCCGGTGGAGACCCCGATGC 
AATCCACTCATATAAGCGGCGGAAGTAGCGGTGGTGGTGGTGGAGGAGGAGGAGAGGTGA 
GTCGAAGTGGATTATCTCGGATCCGTTCAGCTCCAGCTACTTGGATTGAAACCCTACTCG 
AAGAAGATGAAGAAGAAGGTTTAAAACCTAACCTTTGTTTAACAGAGCTGCTTACTGGTA 
ATAATAACTCTGGAGGAGTGATAACGAGTCGTGACGACTCGTTCGAGTTCCTGAGTTCTG 
TTGAGCAAGGATTGTATAATCATCATCAAGGTGGTGGCTTTCACCGTCAGAATAGTTCTC
CGGCTGATTTTCTTAGTGGGTCTGGTTCTGGGACTGATGGGTATTTCTCTAATTTTGGTA 
TTCCGGCGAATTATGACTATTTGTCGACCAACGTTGATATTTCTCCGACTAAACGGTCTA 
GAGATATGGAAACACAGTTTTCTTCTCAGCTGAAAGAAGAGCAAATGAGTGGTGGGATAT 
CAGGAATGATGGATATGAACATGGACAAGATTTTTGAGGATTCAGTTCCTTGTAGGGTTC 
GTGCTAAACGTGGTTGTGCTACTCATCCTCGTAGCATTGCTGAACGGGTGAGAAGAACGC 
GAATAAGTGATCGGATTAGGAGGCTGCAAGAGCTTGTTCCTAACATGGATAAGCAAACCA 
ACACTGCAGACATGTTGGAAGAAGCTGTGGAGTATGTGAAGGCTCTTCAAAGCCAGATCC 
AGGAATTGACAGAGCAGCAGAAGAGATGCAAATGCAAACCTAAAGAAGAACAATAATGTA 
TCCTTTAGGATTTGATATATCTGTATTTTATTTTTGTACTATCTAAAAATGGTGATGATC 
TGTTCGAAAATTCGAAACATGATCTTATATATTGAACTAGA7\A7y\ATAGATATATATGAA 
TTTTAGCTGTAAAATTTTTGTACAATAAGGAGAAAAAGATTTAGAAGAGTCAAT7UUU\AG 
ATGATGTTTACAAGTCAAAAAAAAAAA 

>G2555 Amino Acid Sequence (domain in AA coordinates: 175-245) 
MQSTHISGGSSGGGGGGGGEVSRSGLSRIRSAPATWIETLLEEDEEEGLKPNLCLTELLT 
GNNNSGGVITSRDDSFEFLSSVEQGLYNHHQGGGFHRQNSSPADFLSGSGSGTDGYFSNF
GIPANYDYLSTNVDISPTKRSRDMETQFSSQLKEEQMSGGISGMMDMNMDKIFEDSVPCR

VRAKRGCATHPRSIAERVRRTRISDRIRRLQELVPNMDKQTNTADMLEEAVEYVKALQSQ

IQELTEQQKRCKCKPKEEQ* 

>G375 (53.. 1171) 

TCGACAAAAACTCTCACTCTCCCTCAAACTAAACAAACATACAGAACACAAAATGGGTCT 
CACTTCTCTTGAAGTTTGCATGGATTCTGATTGGCTCCAGGAATCCGAGTCATCAGGAGG 
AAGCATGTTAGACTCTTCAACGAATTCTCCGTCAGCAGCCGACATACTAGCAGCTTGCAG
CACTAGACCACAAGCCTCGGCCGTGGCTGTAGCCGCTGCAGCTCTGATGGACGGTGGTU^G 
GAGGCTGCGTCCACCTCACGACCATCCTCAAAAGTGTCCTCGTTGCGAGTCAACACATAC 
TAAGTTCTGTTACTACAATAACTACAGCCTCTCTCAGCCTCGTTACTTCTGCAAGACTTG 
TCGCCGTTACTGGACAAAAGGCGGAACTCTAAGGAATATTCCGGTTGGTGGTGGATGCCG 
TAAAAACAAGAAACCATCTTCCTCTAATTCCTCCTCCTCCACTTCTTCCGGCAAAAAACC 
ATCCAACATCGTTACCGCCAATACCTCTGATCTTATGGCTTTAGCACATTCTCATCAAAA 
TTACCAACATTCTCCTCTAGGGTTTTCACAT 

TCCGGAGCATGGTAACGTTGGTTTCTTGGAGAGCAAGTATGGCGGTTTGCTTTCGCAGAG 

CCCTAGACCTATTGATTTCTTGGACAGTAAGTTTGATCTCATGGGAGTGAACAATGACAA 

CCTGGTCATGGTTAATCATGGAAGTAACGGAGATCATCATCATCATCATAATCATCACAT 

GGGTCTGAATCACGGTGTAGGTCTTAACAACAACAACAACAATGGTGGATTTAATG 

TTCTACGGGAGGCAATGGAAATGGTGGTGGTCTCATGGATATATCGACATGCCA^GACT 

TATGCTATCTAATTATGATCATCACCAT^^ 

AACAATAATGGATGTGAAGCCAAATCCGAAGTTGTTATCGCTTGATTGGCAGCAAGATCA 
ATGCTACTCCAATGGTGGTGGTAGCGGAGGCGCAGGAAAATCCGACGGTGGTGGATACGG 
CAATGGTGGTTATATCAACGGTTTAGGTTCGTCGTGGAATGGTTTGATGAATGGCTATGG 
AACGTCCACTAAAACAAACTCCTTGGTTTGATAAGTTAATCAGAACTTCTTTTTTCTTGT
CGTCATCAACTAGTAGTAGTAGTAATAGTAGTTGGAGACTAGAGAAGCACTTCAAATTAT 
TTATGGGTTTGTTTGCTAAGCCAGTTTTAC

>G375 Amino Acid Sequence (domain in AA coordinates: 75-103) 
MGLTSLQVCMDSDWLQESESSGGSMLDSSTNSPSAADILAACSTRPQASAVAVAAAALMD 
GGRIOiRPPHDHPQKCPRCESTHTKFCYyNNYSLSQPRYFCKTCRRYVrrKGGTLRNIPVGG 
GCRKNKKPSSSNSSSSTSSGKKPSNIVTANTSDLMALAHSHQNYQHSPLGFSHFGGmG^ 
YSTPEHGNVGFLESKyGGLLSQSPRPIDFLDSKFDLMGV^ 

HHMGLNHGVGLNNNNNNGGFNGISTGGNGNGGGLMDISTCQRLMLSNYDHHHYNHQEDHQ
RVATIMDVKPNPKLLSLDWQQDQCYSNGGGSGGAGKSDGGGYGNGGYINGLGSSWNGLMN 
GYGTSTKTNSLV*
>G1007 (86..763)

ATTCCTTCTTGCCTAGGAACTAATTGTTGCACACTTCGGTACACAATTTTTTGAGCACTT 
CGACATCAAAACGAGAGAGAAAAGAATGGTGGATTCTCATGGCTCCGACACGGAATGTTC 
CTCCAAGAAGAAAAAGGAGAAAACGAAAGAAAAGGGGGTATATCGTGGGGCTCGCATGAG
GAGCTGGGGGAAATGGGTCTCGGAGATTCGGGAGCCCCGTT^AGAAATCAAGAATCTGGCT 
CGGGACTTTCCCCACGGCGGAGATGGCAGCGCGTGCCCATGATGTTGCGGCATTGAGTAT 
CAAAGGAAGTTCCGCAATCCTTAACTTCCCTGAGCTCGCGGATTTTCTGCCAAGACCAGT 
CTCGCTCAGCCAACAGGATATCCAGGCCGCAGCCGCCGAAGCCGCTCTTATGGATTTCAA 
AACTGTACCATTCCATCTTCAGGATGACTCAACGCCGTTGCAAACTAGGTGTGATACTGA 
GAAGATCGAAAAGTGGTCATCCTCATCGTCCTCAGCCTCATCCTCATCCTCATCTTCGTC 
CTCGTCCTCATCATCTATGCTTTCGGGGGAGCTAGGAGATATTGTGGAGTTGCCGAGTCT 
TGAAAACAATGTAAAATACGATTGTGCGCTGTATGACTCGTTGGAGGGGCTGGTGTCGAT 
GCCCCCATGGTTAGATGCTACCGAAAATGATTTTAGGTATGGAGATGATTCGGTACTGTT 
GGACCCATGTCTCAAAGAAAGCTTTTTGTGGAATTATGAGTAAGGTTTTTTTTTGGAAAG 
AAATGTGGTTTTTTGTTTCCTCCTCTCTTTTATACTTTCGATCTTTTTTTCTAAGCATAT 
ATATCTTCTACATATGTAATACTTTTCCATTAGTAAACAATGATTCGGTTTCGGGTACAA 

AAAAAAAAAAAAAAAAAAAAAAA 

>G1007 Amino Acid Sequence (domain in AA coordinates: TBD) 
MVDSHGSDTECSSKKKKEKTKEKGWRGAR 

AARAHDVAALSIKGSSAILNFPELADFLPRPVSLSQQDIQAAAAEAALMDFKTVPFHLQD 
DSTPLQTRCDTEKIEKWSSSSSSASSSSSSSSSSSSSMLSGELGDIVELPSLENNVKYDC 
ALYDSLEGIiVSMPPWLDATE^FRYGDDSVLLDPCLICESFLWNYE* 
>G1010 (344..1276)

ATTCTTCTTCTAAAAAATCTTGACAACTTTTTGTTTTTGTTTTCTTTCTCTGAATTTTTT 

AAAAGAGAGAGAGCTATGTAGCTATGAAACAGTAAGAGATATAGATATAGAGAGACAGAG 

AAAGATGATGATCAGTGAAGTTAGGCTAAACCCACTTTCTATTTATGTATAATTAGGTCA 

ATCACATCACCAATCTCCTCCTCCAATTCTCCTCCTCTCCTTCCAAATTCTAGGGTTTTG 

CTTGTATCTCACCCCCTTTCTCAATTCCCTAGGGAAACTGTGAATTTCATCAAATTCCAT 

TATTTTTTGGTCACACCCTTAAAGAGATCTGAGAGTTCTAAAGATGATGACAGATTTATC 

TCTCACGAGAGATGAAGATGAAGAAGAAGCAAAGCCCTTAGCAGAAGAAGAAGGAGCGCG 

TGAAGTAGCAGACAGAGAGCACATGTTCGACAAAGTTGTGACTCCAAGTGATGTCGGAAA 

ACTAAACCGACTTGTGATCCCAAAGCAACACGCAGAGAGATTCTTCCCTTTAGATTCATC 

TTCAAACGAGAAAGGTTTGCTTTTAAACTTCGAAGATCTCACTGGCAAATCTTGGAGGTT 

CCGTTACTCTTACTGGAACAGTAGTCAAAGCTATGTCATGACTAAAGGTTGGAGCAGATT 

CGTTAAAGACAAAAAGCTTGACGCCGGAGATATTGTCTCTTTCCAAAGATGTGTCGGAGA 

TTCAGGAAGAGATAGCCGTTTGTTTATTGATTGGAGGAGAAGACCTAAAGTCCCTGACCA 

TCCTCATTTCGCCGCCGGAGCTATGTTCCCTAGGTTTTACAGCTTTCCTTCGACCAATTA 

CAGTCTTTATAATCATCAGCAGCAACGTCATCATCAGAGTGGTGGTGGTTATAATTATCA 

TCAAATTCCGAGAGAATTTGGTTATGGTTACTTCGTTAGGTCAGTGGATCAGAGGAACAA 

TCCTGCGGCTGCGGTGGCTGATCCGTTGGTGATTGAATCTGTGCCGGTGATGATGCACGG 

GAGAGCTAATCAGGAACTTGTTGGAACGGCCGGGAAGAGACTGAGGCTTTTTGGAGTTGA 

TATGGAATGCGGCGAGAGCGGAATGACCAACAGTACGGAGGAGGAATCATCATCTTCCGG

TGGAAGTTTGCCACGTGGAGGCGGTGGTGGTGCTTCATCTTCCTCTTTCTTTCAGCTGAG 

ACTTGGAAGCAGCAGTGAAGATGATCACTTCACTAAGAAAGGAAAGTCTTCATTGTCTTT 

TGATTTGGATCAATAATAATGATGATGATGAAATTAGTTGGTATTTTAAGAAAAAAAACA 

TACATATATAATTCTATATATATGACAACATAATGCATTGATTTCCTT 

>G1010 Amino Acid Sequence (domain in AA coordinates: 33-122) 

MMTDLSLTRDEDEEEAKPLAEEEGAREVADREHMFDKVVTPSDVGKLNRLVIPKQHAERF

FPLDSSSNEKGLLLNFEDLTGKSWRFRYSYWNSSQSYVMTKGWSRFVKDKKLDAGDIVSF
QRCVGDSGRDSRLFIDWRRRPKVPDHPHFAAGAMFPRFYSFPSTNYSLYNHQQQRHHHSG
GGYNYHQIPREFGYGYFVRSVDQRNNPAAAVADPLVIESVPVMMHGRANQELVGTAGKRL

RLFGVDMECGESGMTNSTEEESSSSGGSLPRGGGGGASSSSFFQLRLGSSSEDDHFTKKG 

KSSLSFDLDQ* 
>G1014 (174..1112)

CACAAACCACAGTCTCTCTTTCTCTCTCTATCTATCTTCTCTTTCTCTCTCTATCTCTAT 
CACTGAAACCCAAAGAGATCCACCATTTGTTCTTTTTTC 
CTTCCACACTTCCTTTTTACTAGGCAGTGTTAACCAATTGAGAGAGAAAAATGATGGTTG

ATGAAAATGTGGAAACCAAGGCCTCTACTTTAGTGGCAAGTGTTGATCATGGGTTTGGAT 

CCGGGTCGGGTCATGATCATCATGGGTTATCGGCGTCTGTGCCTCTTCTTGGTGTTAACT 

GGAAGAAGAGAAGGATGCCTAGACAGAGACGATCTTCTTCTTCCTTTAACCTTCTCTCTT 

TCCCTCCTCCTATGCCTCCTATTTCCGACGTGCCAACTCCTCTCCCCGCACGTAAAATTG 

ACCCAAGAAAGCTAAGATTCCTCTTCCAAAAGGAACTCAAGAACAGTGACGTCAGCTCTC 

TCCGACGTATGATACTCCCGAAGAAAGCCGCGGAGGCTCACTTGCCGGCACTTGAATGCA 

AGGAAGGGATTCCTATAAGAATGGAAGATTTGGACGGTTTTCACGTTTGGACCTTCAAGT 

ATAGGTACTGGCCAAACAACAATAGCAGAATGTACGTGCTAGAAAACACAGGCGATTTTG 

TGAATGCTCATGGTCTGCAGCTAGGTGACTTCATCATGGTTTACCAAGATCTCTACTCAA 

ACAATTACGTTATACAAGCAAGAAAAGCATCGGAAGAAGAAGAAGTAGACGTAATCAATC 

TTGAAGAAGACGACGTTTACACAAACTTAACAAGGATCGAAAACACTGTGGTTAACGATC 

TTCTCCTCCAAGATTTTAATCATCACAACAACAACAACAACAACAACAGCAACAGCAACA 

GCAACAAATGTTCTTACTATTATCCAGTCATAGATGATGTCACCACAAACACAGAGTCTT 

TTGTCTACGACACGACGGCTCTTACCTCCAACGATACTCCTCTCGATTTTTTGGGTGGAC 

ATACGACGACTACTAATAATTATTACTCCAAGTTCGGAACATTCGATGGTTTGGGCTCCG 

TTGAGAATATCTCTCTCGATGACTTCTACTAGATAATCAATCGATGGGCTCATGGTATTC 

TTGATGGTGATCAGCTATTTAATATCCTTATAATATATATAAGAATTAAATGCAATTTGC 

ATATATATTATCAAGTGTTGTAATATAACATTACAGTTTAAAAAAAAAAAAAAAAAA 

>G1014 Amino Acid Sequence (domain in AA coordinates: 90-172) 

MVDENVETKASTLVASVDHGFGSGSGHDHHGLSASVPLLGVNWKKRRMPRQRRSSSSFNL

LSFPPPMPPISHVPTPLPARKIDPRKLRFLFQKELKNSDVSSLRRMILPKKAAEAHLPAL 

ECKEGIPIRMEDLDGFHVWTFKYRYWPNNNSRMYVLENTGDFVNAHGLQLGDFIMVYQDL

YSNNYVIQARKASEEEEVDVINLEEDDVYTNLTRIENTVVNDLLLQDFNHHNNNNNNNSN

SNSNKCSYYYPVIDDVTTNTESFVYDTTALTSNDTPLDFLGGHTTTTNNYYSKFGTFDGL 

GSVENISLDDFY*
>G1035 (103..624)

CCATAATAATATATTAAAACTATATACTATAATCTTTTTACATAATAAACTTTGGGTCCT 

GCGTCTTAATCATAGTACTTAATTTTCTCTGTGTGTTTTAATATGAATAATAAAACTGAA 

ATGGGATCTTCCACAAGTGGAAATTGCTCGTCGGTTTCAACCACTGGTTTAGCTAACTCC 

GGTTCAGAATCTGATCTCCGGCAACGTGATCTAATCGACGAGCGGAAGAGAAAGAGGAAA 

CAGTCGAACAGAGAATCTGCGAGGAGGTCGAGGATGAGGAAGCAGAAGCATTTGGATGAT 

CTCACTGCTCAGGTGACTCATCTACGTAAAGAAAACGCTCAGATCGTCGCCGGAATCGCC 

GTCACGACGCAGCACTACGTCACTATCGAGGCGGAGAACGACATTCTCAGAGCTCAGGTT 

CTTGAACTTAACCACCGTCTCCAATCTCTTAACGAGATCGTTGATTTCGTCGAATCTTCT 

TCTTCAGGATTCGGTATGGAGACCGGTCAGGGATTATTCGACGGTGGATTATTCGACGGC 

GTGATGAATCCTATGAATCTAGGGTTTTATAATCAACCAATCATGGCTTCTGCTTCTACT 

GCTGGTGATGTTTTCAACTGTTAGAAAACTTCACATCATTATCATCGTGAGTGAGACTAA 

TCATCGCAGCAGGGGTAAAACTGTAATTTTTCTTATAAATTATGTGATGATGCTTTGTTT 

CTTTATTTTATAAGATGGTTAATTAGTGTTTAAAACTGATTGTAATGATAGACAGTGTAA 

GAAATGTGTGATATCLATGGAGATGGTGATGTGAGTTTGGTACAAATATTTTAAGATCTTT 

TCTTTCTATATATTAAAAGTGAAGAAATAATATT^ 

AAA 

>G1035 Amino Acid Sequence (domain in AA coordinates: 39-91) 
MNNKTEMGSSTSGNCSSVSTTGLANSGSESDLRQRDLIDERKRKRKQSNRESARRSRMRK 
QKHLDDLTAQVTHLRKENAQIVAGIAVTTQHYVTIEAENDILRAQVLELNHRLQSLNEIV

DFVESSSSGFGMETGQGLFDGGLFDGVMNPMNLGFYNQPIMASASTAGDVFNC* 
>G1046 (1..567)

ATGATTAGACATCTAAAACCCTACATGGAGTCGTCTAGTGTCCATCGCTCTCATTGTTTC

GATATTCTTGATGGAGTCCCACTACACGACGATCATTTCAACTCGGCATTCCTACCAAAC 

ACTGACTTTAATGTTCATTTGCAGTCAAACGTATCGACCCGCATCAACAATCAGTCTCAC

TTAGACCCAAATGCAGAAAACATTTTCCATAACGAAGGTCTTGCTCCAGAAGAAAGAAGA 

GCAAGAAGAATGGTCTCTAACCGGGAATCTGCAAGGAGGTCACGTATGCGCAAAAAGAAG 

CAGATCGAAGAGCTGCAACAACAAGTTGAACZAACTCATGATGTTGAATCATC 

GAGAAAGTCATCAACTTGTTGGAAAGCAACCATCAGATCCTACAAGAGAACTCACAGCTG 

AAAGAGAAAGTCTCTTCCTTTGACTTGCTCATGGCAGATGTGCTATTACCCATGAGAAAT 

GCAGAGAGCAACATCAATGACCGCAATGTGAATTATCTAAGAGGAGAACCATCAAACCGT
CCCACCAACAGTCCCTTTGGTAAGTAA

>G1046 Amino Acid Sequence (conserved domain in AA coordinates: 79-138)
MIRHLKPYMESSSVHRSHCFDILDGVPLHDDHFNSAFLPNTDFNVHLQSNVSTRINNQSH 
LDPNAENIFHNEGLAPEERRARRMVSNRESARRSRMRKKKQIEELQQQVEQLMMLNHHLS 
EKVINLLESNHQILQENSQLKEKVSSFHLLMADVLLPMRWAESNINDRNVNYLRGEPSNR 

PTNSPFGK* 
>G1049 (29..550)

CTAACTTTCTTCCCAAGTAAACTTCAAAATGCAGCCGCAAACAGACGTTTTCAGCCTCCA 
TAACTACCTAAACTCATCGATACTGCAGTCTCCGTATCCTTCTAATTTCCCGATATCTAC 
GCCATTTCCAACCAACGGTCAAAACCCGTACCTCCTCTACGGATTCCAAAGCCCTACAAA 
CAATCCACAATCCATGAGCCTAAGCAGCAACAACTCAACATCAGATGAAGCAGAAGAGCA 
GCAGACGAACAACAATATAATCAACGAGCGGAAGCAGAGAAGGATGATTTCAAACCGAGA 
ATCCGCAAGGAGATCGCGTATGAGGAAGCAAAGACACCTTGACGAGCTTTGGTCACAAGT 
GATGTGGTTAAGGATCGAGAATCATCAGTTGCTTGATAAGCTTAACAATCTCTCTGAGTC 
TCACGACAAGGTTCTTCAAGAGAATGCTCAGCTTAAAGAAGAAACATTTGAGCTTAAGCA 
AGTGATCAGCGATATGCAAATTCAAAGCCCTTTCTCTTGCTTTAGAGACGATATAATCCC 
CATTGAATAAAGCATTTTTCCCCGATTCATATTTATGAAAATTTTCTTCAAGAGTATGTT 
TCTTTGTATGTATATGTGGAGATGTATTTCAGGGTTTTGATAATATGACCCTTTACGACG 
ACGTTTTTAGATTGTAGTAAATTTATAAACTAAAGAAGATTAGTGTTAATGAAGAACAAA 

TATAA 

>G1049 Amino Acid Sequence (domain in AA coordinates: 77-132)
MQPQTDVFSLHNYLNSSILQSPYPSNFPISTPFPTNGQNPYLLYGFQSPTNNPQSMSLSS
NNSTSDEAEEQQTNNNIINERKQRRMISNRESARRSRMRKQRHLDELWSQVMWLRIENHQ
LLDKLNNLSESHDKVLQENAQLKEETFELKQVISDMQIQSPFSCFRDDIIPIE* 
>G1069 (89..934)

TTGGAACCCTAGAGGCCTTTCAAGCAAATCATCAGGGTAACAATTTCTTGATCTTTCTTT 

TTAGCGAATTTCCAGTTTTTGGTCAATCATGGCAAACCCTTGGTGGACGAACCAGAGTGG 

TTTAGCGGGCATGGTGGACCATTCGGTCTCCTCAGGCCATCACCAAAACCATCACCACCA 

AAGTCTTCTTACCAAAGGAGATCTTGGAATAGCCATGAATCAGAGCCAAGACAACGACCA 

AGACGAAGAAGATGATCCTAGAGAAGGAGCCGTTGAGGTGGTCAACCGTAGACCAAGAGG 

TAGACCACCAGGATCCAAAAACAAACCCAAAGCTCCAATCTTTGTGACAAGAGACAGCCC 

CAACGCACTCCGTAGCCATGTCTTGGAGATCTCCGACGGCAGTGACGTCGCCGACACAAT 

CGCTCACTTCTCAAGACGCAGGCAACGCGGCGTTTGCGTTCTCAGCGGGACAGGCTCAGT 

CGCTAACGTCACCCTCCGCCAAGCCGCCGCACCAGGAGGTGTGGTCTCTCTCCAAGGCAG 

GTTTGAAATCTTATCTTTAACCGGTGCTTTCCTCCCTGGACCTTCCCCACCCGGGTCAAC 

CGGTTTAACGGTTTACTTAGCCGGGGTCCAGGGTCAGGTCGTTGGAGGTAGCGTTGTAGG 

CCCACTCTTAGCCATAGGGTCGGTCATGGTGATTGCTGCTACTTTCTCTAACGCTACTTA 

TGAGAGATTGCCCATGGAAGAAGAGGAAGACGGTGGCGGCTCAAGACAGATTCACGGAGG 

CGGTGACTCACCGCCCAGAATCGGTAGTAACCTGCCTGATCTATCAGGGATGGCCGGGCC 

AGGCTACAATATGCCGCCGCATCTGATTCCAAATGGGGCTGGTCAGCTAGGGCACGAACC 

ATATACATGGGTCCACGCAAGACCACCTTACTGACTGAGTGAGCGATTTCTATATATAAT 

GGTCTATATAAATAAATATATAGATGAATATAAGCAAGCAATTTGAGGTAGTCTATTACA 

AAGCTTTTGCTCTGGTTGGAAAAATAAATAAGTATCAAAGCTTTGTTTGTTCOT 

AATATAGAGCTTGGGAAGGTAGAAAGAGACGACATT 

>G1069 Amino Acid Sequence (domain in AA coordinates: 67-74) 
MANPWWTNQSGLAGMVDHSVSSGHHQNHHHQSLLTKGDLGIAMNQSQDNDQDEEDDPREG
AVEVVNRRPRGRPPGSKNKPKAPIFVTRDSPNALRSHVLEISDGSDVADTIAHFSRRRQR

GVCVLSGTGSVANVTLRQAAAPGGVVSLQGRFEILSLTGAFLPGPSPPGSTGLTVYLAGV
QGQVVGGSVVGPLLAIGSVMVIAATFSNATYERLPMEEEEDGGGSRQIHGGGDSPPRIGS
NLPDLSGMAGPGYNMPPHLIPNGAGQLGHEPYTWVHARPPY*
>G1070 (170..1144)

TCGACCAGCTTGGATTTCGTTGTTCATCATTACTACTCTCTTTCTTCTTCTAGCTAGCTA 
GTTTTGACAGCAAAATAAGAAGCAAAAAAAAGGTCAACTAAAAAAGATCTGTTCTTAGAT 
CACTCTCTTCTTCTTTTTTTGATCCAATTCCACCATTGAATCATAGATCATGGATCCAGT 
ACAATCTCATGGATCACAAAGCTCTCTACCTCCTCCTTTCCACGCAAGAGACTTTCAATT 
ACATCTTCAACAACAGCAACAAGAGTTCTTCCTCCACCATCACCAGCAACAAAGAAACCA 
AACCGATGGTGACCAACAAGGAGGATCAGGAGGAAACCGACAAATCAAGATGGATCGTGA 
AGAGACAAGCGACAACATAGACAACATAGCTAACAACAGCGGTAGTGAAGGTAAAGACAT
AGATATACACGGTGGTTCAGGAGAAGGAGGTGGTGGCTCCGGAGGAGATCATCAGATGAC 
AAGAAGACCAAGAGGAAGACCAGCGGGATCCAAGAACAAACCAAAACCACCGATTATCAT 
CACACGGGACAGCGCAAACGCGCTTAGAACCCACGTGATGGAGATCGGAGATGGCTGCGA 
CTTAGTCGAAAGCGTTGCCACTTTTGCACGAAGACGCCAACGCGGCGTTTGCGTTATGAG 
CGGTACTGGAAATGTTACTAACGTCACTATACGTCAGCCTGGATCTCATCCTTCTCCTGG 
CTCGGTAGTTAGTCTTCACGGAAGGTTCGAGATTCTATCTCTCTCAGGATCTTTTCTCCC 
TCCTCCGGCTCCTCCTACAGCCACCGGATTGAGTGTTTACCTCGCTGGAGGACAAGGACA 
GGTGGTTGGAGGAAGCGTAGTTGGTCCGTTGTTATGTGCTGGTCCTGTCGTTGTCATGGC 
TGCGTCTTTTAGCAATGCGGCGTACGAAAGGTTGCCTTTAGAGGAAGATGAGATGCAGAC 
GCCGGTTCATGGCGGAGGAGGAGGAGGATCATTGGAGTCGCCGCCAATGATGGGACAACA 
ACTGCAACATCAGCAACAAGCTATGTCAGGTCATCAAGGGTTACCACCTAATCTTCTTGG 
TTCGGTTCAGTTGCAGCAGCAACATGATCAGTCTTATTGGTCAACGGGACGACCACCGTA 
TTGATCAAATATACACACACACTCATAATCGTTGCTAGCTAGCTAACGATGAATCATGAG 
TTTAGTGGATATATATATGATTAAAAGAGGTTAGCTTATGAACATTAATAAGAGTTTGGA 
TTCTATCGAGCTTCATTATGTTTGGGTCATCGTTC 

>G1070 Amino Acid Sequence (domain in AA coordinates: 98-120) 

MDPVQSHGSQSSLPPPFHARDFQLHLQQQQQEFFLHHHQQQRNQTDGDQQGGSGGNRQIK 

MDREETSDNIDNIANNSGSEGKDIDIHGGSGEGGGGSGGDHQMTRRPRGRPAGSKNKPKP 

PIIITRDSANALRTHVMEIGDGCDLVESVATFARRRQRGVCVMSGTGNVTNVTIRQPGSH

PSPGSVVSLHGRFEILSLSGSFLPPPAPPTATGLSVYLAGGQGQVVGGSVVGPLLCAGPV

VVMAASFSNAAYERLPLEEDEMQTPVHGGGGGGSLESPPMMGQQLQHQQQAMSGHQGLPP

NLLGSVQLQQQHDQSYWSTGRPPY* 

>G1076 (198..1076)

ATTTTAGTCTTCCTATAACTTCTTCTCAATCCTCTCTCATATCTTTTTTCTTAGTTTAAA 

TTTCAATAAAATAGAAAAAAAC^TATACAAATCTACAGAGAAGAGAAGCTTTATTTTAA 

CTTGTGTGTGTGTGTGTGTTTTATATAATTTTTATTTTTTTTCAAATTAAAATCTCTTCT 

TTGCTTTTGATGTGGGCATGGCTGGTCTTGATCTAGGCACAGCTTTTCGTTACGTTAATC 

ACCAGCTCCATCGTCCCGATCTCCACCTTCACCACAATTCCTCCTCCGATGACGTCACTC 

CCGGAGCCGGGATGGGTCATTTCACCGTCGACGACGAAGACAAC^CAACAACCATC^ 

GTCTTGACTTAGCCTCTGGTGGAGGATCAGGAAGCTCTGGAGGAGGAGGAGGTCACGGCG 

GGGGAGGAGACGTCGTTGGTCGTCGTCCACGTGGCAGACCACCGGGATCCAAGAACAAAC 

CGAAACCTCCGGTAATTATCACGCGCGAGAGCGCAAACACTCTAAGAGCTCACATTCTTG 

AAGTAACAAACGGCTGCGATGTTTTCGACTGCGTTGCGACTTATGCTCGTCGGAGACAGC 

GAGGGATCTGCGTTCTGAGCGGTAGCGGAACGGTCACGAACGTCAGCATACGTCAGCCAT 

CTGCGGCTGGAGCGGTTGTGACGCTACAAGGAACGTTCGAGATTCTTTCTCTCTCCGGAT 

CGTTTCTTCCTCCTCCGGCACCTCCCGGAGCAACGAGTTTGAGAATTTTCTTAGCCGGAG 

GACAAGGTCAGGTGGTTGGAGGAAGCGTTGTGGGTGAGCTTACGGCGGCTGGACCGGTGA 

TTGTGATTGCAGCTTCGTTTACTAATGTTGCTTATGAGAGACTTCCTTTAGAAGAAGATG 

AGCAGCAGCAACAGCTTGGAGGAGGATCTAACGGCGGAGGTAATTTGTTTCCGGAGGTGG 

CAGCTGGAGGAGGAGGAGGACTTCCGTTCTTTAATTTACCGATGAATATGCAACCAAATG 

TGCAACTTCCGGTGGAAGGTTGGCCGGGGAATTCCGGTGGAAGAGGTCCTTTCTGATGTG 

TATATATTGATAATCATTATATATATACCGGCGGAGAAGCnTTTCCGGCGAAGAATTTGC 

GAGAGTGAAGAAAGGTTAGAAAAGCTTTTAATGGACTAATGAATTTCAAATTATCATCGT 

GATTTCGGACATTGTCTTGTTCATCATGTTAAGCTTAGGTTTATT^ 

AATTTTATGTTTGAATCCTTTTTTTTTTCTGT^ 

AAAAAAAAATTCTCAAAAAAAA 

>G1076 Amino Acid Sequence (domain in AA coordinates: 82-89)
MAGLDLGTAFRYVNHQLHRPDLHLHHNSSSDDVTPGAGMGHFTVDDEDNWNNHQGLDLAS
GGGSGSSGGGGGHGGGGDVVGRRPRGRPPGSKNKPKPPVIITRESANTLRAHILEVTNGC
DVFDCVATYARRRQRGICVLSGSGTVTNVSIRQPSAAGAVVTLQGTFEILSLSGSFLPPP
APPGATSLRIFLAGGQGQVVGGSVVGELTAAGPVIVIAASFTNVAYERLPLEEDEQQQQL

GGGSNGGGNLFPEVAAGGGGGLPFFNLPMNMQPNVQLPVEGWPGNSGGRGPF* 
>G1089 (31..2427)

AAGTAAGAGAGCTTCTTAAGGAAGAAGAAGATGGGTTGTGCTCAATCAAAGATCGAGAAC 
GAAGAAGCAGTTACTCGTTGCAAAGAACGAAAACAATTGATGAAAGACGCCGTCACTGCT 
CGTAACGCTTTCGCCGCCGCTCACTCAGCTTACGCTATGGCTCTTAAAAACACCGGAGCT 




GCTCTTTCCGATTACTCTCACGGCGAGTTTTTAGTCTCTAATCACTCGTCTTCCTCCGCA 

GCTGCAGCAATCGCTTCTACTTCTTCTCTTCCCACTGCTATATCTCCTCCTCTTCCTTCT

TCCACCGCTCCGGTTTCTAATTCAACCGCTTCTTCTTCCTCCGCTGCGGTTCCTCAGCCG 

ATTCCTGATACTCTTCCTCCTCCTCCTCCTCCACCACCGCTTCCTCTTCAACGTGCTGCT 

ACTATGCCGGAGATGAACGGTAGATCCGGTGGTGGTCATGCTGGTAGTGGACTCAACGGA 

ATTGAAGAAGATGGAGCCCTAGATAACGATGATGATGACGATGATGATGATGATGACTCT 

GAAATGGAGAATCGTGATCGTTTGATTAGGAAATCGAGAAGCCGTGGAGGTAGTACTAGA 

GGAAATAGGACGACGATTGAAGATCATCATCTTCAGGAGGAGAAAGCTCCGCCACCTCCC 

CCTTTGGCGAATTCGCGGCCAATTCCGCCGCCACGTCAGCATCAGCATCAACATCAGCAA 

CAGCAACAACAACCTTTCTACGATTACTTCTTCCCTAATGTTGAGAATATGCCTGGAACT 

ACTTTAGAAGATACTCCTCCACAACCACAACCACAACCAACAAGGCCTGTGCCTCCTCAA 

CCACATTCACCAGTCGTTACTGAGGATGACGAAGATGAGGAGGAGGAAGAGGAGGAAGAG 

GAGGAGGAAGAGGAGACGGTGATTGAACGGAAACCACTGGTGGAGGAAAGACCGAAGAGA 

GTAGAGGAAGTGACGATTGAATTGGAAAAAGTTACTAATTTGAGAGGGATGAAGAAGAGT 

AAAGGGATAGGGATTCCCGGAGAGAGGAGAGGAATGCGAATGCCGGTGACTGCGACGCAT 

TTGGCGAATGTATTCATTGAGCTTGATGATAATTTCTTGAAAGCTTCTGAAAGTGCTCAT 

GATGTTTCTAAGATGCTTGAAGCTACTAGGCTCCATTACCATTCTAATTTTGCAGATAAC 

CGAGGACATATTGATCACTCTGCTAGAGTGATGCGTGTAATTACATGGAATAGATCATTT 

AGAGGAATACCAAATGCTGATGATGGGAAAGATGATGTTGATTTGGAAGAGAATGAAACT 

CATGCTACTGTTCTTGACAAATTGCTAGCATGGGAAAAGAAGCTCTATGACGAAGTCAAG 

GCTGGCGAACTCATGAAAATCGAGTACCAGAAAAAGGTTGCTCATTTAAATCGGGTGAAG

AAACGAGGTGGCCACTCGGATTCATTAGAGAGAGCTAAAGCAGCAGTAAGTCATTTGCAT 

ACAAGATATATAGTTGATATGCAATCCATGGACTCCACAGTTTCAGAAATCAATCGTCTT 

AGGGATGAACAACTATACCTAAAGCTCGTTCACCTTGTTGAGGCGATGGGGAAGATGTGG 

GAAATGATGCAAATACATCATCAAAGACAAGCTGAGATCTCAAAGGTGTTGAGATCTCTA 

GATGTTTCACAAGCGGTGAAAGAAACAAATGATCATCATCACGAACGCACCATCCAGCTC 

TTGGCAGTGGTTCAAGAATGGCACAGGCAGTTTTGCAGGATGATAGATCATCAGAAAGAA 

TACATAAAAGCACTTGGCGGATGGCTAAAGCTAAATCTCATCCCTATCGAAAGCACACTC 

AAGGAGAAAGTATCTTCGCCTCCTCGAGTTCCCAATCCCGCAATCCAAAAACTCCTCCAC 

GCTTGGTATGACCGTTTAGACAAAATCCCCGACGAAATGGCTAAAAGTGCCATAATCAAT 

TTCGCAGCGGTTGTAAGCACGATAATGCAGCAGCAAGAAGACGAGATAAGTCTCAGAAAC 

AAATGCGAAGAGACAAGAAAAGAATTGGGAAGAAAAATTAGACAGTTTGAGGATTGGTAC 

CACAAATACATCCAGAAGAGAGGACCGGAGGGGATGAATCCGGATGAAGCGGATAACGAT 

CATAATGATGAGGTCGCTGTGAGGCAATTCAATGTAGAACAAATTAAGAAGAGGTTGGAA 

GAAGAAGAAGAAGCTTACCATAGACAAAGCCATCAAGTTAGAGAGAAGTCACTGGCTAGT 

CTTCGAACTCGCCTCCCCGAGCTTTTTCAGGCAATGTCCGAGGTTGCGTATTCATGTTCG 

GATATGTATAGAGCTATAACGTATGCGAGTAAGCGGCAAAGCCAAAGCGAACGGCATCAG 

AAACCTAGCCAGGGAC^GAGTTCGTAAGAACTAATGTAAGATCAGAGTAATGTCTTCTTC 

TTCTTTGATCTTGAATATTTAAGCACACACATACATACAACGTATAGCTAAATCTTTATC 

ATtGCTTTCTTATATTAAGGTTTTGGOTm 

TATAGTGTTTGATTCTTAAGGAACTGTTCTGTTGAGTAATAAGAAAGTTGTGTATTGAAA 
TAGAGTTGCATTTGTTAATTTTG 

>G1089 Amino Acid Sequence (domain in AA coordinates: 425-500)

MGCAQSKIENEEAVTRCKERKQLMKDAVTARNAFAAAHSAYAMALKNTGAALSDYSHGEF

LVSNHSSSSAAAAIASTSSLPTAISPPLPSSTAPVSNSTASSSSAAVPQPIPDTLPPPPP 

PPPLPLQRAATMPEmGRSGGGHAGSGLNGIEEDGALDNDDDDDDDDDDSEMENRBRLIR 

KSRSRGGSTRGmTTIEDHHLQEEKAPPPPPLANSRPlPPPRQHQHQHQQQQQQPFYDYF 

FPNVENMPGTTLEOTPPQPQPQPTRPVPPQPHSPWTEDDEDEEEEEEEEEEEEETVIER 

KPLVEERPKRVEEVTIELEKVTNLRGMK^ 

NFLKASESAHDVSKMLEATRLHYHSNFADNRGHIDHSARVMRVITWNRSFRGIPNADDGK 
DDVDLEENETHATVLDKLIA^ 

RAKAAVSHLHTRYIVDMQSMDSTVSEINRIiRDEQLYLKLVHLVEAM 

AEISKVLRSLDVSQAVKETNDHHHERTIQLIiAWQEWHTQFCRMIDHQKEYIKAL 

LNLIPIESTLKEKVSSPPRVPNPAIQKLLHAWYDRLDKIPDEMAKSAIINFAAWSTIMQ 

QQEDEISLRNKCEETRKELGRKIRQFEDWYHKYIQKRGPEGMNPDEADNDHNDEVAVRQF 

NVEQIKKRLEEEEEAYHRQSHQVREKSLASLRTRLPELFQAMSEVAYSCSDMYRAITYAS 

KRQSQSERHQKPSQGQSS*
>G1093 (1..531)

ATGGGTTATCCGGTGGGGTACACTGAGCTCCTCCTCCCAAGAATCTTCCTTCACTTACTC 

TCTCTCTTAGGCTTAATACGAACACTCATAGACACGGGTTTTCGGATATTGGGTCTACCC 

GACTTTCTCGAATCCGACCCGGTTTCATCGTCATCGTCATGGCTGGAACCACCGTATATG 

TCCACGGCGGCGCATCATCACCAAGAAAGCTCATTTTTCTTCCCAGTGGCGGCGAGGCTA 

GCTGGAGAAATCTTGCCCGTCATCAGATTCTCGGAGCTAACTCGACCCGGATTCGGATCC

GGATCCGATTGCTGCGCGGTGTGCCTCCACGAGTTCGAGAACGATGACGAGATCCGACGG 

CTGACGAATTGTCAACACATATTTCACCGGAGCTGTTTAGACCGTTGGATGATGGGTTAT 

AATCAGATGACGTGTCCACTTTGTAGAACGCCGTTTATTTCTGATGAGTTACAAGTTGCT 

TTTAACCAACGAGTTTGGTCTGAATCTGAACTTCTCGCAGAATCAAATTAG 

>G1093 Amino Acid Sequence (domain in AA coordinates: 105-148) 

MGYPVGYTELLLPRIFLHLLSLLGLIRTLIDTGFRILGLPDFLESDPVSSSSSWLEPPYM

STAAHHHQESSFFFPVAARLAGEILPVIRFSELTRPGFGSGSDCCAVCLHEFENDDEIRR

LTNCQHIFHRSCLDRWMMGYNQMTCPLCRTPFISDELQVAFNQRVWSESELLAESN*

>G1127 (191..1351)

GACAGACTCTCTCTGTATGTGTGCGAGAAGCGAGAAGCGAGAGAGAGAGAGAGAGAGTTG 
TTAGCTCACACGCTTTCTCTATTTTCTCGGAATTCACAAAACAGAAAGTTTCATCCTTTA 
CGAGAATTAAGCCGAAAGAAACAATCTTTGAGTTTGATTTCTTCTTCCTTCCTTCTCTCT 
CTCTGCTCTAATGGATTCCAGAGACATCCCACCGTCACATAACCAGCTTCAACCACCACC 
GGGAATGTTAATGTCTCATTACCGTAACCCTAACGCCGCCGCTTCACCATTAATGGTTCC 
CACTTCCACATCTCAACCGATTCAACACCCTCGTCTTCCTTTTGGCAATCAACAACAATC 
TCAAACGTTTCATCAGCAGCAACAACAACAAATGGATCAGAAGACTCTTGAATCTCTTGG 
ATTTGGTGATGGATCACCTTCTTCTCAACCGATGCGATTCGGGATCGATGATCAGAATCA 
GCAACTGCAAGTGAAGAAGAAGCGAGGAAGGCCGAGAAAGTATACTCCTGATGGTAGCAT 
TGCTTTAGGTTTAGCTCCTACGTCTCCTCTTCTCTCTGCAGCTTCTAATTCTTACGGTGA 
GGGTGGTGTTGGAGATAGTGGTGGAAATGGAAACTCTGTTGATCCACCTGTTAAACGTAA 
C^GAGGAAGGCCTCCTGGTTCTAGTAAGAAACAGCTTGATGCTTTAGGAGGAACTTCAGG 
AGTTGGGTTTACACCTCATGTCATTGAAGTGAACACAGGAGAGGACATAGCGTCAAAGGT 
GATGGCTTTTTCGGATCAAGGGTCAAGAACAATTTGTATTCTCTCTGCAAGTGGTGCAGT 
TTCTAGAGTGATGCTTCGTCAAGCTTCTCATTCTAGTGGAATCGTTACTTATGAGGGACG 
ATTTGAGATCATTACTCTCTCAGGCTCAGTCTTGAATTATGAGGTAAATGGTTCCACCAA 
CAGAAGTGGTAACTTGAGTGTGGCTTTGGCTGGACCTGATGGCGGCATCGTAGGTGGCAG 
TGTAGTTGGTAATCTAGTAGCTGCAACACAAGTCCAGGTGATAGTGGGAAGCTTTGTTGC 
AGAAGCAAAGAAACCGAAACAAAGTAGTGTTAACATTGCTCGGGGGCAGAATCCTGAACC 
GGCTTCAGCGCCGGCTAACATGTTGAACTTTGGATCAGTCTCTCAAGGACCATCGAGCGA 
GTCATCAGAAGAGAATGAGAGCGGTTCTCCTGCAATGCACCGTGACAATAATAATGGGAT 
ATATGGAGCTCAACAACTUVCAACAACAACAACCTCTTCATCCTCATCAGATGCAAATGTA 
CCAACATCTTTGGTCTAATCATGGTCAATAAAATGAAGCGGAAATTAATTTGTTTCCGTT 
TTGGTTACGGTTATGGTTTGATTTCTT 

>G1127 Amino Acid Sequence (domain in AA coordinates: 103-110, 155-162)

MDSRDIPPSHNQLQPPPGMLMSHYRNPNAAASPLMVPTSTSQPIQHPRLPFGNQQQSQTF

HQQQQQQMDQKTLESLGFGDGSPSSQPMRFGIDDQNQQLQVKKKRGRPRKYTPDGSIALG

LAPTSPLLSAASNSYGEGGVGDSGGNGNSVDPPVKRNRGRPPGSSKKQLDALGGTSGVGF

TPHVIEVNTGEDIASKVMAFSDQGSRTICILSASGAVSRVMLRQASHSSGIVTYEGRFEI

ITLSGSVLNYEVNGSTNRSGNLSVALAGPDGGIVGGSVVGNLVAATQVQVIVGSFVAEAK

KPKQSSVNIARGQNPEPASAPANMLNFGSVSQGPSSESSEENESGSPAMHRDNNNGIYGA

QQQQQQQPLHPHQMQMYQHLWSNHGQ* 

>G1131 (57..756)

TCGACTCCTCTCCTGATTGCTTCACCTTCTTOTTTACTACAGGTTTCAGCTCCTCAATGT 
CCATGGATTGCTTAAGCTACTTCTTTAACTACGATCCTCCTGTCCAGCTCCAGGATTGCT 
TTATTCCCGAGATGGATATGATTATCCCTGAAACCGATAGTTTCTTCTTCCAATCTCAAC 

cgcaactggagtttcatcagccattgtttcaagaagaagctccttcac^gacccactttg 
accctttctgcgaccagtttcitoctccgcaagaaatc^ 

AAATCTTCAACGAAACACACGACCTCGATTTCTTTCTCCCCACGCCAAAACGCCAGAGAC 
TTGTTAACTCCAGCTACAATTGTAAGACTCAAAACCATTTCCAGAGCCGTAACCCGAATT 
TCTTCGACCCTTTCGGCGACACTGATTTCGTTCCAGAATCTTGTACCTTCCAGGAGTTTC 
GAGTTCCGGATTTCTCTTTAGCTTTCAAGGTAGGCCGGGGAGAT 
AACCGACGCTTTCATCTCAGAGCATCGCGGCTAGAGGGAGGAGAAGAAGAATTGCAGAGA
AGACTCACGAGCTCGGAAAACTCATCCCCGGTGGCAATAAACTTAACACCGCCGAGATGT 
TCCAAGCCGCCGCTAAGTATGTCAAGTTTTTGCAGAGTCAAGTTGGGATTCTCCAACTGA 
TGCAGACCACAAAGAAGGTAATAACCAACCCCAAATAAGAACTTTATCATCCAATTGAAA 
CTCTAATCGTGTTTTCTCACAAGCTTCTTAATTTGTTTACGCAGGGTAGCTCTAATGTGC 
AAATGGAAACTCAGTATTTGCTTGAATCGCAAGCAATCCAGGAGAAGTTATCAACAGAGG 
AAGTGTGTTTGGTACCGTGTGAAATGGTTCAAGATCTAACAACTGAAGAAACCATTTGCA 
GAACCCCGAATATTTCTCGAGAAATCAACAAGTTACTGTCTAAACATCTGGCTAACTAGT 
TTTAGTTTCAAGCCTGAAGTTCTCTATGCCTAAATTTGTGTCTGTTATCGTTGTTTTGTC 
TTCTTAGTTAGTGTTTTGTCTTGTTGATTTAGGGGCTAATTATCCTGGTTAATCTCCTCT 

TAACTGGGAA 

>G1131 Amino Acid Sequence (domain in AA coordinates: 173-2-0) 

MSMDCLSYFFNYDPPVQLQDCFIPEMDMIIPETDSFFFQSQPQLEFHQPLFQEEAPSQTH 

FDPFCDQFLSPQEIFLPNPKNEIFNETHDLDFFLPTPKRQRLVNSSYNCNTQNHFQSRNP 

NFFDPFGDTDFVPESCTFQEFRVPDFSLAFKVGRGDQDDSKKPTLSSQSIAARGRRRRIA 

EKTHELGKLIPGGNKIjNTAEMFQAAAKYVKFLQSQVGILQLMQTTKKyiTO 

>G1145 (243..1142)

GTGATTTCTCTCTGCCATTTCCTTCGATTTGATTTCTGGGTTCTCTTCTTCTCGTCTCTC 

TTCTGCATGTTTCGCCACTCTACCTTAGAAAAAAGGTTACTTTCGCCTCCGATTTAGGCT 

CGATTTGATGAATTCGTCGTCGTGTGGCTATTTATCAAATTGAGCATTAGGGTTTCTGAT 

TTGTGGGTTCAGAATTGTTTTTATCTATCTGTCTTGTTGTTTTTTGTCCGCTACAAAAGC 

CTATGGATTCTCAGAGGGGTATTGTTGAACAAGCTAAATCTCAGTCCTTGAATAGGCAAA 

GCTCTCTTTACAGCTTAACACTTGATGAGGTTCAAAATCACTTGGGGAGTTCTGGTAAAG 

CTCTGGGAAGCATGAACCTTGATGAGCTTTTGAAGAGTGTCTGTTCTGTTGAAGCTAATC 

AGCCATCGTCTATGGCTGTCAATGGTGGAGCAGCTGCTCAGGAGGGTCTTTCTCGCCAGG 

GGAGTTTGACTTTGCCTCGGGATCTCAGCAAAAAGACTGTTGATGAGGTTTGGAAAGACA 

TTCAGCAGAATAAGAATGGAGGTAGTGCTCATGAGAGGAGGGATAAGCAGCCTACACTTG 

GGGAAATGACGCTTGAAGACCTGTTGTTGAAAGCAGGAGTGGTCACTGAGACTATCCCTG 

GTTCGAACCATGATGGTCCTGTTGGTGGTGGTAGTGCTGGTTCAGGTGCTGGTTTAGGGC 

AAAACATTACTCAAGTTGGCCCATGGATTCAATATCATCAGCTCCCATCAATGCCACAGC 

CTCAAGCATTTATGCCCTATCCGGTTTCAGATATGCAAGCAATGGTGTCTCAGTCTTCTT 

TGATGGGTGGTTTGTCAGATACACAAACTCCTGGAAGGAAGAGGGTAGCTTCAGGAGAAG 

TTGTAGAGAAGACTGTAGAGAGGAGGCAGAAGAGAATGATAAAGAACAGAGAGTCTGCTG 

CTCGTTCCCGAGCTAGGAAACAGGCTTACACTCATGAGCTAGAGATCAAAGTTTCACGGT 

TAGAAGAAGAAAACGAAAGACTCAGGAAGCAAAAGGAGGTGGAAAAATCCTCCCAAGTGT 

ACCACCGCCTGATCCCAAGCGGCAGCTCCGACGGACAAGCTCGGCTCCTTTCTGATCTCT 

AAACTCTTTTTGTCTTTTTCTTTTTTTCTCTTCTGTGTCGGTTCACTTATAAAAAAGAGA 

GGAAAACAGCTTTGTTTCTTTGTACATTCCGTAGACTTTCTTGACTTGGAGCAATTCTGT 

TAACTTTAAAATATTCTCGAGTTATTGTAGTAGCAGACTAGCAGCAGTAATGGTTTTCAT 

GAGTCCGATTGAAATTCAGAGATTGAACAGGAAAAAA 

>G1145 Amino Acid Sequence (conserved domain in AA coordinates: 227-270)
MDSQRGIVEQAKSQSLNRQSSLYSLTLDEVQNHLGSSGKALGSMNLDELLKSVCSVEANQ
PSSMAVNGGAAAQEGLSRQGSLTLPRDLSKKTVDEVWKDIQQNKNGGSAHERRDKQPTLG
EMTLEDLLLKAGVVTETIPGSNHDGPVGGGSAGSGAGLGQNITQVGPWIQYHQLPSMPQP
QAFMPYPVSDMQAMVSQSSLMGGLSDTQTPGRKRVASGEVVEKTVERRQKRMIKNRESAA
RSRARKQAYTHELEIKVSRLEEENERLRKQKEVEKSSQVYHRLIPSGSSDGQARLLSDL*

>G1229 (123..1217)

TTTGGGCGGGTCTTTCTTTCCCTAAATCTTTCTTTTATTTTGCTGTTTAAAAAAAAAATC 
CAACCATAAGACAAAACAACGAACGAGGAAGAGAGAGAGAGAAGGATATATCTCTAATCA 
CGATGCAGGAGATAATACCGGATTTTCTTGAAGAGTGTGAATTTGTCGACACTTCACTAG 
CCGGAGATGATCTATTTGCCATCTTAGAGAGTCTTGAAGGTGCCGGAGAGATATCTCCGA 
CAGCTGCATCTACACCTAAAGATGGAACCACAAGTTCCAAGGAGTTAGTTAAGGATCAAG 
ATTATGAAAACTCATCTCCTAAGAGGAAAAAGCAAAGACTAGAAACCAGGAAAGAAGAGG 
ACGAAGAAGAAGAAGACGGAGACGGAGAAGCAGAAGAAGATAATAAGCAAGATGGGCAAC 
AAAAGATGTCTCATGTAACCGTGGAACGTAACCGGAGAAAGCAAATGAACGAGCACTTAA 
CCGTTTTGCGTTCTCTTATGCCTTGTTTCTACGTCAAACGGGGGGACCAAGCATCGATCA 
TAGGAGGAGTTGTGGAGTACATAAGCGAGTTACAACAAGTTCTCCAATCTTTGGAAGCCA 
AGAAACAACGTAAAACCTACGCCGAAGTCCTAAGCCCGAGAGTTGTCCCGAGCCCTCGTC
CTTCACCGCCTGTTCTAAGCCCAAGAAAACCGCCTCTTAGCCCGCGCATCAACCACCACC 
AGATTCACCACCACCTACTTCTCCCTCCCATAAGTCCTCGAACACCTCAGCCAACAAGCC 
CATACCGGGCCATTCCACCGCAACTACCACTCATCCCACAGCCTCCGCTTCGCTCTTACA 
GCTCATTGGCCAGTTGCAGCAGCTTAGGAGATCCACCTCCATACTCTCCTGCTTCATCTT 
CTTCATCTCCTTCAGTTAGTAGTAACCATGAGAGTAGTGTGATCAATGAGCTTGTTGCTA 
ACTCAAAATCGGCTTTGGCTGATGTGGAAGTGAAGTTTTCAGGAGCTAACGTGCTGCTCA 
AAACGGTGTCGCATAAGATCCCGGGACAAGTTATGAAGATAATTGCTGCTCTTGAAGATT 
TGGCTCTTGAGATTCTTCAGGTTAATATTAACACCGTCGACGAAACCATGCTTAATTCTT 
TCACCATCAAGATTGGAATTGAGTGCCAACTAAGTGCAGAAGAACTGGCTCAACAAATTC 
AGCAAACATTCTGCTAGTAAAGAAGGATTTAATATAGCTTCGTATAAACCTTAACGAGAG 
AGCAGTACGTACTCACTTTCTCTCCTTAGTATCCCTTTAATTATCTTTTCAGTTTTCTGC 
AAAGATATGGAGTTTAAAAAAATAAAATTGTTATCTAAAGTTTTAATCAAATATTGATTA 
ATTATAACTAATATAGGTATAAGTGAGTTTTAAAGATTATCAGCTTCATAACAGCCATCG 
TCATGTTTACTTTCTTTTAAATTTTAGAATTTAGACGTACTCCTACCATGTAATTTTATT 
TCTGTCATTACATCAAGCATTGTAGCTGTAATTGCATATGAATGAACAATAGTGTATGAG 
TGATCTCATGAATAATATTCTTCTTGCAACACAAAAAAAAAAAA 

>G1229 Amino Acid Sequence (domain in AA coordinates: 102-160) 

MQEIIPDFLEECEFVDTSLAGDDLFAILESLEGAGEISPTAASTPKDGTTSSKELVKDQD

YENSSPKEKKQRLETRKEEDEEEEDGDGEAEEDNKQDGQQKMSHVTVERNRRKQMNEHLT

VLRSLMPCFYVKRGDQASIIGGVVEYISELQQVLQSLEAKKQRKTYAEVLSPRVVPSPRP

SPPVLSPRKPPLSPRINHHQIHHHLLLPPISPRTPQPTSPYRAIPPQLPLIPQPPLRSYS

SLASCSSLGDPPPYSPASSSSSPSVSSNHESSVINELVANSKSALADVEVKFSGANVLLK

TVSHKIPGQVMKIIAALEDLALEILQVNINTVDETMLNSFTIKIGIECQLSAEELAQQIQ

QTFC* 

>G1246 (1..1746) 

ATGATCATGTACGGAGGAGGAGGAGCAGGGAAGGACGGTGGATCCACCAATCACTTATCA 
GACGGAGGAGTGATATTGAAGAAAGGTCCATGGACGGCGGCGGAAGATGAGATACTTGCT 
GCGTACGTTAGAGAGAACGGTGAAGGGAATTGGAACGCCGTTCAGAAAAACACAGGTTTG 
GCTCGTTGCGGCAAAAGCTGCCGTCTTCGATGGGCCAATCACCTCCGACCAAATCTGAAA 
AAAGGCTCTTTC^CCGGTGACGAAGAACGTCTCATCATTC^GCTTC^TGCTCAGCTTGGT 
AACAAATGGGCTCGCATGGCTGCTCAGTTACCGGGAAGAACAGACAACGAGATTAAGAAC 
TATTGGAACACGAGATTGAAACGACTTCTTCGCCAAGGACTTCCTCTTTATCCTCCAGAT 
ATTATCCCTAACCATCAACTCCATCCACATCCACATCATCAACAACAACAGCAACATAAC 
CATCATCATCATCATCATCAACAACAACAACAACATCAACAAATGTATTTTCAACCACAA 
TCTTCACAACGAAACACACCATCATCTTCCCCTCTTCCATCTCCAACACCAGCAAACGCA 
AAGTCCTC^TC^TCCTTCACTTTTCATACCACGACTGCTAACCTCCTCCATCCACTTAGC 
CCTCACACTCC3U^CAC^C<^TCTCAACTCTC^ 

TCTCCTTTATGTTCCCCTCGCAACAACCAATACCCGACCCTOCCCCTCTTTGCCCTCCCG 
CGTTCCCAAATCAACAACAACAACAACGGAAATTTCACTTTCCCTAGACCTCCA.CCTCTC 
CTTCAACCGCCTTCATCACTCTTCGCAAAACGTTACAACAATGCT 
TGCATCAACCGCGTCTCAACCGCACCATTTTCCCCTGTTTCAAGAGACTCCTACACI^ 
. TTTCTTACATTGCCTTACCCTTCCCCAACC^ 
AACCCTTACTCTTCCTCTCCTTCCTTCTCTTTAAACCCTTCTTCTTCTTCTO 
TCAACTTCTTCCCCAAGCTTTCTTCACTCCCATTACACT^ 
ACCAACCCAGTTTACTCCATGAAACAAGAGCAGCTCCCTT 
GATGGCTTCAATAACGTCAACAACTTCA(^GACAAC^ 

AGTTCCGGTGCTCA^AGAAGAAGTAGTAGCTGCAGCCTCTTAGAGGATGTCTTCGAAGAG 

GCCGAAGCTTTAGCCTCTGGAGGCAGAGGCCGACCTCCAAAACGAAGACAACTCACAGCT 

TCTCTTCCGAACCACAACAACAACACCAACAACAACGACAACTTCTTCT 

GGACATTATGATTCTTCTGACAACTTATGTTCCTTGCAAGATTTGAAATCAAAGGAAG^ 

GAGTCTCTTCAAATGAACACAATGCAGGAGGACATAGCTAAGCTTCTTGATTGGGGAAGT 

GATAGTGGAGAGATCTCTAATGGACAATCATCTGTTGTCACTC 

GATGTTCATCAATTAGCTTCACTATTCCCGGCTGATTCTA(^GCCGTCGTAGCCGC^C^ 
AACGACCAACACAACAAGAATAATAACAATAATTGTTCCTGGGATGACATGCAGGGAATA 
AGGTAG 

>G1246 Amino Acid Sequence (domain in AA coordinates: 27-139) 
MIMYGGGGAGKDGGSTNHLSDGGVILKKGPWTAAEDEILAAYVRENGEGNWNAVQKNTGL

ARCGKSCRLRWANHLRPNLKKGSFTGDEERLIIQLHAQLGNKWARMAAQLPGRTDNEIKN 

YWNTRLKRLLRQGLPLYPPDIIPNHQLHPHPHHQQQQQHNHHHHHHQQQQQHQQMYFQPQ 

SSQRNTPSSSPLPSPTPANAKSSSSFTFHTTTANLLHPLSPHTPNTPSQLSSTPPPPPLS 

SPLCSPRJTOQYPTLPLFALPRSQINNNNMGNFTFPRPPPLLQPPSSLFAKRYNNANTPLN 

CINRVSTAPFSPVSRDSYTSFLTLPYPSPTAQTATYHNTNNPYSSSPSFSLNPSSSSYPT 

STSSPSFLHSHYTPSSTSFHTNPVYSMKQEQLPSNQIPQIDGFNNVNWFTDNERQNHNLN 

SSGAHRRSSSCSLLEDVFEEAEALASGGRGRPPKRRQLTASLPNHNNNTNNNDNFFSVSF 

GHYDSSDNLCSLQDLKSKEEESLQMNTMQEDIAKLLDWGSDSGEISNGQSSWTDDNLVL 

DVHQLASLFPADSTAWAATNDQHNKNHNNHCSWDDMQGIR* 

>G1255 (138..1388)

CAGCTCAAACTCTCTAGGACTACACTAAATCTAACTTTTTGCAGAGAGCAAAAGATTCAA 

TAATTGAGATTGATCTCAAAACCAAAGCTCTCGTGCTCTTGTCGTTGATGTTGGTTGTGT 

AGACTTTGTATACAATGATGAAAAGTTTGGCGAATGCTGTTGGAGCGAAGACGGCGAGGG 

CTTGCGACAGCTGCGTGAAGAGACGTGCACGGTGGTACTGCGCGGCCGACGATGCTTTTC 

TTTGCCAGTCTTGCGACAGTTTGGTCCATTCAGCAAACCCTCTTGCTCGCCGCCACGAGA 

GAGTCCGTTTGAAGACGGCTAGCCCGGCGGTCGTAAAGCATAGCAACCACTCATCAGCTT

CTCCTCCACATGAGGTCGCCACGTGGCATCACGGGTTTACTCGTAAAGCTCGAACGCCAC 

GTGGCTCTGGTAAGAAAAACAATTCGTCGATATTTCATGACTTGGTTCCTGATATTAGTA 

TTGAGGATCAGACAGACAACTATGAGCTTGAAGAGCAGCTGATCTGTCAAGTGCCGGTTC 

TAGATCCGTTGGTGTCTGAGCAGTTCTTGAACGATGTCGTTGAGCCCAAGATCGAGTTTC 

CTATGATCAGAAGTGGTTTGATGATCGAGGAGGAGGAAGACAACGCTGAAAGTTGTCTTA 

ATGGATTTTTCCCGACCGACATGGAGCTTGAGGAGTTTGCTGCTGACGTGGAGACTCTGC 

TCGGTCGCGGGTTAGACACGGAGTCGTATGCCATGGAGGAGCTAGGGTTATCTAATTCAG 

AGATGTTCAAAATCGAAAAAGATGAGATTGAAGAAGAAGTAGAAGAGATAAAAGCCATGA 

GCATGGATATATTTGATGATGATCGAAAAGACGTGGATGGAACAGTACCGTTTGAGCTAA 

GCTTTGATTACGAGTCGTCACACAAGACGTCCGAAGAAGAGGTAATGAAGAACGTTGAAA 

GTAGTGGTGAATGTGTTGTTAAGGTGAAAGAGGAAGAACATAAGAATGTTCTGATGCTAA 

GATTAAACTATGACTCGGTGATATCCACTTGGGGAGGTCAAGGTCCACCGTGGAGTTCAG 

GAGAGCCACCGGAACGAGACATGGACATCAGCGGTTGGCCAGCCTTTTCCATGGTGGAGA 

ATGGAGGAGAAAGTACTCATCAGAAGCAATACGTTGGTGGATGTTTACCATCAAGTGGGT 

TTGGAGATGGAGGTAGAGAAGCTAGAGTTTCGAGATACAGAGAGAAGAGGAGGACAAGGT 

TGTTTTCTAAGAAGATACGGTACGAGGTACGTAAATTGAATGCAGAGAAAAGACCACGAA 

TGAAAGGAAGATTCGTGAAGAGAGCCTCGCTCGCTGCTGCTGCTTCACCATTAGGTGTTA 

ATTACTGAATAGTTAATATCTATTCATGTTATATCTCACTTTACAAATTTCGGTGAATCT 

TTTTTCTTCTGAAACAACAGAAGTTATTTTGGCACTTAATTGTGCTTTGAGGACTTGTAT 

GTACATAGAAGTAACCAATAATAATGTGACTTTTACTA 

>G1255 Amino Acid Sequence (domain in AA coordinates: 18-55)
MKSLANAVGAKTARACDSCVKRRARWYCAADDAFLCQSCDSLVHSANPLARRHERVRLKT
ASPAVVKHSNHSSASPPHEVATWHHGFTRKARTPRGSGKKNNSSIFHDLVPDISIEDQTD 
NYELEEQLICQVPVLDPLVSEQFLNDVVEPKIEFPMIRSGLMIEEEEDNAESCLNGFFPT 
DMELEEFAADVETLLGRGLDTESYAMEELGLSNSEMFKIEKDEIEEEVEEIKAMSMDIFD 
DDRKDVDGTVPFELSFDYESSHKTSEEEVMKNVESSGECVVKVKEEEHKNVLMLRLNYDS
VISTWGGQGPPWSSGEPPERDMDISGWPAFSMVENGGESTHQKQYVGGCLPSSGFGDGGR

EARVSRYREKRRTRLFSKKIRYEVRKLNAEKRPRMKGRFVKRASLAAAASPLGVNY* 
>G1304 (1..978) 

^^^^ AT ^ C ^ TGTTGCGAT ^ GMTGGTCT ^ GAAAG GGCCATGGACACAAGAG 
GAGGATGATAAACT-GATAGATCACATTCAAAAACATGGCCATGGCAGCTGGAGAGCTCTT 
C^GCAAGCCGGTTTAAACCGATCCGGAAAGAGTTCTAGATTAAGATGGAC^CTAC 

GACAA * GAAATAAAAAACTA ^^ 

GTGACCCATAGGCCAAGAACCG ACCATCTAAACGTTTTAGCAGCTCTCCCG 

CTCmGCTAAAGCTCAACTGC TACACACTATGATTCAAGTCCTTAGCACC 
™™™ CCMMI ^^ 

CXCTTTGGCCAAGCTTCTTACTTAGA^CCAAAATCTTTTTGGTCAGTCTCAAAACTTC 
Tf-TCACATTCTTGAGGATGAGAATTTGATGGTCAAAACCCAAATTATTGATAACCCTTTG
rS^SScTTCCCCCATACAACCCGGTTTTCAAGATGATCATAATTCACTCCCTCTA 

?StoggcgtSSgaagaatcta^^ 
atcgSSSccatcatcatgatgcttcaaacccttcatcatca^^ 

JaagatcatSSacccatggtgtgacactattgatgatggag 

AAAGAGATAATAGAGTAA

>G1304 Amino Acid Sequence (conserved domain in AA coordinates: 13-118)

T^n™NFTEEEEQTIINLHSLLGNI<WSSIAGNLPGRTDNEIKNYWNTHLRKKLLQMG 

SsspiqpgfS 
qdhhhpwcdtiddgasdsfwkeiie* 

I^^gaggaagccagaggtagccattgcagctagtactcaccaagtaaagaagatg 

aftr^CGGACTTTGGTCTCCTGAGGAAGACTCAAAGCTGATGCAATACATGTTAAGCAAT 



CAAACCAACAAT 

ACCCCTTATGTAGATGGTATCTA' 



fE \ACCAACAATCCATTTCCAACGGGAAACATGATCAGCCACCCGTGCAATGACGATTTT 
C ^ CC ^^!^^ ATCTATGGAGT AAACGCAGGGGTACAAGGGGAACTCTACTTC 
rGTGAAGAAGGTGATTGGTACAATGCAAATATAAACAACCACTTAGAC 



TGGGACCTTGACCAGTTGATGAACACTGAGGTTCCTTCGTTTTACTTCARCTTCAAACAA 
AGCATATGAATATTTTTACGTCATCTTATTCTTTTTTCTATTGCGGTI^ 

^SagSac^^^ 

taaaacatatatcctataataaataaataaargaaataataagcacataaaaaaaaaaaa 

>G1318 Amino Acid Sequence (domain in AA coordinates: 20-123)

mrkp^aSthqvkkmkkglwspeedsklmqymlsngqgcwsdvaknaglqrcgk^ 

LRWIOTlS^^GAFSPQEEDLIIRFHSILGNRWSQIAARIjPGRTDNEIKNFWNSTIIK 
^P^G^ISHPCNDDFTPYVDGIYGVNACTQGm.YFPPLECEEGDWYNANINNHLDEL 

ntngsgnapegmrpveefwdldqlmntevpsfyfnfkqsi* 

GAAGMC^TAAAGMCAAAAGGAGAGAGGTATTAAAAAATGATGTGTAGTCGAGGCCAW 

GGAGACOTGCAGARGACGAGAAGCTAAGAGAACTCGTCGAGCAATTTGGTCCTCM 

GGAACGCCA^GCTCAGAAGCTCTCTGGTCGATCTGGTAAGAGTTGTAGATTGAGATG^ 

CTAATCAATT^ATCCTAGGATTAACCGAAACCCTTTCACGGAGGAAGAAGJiAGAA 

T^^AG^CCTCATCGGATCCATGGGAACAGATGGTCTGTGATCGCTAGATTTTTTCCCG 



ftaGGGTCCAAGCTCCGTCCACGAGGCCTTGGCCATGATGGCACGGTGGCTGCGACTGGGA 

^a™^™™gactgcgataaggagagaagattggcaaccacaaccgctatca 

AT^CTCCTTATCAATTCTCTCATATTAATCATTTTCAAGTCCTCAAAGAGTCCTTG^ 

gaSaSggttcagaaatagtactactccaatacaagaaggagcaa^^ 
aac^accStggagttctacaattttctccaagtaaacacggattcgaagata 
tcatagataattcaagaaaagacgaagaagaagatctcgat 

ACGAGAATOGTGTTCCATTrTrCGACTTTTTGTCTGTTGG 

TATGTTAMTTGTCCGTACCACATGTACTATAAGGTGGACCATATGTTAACTAAAGATAA 
TGTAGAAAGTOCTAATCAATTAGAGCTCCTGTTTGAGCCAAATGTGAAAATTAGTT 

StccSca^c^tataacacatataaggttgtacttttatcaggtctaat^ 

CTATTTT^^TTAAGGATGTTrAATCAGACCCATAACCATTCGATAAAAAAAAAAAAAA 
>G1320 Amino Acid Sequence (domain in AA coordinates: 5-108)
MMCSRGHWRPAEDEKLRELVEQFGPHNWNAIAQKLSGRSGKSCRLRWFNQLDPRINRNPF
TEEEEERLLAPHRIHGNRWSVIARFFPGRTDNAVKNHWHVIMARRGRERSKLRPRGLGHD 
GTVAATGMIGNYKDCDKERRLATTTAINFPYQFSHINHFQVLKESLTGKIGFRNSTTPIQ 

EGAIDQTKRPMEFYNFLQVNTDSKIHELIDNSRKDEEEDVDQNNRIRNENCVPFFDFLSV 
GNSASQGLC* 

>G1330 (36..959) 

GTACCGGCGACCTCTTTGTGGGTCACTCTTCATCAATGGGTGACAAAGGAAGGAGCTTAA 
AGATCAACAAGAACATGGAGGAATTCACGAAAGTGGAAGAAGAAATGGACGTAAGGAGAG 
GTCCATGGACAGTTGAGGAAGATTTAGAGCTCATCAATTACATTGCTAGTCATGGTGAAG 



GTCGATGGAACTCTCTCGCTCGTTGCGCCGAACTCAAAAGGACCGGAAAAAGCTGCAGAC 

TTCGGTGGCTGAACTATCTCCGACCAGATGTGCGCCGTGGAAACATAACCCTCGAAGAAC 

.CTCTTGATTCTTGAACTTCACACACGTTGGGGCAATAGATGGTCTAAGATTGCACAAT 

..JTTACCAGGAAGAACGGATAACGAGATCAAAAACTATTGGAGAACACGTGTTC^VAAGC 

ATGCAAAACAGCTTAAATGCGACGTGAACAGTCAACAATTTAAAGACACCATGAAGTATC 

TTTGGATGCCTCGGCTCGTAGAAAGGATCCAAGCCGCGTCCATCGGGTCTGTTTCCATGT 

CATCTTGCGTCACCACCTCCTCAGATCAGTTCGTGATCAACAACAACAACACCAACAACG 

TGGATAATTTGGCTTTAATGAGTAACCCTAATGGTTACATCACGCCGGATAATTCCAGCG 

TGGCAGTATCTCCTGTATCAGATTTGACGGAGTGTCAAGTGAGTAGTGAAGTGTGGAAGA 

TTGGTCAGGATGAGAATTTGGTGGATCCAAAAATGACATCGCCGAATTATATGGATAATA 

GCAGTGGACTATTAAACGGAGATTTTACGAAGATGCAAGATCAAAGTGACCTTAATTGGT 

TTGAAAATATTAATGGGATGGTACCAAATTATTCGGACAGTTTTTGGAACATTGGAAATG 

ATGAAGACTTCTGGCTCTTACAACAACATCAACAAGTCCACGACAATGGAAGCTTCTGAA 
TAGACAAGA^GCTATGCGGCC 

>G1330 Amino Acid Sequence (domain in AA coordinates: 28-134)
MGDKGRSLKINKNMEEFTKVEEEMDVRRGPWTVEEDLELINYIASHGEGRWNSLARCAEL 
KRTGKSCRLRWLNYLRPDVRRGNITLEEQLLILELHTRWGNRWSKIAQYLPGRTDNEIKN 
YWRTRVQKHAKQLKCDVNSQQFKDTMKYLWMPRLVERIQAASIGSVSMSSCVTTSSDQFV 
INNNNTNNVDNLALMSNPNGYITPDNSSVAVSPVSDLTECQVSSEVWKIGQDENLVDPKM
TSPNYMDNSSGLLNGDFTKMQDQSDLNWFENINGMVPNYSDSFWNIGNDEDFWLLQQHQQ
VHDNGSF*

>G1352 (79..900)

GCGCGATTAAAAACTCTCAACTTTTCTCTCAAATTTCTGATCCTTTGATCCAACAGTTAG 
AAGAAGATTCATCTGATCATGGCCCTCGAAGCGATGAACACTCCAACTTCTTCTTTCACC 
AGAATCGAAACGAAAGAAGATTTGATGAACGACGCCGTTTTCATTGAGCCGTGGCTTAAA 
CGCAAACGCTCCAAACGTCAGCGTTCTCACAGCCCTTCTTCGTCTTCTTCCTCACCGCCT 
CGATCTCGACCCAAATCCCAGAATCAAGATCTTACGGAAGAAGAGTATCTCGCTCTTTGT 
CTCCTCATGCTCGCTAAAGATCAACCGTCGOUyvCGCGATTTCATCAACAGTCGCAATCG 
TTAACGCCGCCGCCAGAATCAAAGAACCTTCCGTACAAGTGTAACGTCTGTGAAAAAGCG 
^CCTTCCTATCAGGCTTTAGGCGGTCACAAAGCAAGTCACCGAATCAAACCACCAACC 
GTAATCTCAACAACCGCCGATGATTCAACAGCTCCGACCATCTCCATCGTCGCCGGAGAA 
AAACATCCGATTGCTGCCTCCGGAAAGATCCACGAGTGTTCAATCTGTCATAAAGTGTTT 
CCGACGGGTC^GCTTTAGGCGGTCACAAACGTTGTCACTACGAAGGCAACCTCGGCGGC 
GGAGGAGGAGGAGGAAGCAAATCAATCAGTCACAGTGGAAGCGTGTCGAGCACGGTATCG 
™^ TCAGC ^ CCGTCGATO ^ TCGATCTAAACCTA CCGGCGTTACCTGAACTCAGC 
CTTCATCACAATCCAATCGTCGACGAAGAGATCTTGAGTCCGTTGACCGGTAAAAAACCG 
CTTTTGTTGACCGATCACGACCAAGTCATCAAGAAAGAAGATTTATCTTTAAAAATCTAA 
TACTCGACTATTAATOCTTGTGTGATTTTTTTCGTTACAACCATAGTTTCATTTTCATTT 
TTTTAGTTACAAATTTTTAATTGTTCTGATTTGGATTGAAA 



>G1352 Amino Acid Sequence (domain in AA coordinates: 108-129, 167-188)

MALEAMNTPTSSFTRIETKEDLMNDAVFIEPWLKRKRSKRQRSHSPSSSSSSPPRSRPKS 

QNQDLTEEEYLALCLLMLAKDQPSQTRFHQQSQSLTPPPESKNLPYKCNVCEKAFPSYQA

LGGHKASHRIKPPTVISTTADDSTAPTISIVAGEKHPIAASGKIHECSICHKVFPTGQAL

GGHKRCHYEGNLGGGGGGGSKSISHSGSVSSTVSEERSHRGFIDLNLPALPELSLHHNPI 
VDEEILSPLTGKKPLLLTDHDQVIKKEDLSLKI*



>G1354 (1..1047) 
ATGGAAAGTCTCGCACACATTCCTCCCGGTTATCGATTCCATCCGACCGATGAAGAACTC 
GTTGACTATTATCTCAAGAACAAAGTTGCATTCCCGGGAATGCAAGTTGATGTTATCAAA
GATGTTGATCTCTACAAAATCGAGCCATGGGACATCCAAGAGTTATGTGGAAGAGGGACA 
GGAGAAGAGAGGGAATGGTATTTCTTTAGCCACAAGGACAAGAAATATCCAACTGGGACA 
CGAACCAATAGAGCAACGGGCTCCGGATTTTGGAAAGCAACGGGTCGAGACAAGGCCATT 
TACTCAAAGCAAGAGCTTGTTGGGATGAGGAAGACTCTTGTCTTTTACAAAGGTAGGGCC 
CCAAATGGTCAGAAATCTGATTGGATAATGCACGAATACCGTCTTGAGACCGATGAAAAT 
GGACCGCCTCATGAGGAAGGATGGGTGGTTTGTCGCGCTTTCAAGAAGAAGCTAACCACG 
ATGAACTACAACAATCCAAGAACAATGATGGGATCATCATCAGGCCAAGAATCTAACTGG 
TTCACGCAGCAAATGGATGTGGGGAATGGTAATTACTATCATCTTCCTGATCTAGAGAGT 
CCGAGAATGTTTCAAGGCTCATCATCATCATCACTATCATCATTACATCAGAATGATCAA 
GACCCTTATGGTGTCGTACTCAGCACTATTAACGCAACCCCAACTACAATAATGCAACGA 
GATGATGGTCATGTGATTACCAATGATGATGATCATATGATCATGATGAACACAAGTACT 
GGTGATCATCATCAATCAGGATTACTAGTCAATGATGATCATAATGATCAAGTAATGGAT 
TGGCAAACGCTTGACAAGTTTGTTGCTTCTCAGCTAATCATGAGCCAAGAAGAGGAAGAA 
GTTAACAAAGATCCATCAGATAATTCTTCGAATGAAACATTTCATCATCTCTCTGAAGAG 
CAAGCTGCAACAATGGTTTCGATGAATGCTTCTTCCTCTTCTTCTCCATGTTCCTTCTAC 

TCTTGGGCTCAAAATACACACACGTAA 

>G1354 Amino Acid Sequence (domain in AA coordinates: TBD) 

MESLAHIPPGYRFHPTDEELVDYYLKNKVAFPGMQVDVIKDVDLYKIEPWDIQELCGRGT

GEEREWYFFSHKDKKYPTGTRTNRATGSGFWKATGRDKAIYSKQELVGMRKTLVFYKGRA

PNGQKSDWIMHEYRLETDENGPPHEEGWVVCRAFKKKLTTMNYNNPRTMMGSSSGQESNW

FTQQMDVGNGNYYHLPDLESPRMFQGSSSSSLSSLHQNDQDPYGVVLSTINATPTTIMQR

DDGHVITNDDDHMIMMNTSTGDHHQSGLLVNDDHNDQVMDWQTLDKFVASQLIMSQEEEE

VNKDPSDNSSNETFHHLSEEQAATMVSMNASSSSSPCSFYSWAQNTHT* 

>G1360 (1..1257) 

ATGGGAGATAGAAACAACGACGGTGATCAGAAAATGGAGGATGTATTGTTGCCCGGATTT 

AGGTTTCATCCAACCGACGAAGAGCTCGTAAGCTTCTACCTGAAGCGGAAGGTTCAACAC 

AACCCTCTCTCCATTGAGCTCATAAGACAACTCGATATCTACAAATATGACCCCTGGGAT 

CTTCCAAAGTTTGCGATGACGGGTGAAAAAGAATGGTACTTTTATTGTCCAAGGGACAGG 

AAGTATAGGAACAGCTCGAGGCCAAACCGAGTGACCGGAGCTGGTTTTTGGAAAGCCACG

GGAACGGACCGGCCGATATACTCGTCAGAAGGAAACAAATGCATAGGTTTAAAGAAGTCC 

TTAGTGTTCTACAAAGGAAGAGCAGCGAAAGGAGTTAAGACTGATTGGATGATGCATGAG 

TTTCGTTTGCCTTCTCTCTCCGAACCATCTCCTCCTTCTAAGAGATTCTTCGACTCTCCT 

GTCTCTCCCAACGATTCATGGGCTATATGCAGAATCTTCAAAAAGACCAACACAACGACC 

CTAAGAGCTCTCTCTCACTCTTTTGTTTCCTCGTTACCACCAGAAACAAGCACCGACACA 

ATGTCTAACCAAAAGCAATOUVACACATACCATTTTTCTTCAGACAAGATCCTCAAACCT 

AGCTCTCACTTCCAGTTTCACCATGAGAATATGAACACTCCCAAAACTAGTAATAGTACA

ACTCCATCCGTTCCCACTATAAGTCCCTTCTCTTACTTGGATTTCACTTCATACGACAAA 

CCCACCAACGTTTTCAATCCGGTTTC^TC 

CTTGCCACACAAGAAAGACAACCTCAGTTTCCC^^ 

TCGTTTCTGCTAAACACGTCTTCAGATTCGACCTTCTTGGGAGAATTCACGAGCCATATC 

GACCTCAGCGCAGTGTTGGCCCAAGAGCAATGTCCCCCGCTTGTAAGCCTACCACAGGAG 

TATCAAGAGACGGGATTCGAAGGAAATGGTATAATGAAGAACATGCGTGGTTCCAATGAA

GATCATCTTGGTGATCATTGCGACACACTTCGGTTTGATGATTTCACTTCAACAATTAAT 

GAGAACCATCGTCATCATCAAGACCTGAAACAGAACATGACATTGCTGGAGAGTTATTAT 

TCTTCTTTATCGTCCATCAATAGCGATTTGCCAGCTTGTTTCTCCAGTACAACCTGA 

>G1360 Amino Acid Sequence (conserved domain in AA coordinates: 18-174)

MGDRNNDGDQKMEDVLLPGFRFHPTDEELVSFYLKRKVQHNPLSIELIRQLDIYKYDPWD 

LPKFAMTGEKEWYFYCPRDRKYRNSSRPNRVTGAGFWKATGTDRPIYSSEGNKCIGLKKS

LVFYKGRAAKGVKTDWMMHEFRLPSLSEPSPPSKRFFDSPVSPNDSWAICRIFKKTNTTT

LRALSHSFVSSLPPETSTDTMSNQKQSNTYHFSSDKILKPSSHFQFHHENMNTPKTSNST
TPSVPTISPFSYLDFTSYDKPTNVFNPVSCLDQQYLTNLFLATQETQPQFPRLPSSNEIP
SFLLNTSSDSTFLGEFTSHIDLSAVLAQEQCPPLVSLPQEYQETGFEGNGIMKNMRGSNE
DHLGDHCDTLRFDDFTSTINENHRHHQDLKQNMTLLESYYSSLSSINSDLPACFSSTT*
>G1364 (1..537) 

ATGGCGGAGTCGCAGGCCAAGAGTCCCGGAGGCTGTGGAAGCCATGAGAGTGGTGGAGAT 
(^AAGTCC^GGTCGTTACATGTTCGTGAGCAAGATAGGTTTCTTCCGAT^ 
AGCCGTATCATGAAAAGAGGTCTTCCTGCTAATGGGAAAATCGCTAAAGATGCTAAGGAG

ATTGTGCAGGAATGTGTCTCTGAATTCATCAGTTTCGTCACCAGCGAAGCGAGTGATAAA 

TGTCAAAGAGAGAAAAGGAAGACTATTAATGGAGATGATTTGCTTTGGGCAATGGCTACT 

TTAGGATTTGAAGACTACATGGAACCTCTCAAGGTTTACCTGATGAGATATAGAGAGGGT 

GACACAAAGGGATCAGCAAAAGGTGGGGATCCAAATGCAAAGAAAGATGGGCAATCAAGC 

CAAAATGGCCAGTTCTCGCAGCTTGCTCACCAAGGTCCTTATGGGAACTCTCAAGTAACT

TTTCCTCTCTTCTCTTCACACTCAAGCAATACGCATCATTCTCTTCTAATTTGTTAA 

>G1364 Amino Acid Sequence (conserved domain in AA coordinates: 29-120)

MAESQAKSPGGCGSHESGGDQSPRSLHVREQDRFLPIANISRIMKRGLPANGKIAKDAKE 

IVQECVSEFISFVTSEASDKCQREKRKTINGDDLLWAMATLGFEDYMEPLKVYLMRYREG 

DTKGSAKGGDPNAKKDGQSSQNGQFSQLAHQGPYGNSQVTFPLFSSHSSNTHHSLLIC* 
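
The entries in this listing pair a nucleotide sequence, whose header gives a region such as (1..537), with the amino acid sequence it encodes. The following is a minimal Python sketch of how such a region could be translated for cross-checking, assuming the coordinates are 1-based and inclusive and that the standard genetic code applies. The function name `translate_cds` is hypothetical and is not part of the listing.

```python
from itertools import product

# Standard genetic code, with codons enumerated in the order TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_cds(nucleotides: str, start: int, end: int) -> str:
    """Translate the region start..end (assumed 1-based, inclusive)."""
    cds = nucleotides.upper()[start - 1:end]
    if len(cds) % 3 != 0:
        raise ValueError("coding region length is not a multiple of three")
    # Codons that cannot be read (for example, damaged characters) become 'X'.
    return "".join(
        CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, len(cds), 3)
    )


# First 30 bases of the G1364 nucleotide sequence listed above.
g1364_start = "ATGGCGGAGTCGCAGGCCAAGAGTCCCGGA"
print(translate_cds(g1364_start, 1, 30))  # MAESQAKSPG
```

The printed residues match the first ten residues of the G1364 amino acid sequence given above.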

>G1379 (68..622) 

CTCTGCCTCTCTCTCTCTCTCAAAACCCATCTCGAAAGTCTTTCTCTTTCGAGGGTTTAG
ATCCTCCATGGAAGGCGGCGGAGTTGCTGACGTGGCTGTCCCCGGTACGAGGAAGAGAGA 
CAGACCTTACAAAGGAATTAGGATGAGGAAGTGGGGAAAGTGGGTGGCGGAGATTCGTGA 
GCCTAACAAGCGCTCTAGGTTATGGCTTGGCTCTTACTCTACTCCCGAGGCGGCGGCGCG 
AGCTTACGACACGGCGGTTTTCTATCTTAGAGGACCTACGGCGAGGCTTAACTTCCCTGA 
GCTTCTTCCTGGGGAGAAATTCTCCGACGAGGATATGTCGGCTGCGACCATCAGGAAGAA 
AGCCACGGAGGTCGGTGCTCAGGTTGATGCTTTGGGCACGGCGGTGCAAAATAACCGCCA 
CCGTGTTTTTGGTCAGAATCGAGATAGTGATGTGGATAATAAGAATTTTCATCGGAATTA 
TCAAAACGGTGAACGAGAAGAAGAAGAAGAAGATGAGGATGACAAGAGATTGAGGAGTGG 
CGGCCGGTTATTGGATCGGGTTGACTTGAATAAATTACCCGACCCGGAAAGCTCCGATGA 
AGAATGGGAAAGCAAACATTAAAAATATATAGTTTGGAGCGGTGGCTGTTGCTAACGTAC 
GCCAACGGCTTGCTTCTACGAATCATTAGCGCCGTTTA 

CATTATCTGAAAATTTAGGGCTTTTTAGTTATTAATTTTTGTTTTGTTTTTTTCCTTTCT 

TGCGAGTTTTGCGGTTTATGGAATTTTAGGCTATTGCTTAACGAAAAAAAAAAAAAAAA 

>G1379 Amino Acid Sequence (domain in AA coordinates: 18-85) 

MEGGGVADVAVPGTRKRDRPYKGIRMRKWGKWVAEIREPNKRSRLWLGSYSTPEAAARAY

DTAVFYLRGPTARLNFPELLPGEKFSDEDMSAATIRKKATEVGAQVDALGTAVQNNRHRV

FGQNRDSDVDNKNFHRNYQNGEREEEEEDEDDKRLRSGGRLLDRVDLNKLPDPESSDEEW

ESKH* 

>G1384 (33..977)

GTACATTTTTTTTTGTATTTCAGGAAACTCCGATGGCGGATCTCTTCGGTGGTGGCCACG 

GCGGCGAGCTTATGGAAGCACTTCAACCTTTTTACAAAAGTGCTTCCACGTCTGCTTCAA 

ATCCTGCGTTTGCGTCCTCAAACGATGCGTTTGCGTCTGCCCCAAACGACCTATTTTCTT 

CTTCTTCTTACTATAATCCTCATGCATCTTTATTCCCTTCACATTCCACAACCTCTTACC 

CGGATATTTATTCTGGATCCATGACCTATCCATCTTCATTCGGGTCGGATCTTCAACAAC 

CCGAAAACTACCAATCTCAGTTCCATTACCAAAACACTATCACTTACACTCACCAAGACA

ACAACACTTGCATGCTTAACTTCATTGAGCCGAGCCAACCGGGTTTTATGACCCAACCGG

GTCCGAGTTCGGGTTCGGTTTCAAAACCGGCTAAGCTCTATAGAGGAGTGAGGCAAAGAC 

ATTGGGGAAAATGGGTCGCGGAGATCCGTTTACCCAGGAACCGAACCCGACTTTGGCTCG 

GAACATTCGACACGGCTGAAGAAGCCGCGTTGGCTTATGATCGCGCCGCGTTTAAGCTTC 

GTGGTGACTCGGCTCGGCTTAACTTCCCAGCTCTCCGATACCAAACCGGCTCGTCTCCGT

CTGATACCGGCGAATATGGTCCTATTGAAGCTGCCGTAGACGCTAAACTAGAAGCCATAT 

TAGCTGAGCCGAAGAATCAGCCGGGCAAAACGGAGAGGACGTCGAGGAAACGAGCTAAAG 

CCGCGGCTTCTTCAGCTGAGCAGCCGTCAGCGCCACAACAACATTCCGGGTCGGGTGAAA 

GTGATGGGTCGGGTTCACCGACTTGGGATGTTATGGTGCAGGAGATGTGCCAAGAGCCAG 

AGATGCCATGGAATGAAAATTTCATGCTCGGCAAGTGTCCTTCTTATGAGATAGATTGGG 

CTTCAATTTTATCGTGAAAAATTAGGATTCAATTCATTTTTATTC^ 

TATTTTCTTTTAAACTTTAGGGTTATTAGCTGTGCGTAA 

>G1384 Amino Acid Sequence (domain in AA coordinates: TBD) 

MADLFGGGHGGELMEALQPFYKSASTSASNPAFASSNDAFASAPNDLFSSSSYYNPHASL 

FPSHSTTSYPDIYSGSMTYPSSFGSDLQQPENYQSQFHYQNTITYTHQDNNTCMLNFIEP 

SQPGFMTQPGPSSGSVSKPAKLYRGVRQRHWGKWVAEIRLPRNRTRLWLGTFDTAEEAAL 

AYDRAAFKLRGDSARLNFPALRYQTGSSPSDTGEYGPIQAAVDAKLEAILAEPKNQPGKT

ERTSRKRAKAAASSAEQPSAPQQHSGSGESDGSGSPTSDVMVQEMCQEPEMPWNENFMLG

KCPSYEIDWASILS* 
>G1399 (261..1475)

AGGTCGAATTTTCTGAAATTAAGATTCATTCCTCCATGGAAGAAGCTCTGTTTTTATTCT 
CTTTAGCTTAGCTTAGCTTCTACTGATCTGTTTTTGCTACAAAATCCCATCTTTTTCTTT 
AAAACTCTTTATCTCTGAATCTTGAGTTTCTTGTAGAAGAAGTiAGCAATTTTGAATCTTT 
CGTAATCATAAAGATTCGTGGAGGATCTCTACTGATTTGTCGGAATCTCTCACTACAGAA 
TCACTTGATCTTATGTCCGGATGGAGGAGAGAGAAGGAACCAACATCAACAACAACATCA 
CTAGCAGTTTCGGCTTGAAGCAGCAACATGAAGCTGCTGCTTCTGATGGTGGTTACTCAA 
TGGACCCACCACCAAGACCCGAAAACCCTAACCCGTTTTTAGTCCCACCCACTACTGTCC 
CCGCGGCCGCCACCGTAGCAGCAGCTGTTACTGAGAATGCGGCTACTCCGTTTAGCTTAA 
CAATGCCGACGGAGAACACTTCAGCTGAGCAGCTGAAAAAGAAGAGAGGTAGGCCGAGAA 
AGTATAATCCCGATGGGACTCTTGTCGTGACTTTATCGCCGATGCCAATCTCGTCCTCTG 
TTCCGTTGACGTCGGAGTTTCCTCCAAGGAAACGAGGAAGAGGACGTGGCAAGTCTAATC 
GATGGCTCAAGAAGTCTCAAATGTTCCAATTCGATAGAAGTCCTGTTGATACCAATTTGG 
CAGGTGTAGGAACTGCTGATTTTGTTGGTGCCAACTTTACACCTCATGTACTGATCGTCA 
ACGCCGGAGAGGATGTGACGATGAAGATAATGACATTCTCTCAACAAGGATCTCGTGCTA 
TCTGCATCCTTTCAGCTAATGGTCCCATCTCCAATGTTACGCTTCGTCAATCTATGACAT 
CCGGTGGTACTCTAACTTATGAGGGTCGTTTTGAGATTCTCTCTTTGACGGGTTCGTTTA 
TGCAAAATGACTCTGGAGGAACTCGAAGTAGAGCTGGTGGTATGAGTGTTTGCCTTGCAG 
GACCAGATGGTCGTGTCTTTGGTGGAGGACTCGCTGGTCTCTTTCTTGCTGCTGGTCCTG 
TCCAGGTAATGGTAGGGACTTTTATAGCTGGTCAAGAGCAGTCACAGCTGGAGCTAGCAA 
AAGAAAGACGGCTAAGATTTGGGGCTCAACCATCTTCTATCTCCTTTAACATATCCGCAG 
AAGAACGGAAGGCGAGATTCGAGAGGCTTAACAAGTCTGTTGCTATTCCTGCACCAACCA 
CTTCATACACGCATGTAAACACAACAAATGCGGTTCACAGTTACTATACAAACTCGGTTA 
ACCATGTCAAGGATCCCTTCTCGTCTATCCCAGTAGGAGGAGGAGGAGGTGGAGAGGTAG 
GAGAAGAAGAGGGTGAAGAAGATGATGATGAATTAGAAGGTGAAGACGAAGAATTCGGAG 
GCGATAGCCAATCTGACAACGAGATTCCGAGCTGATGATGATCATACGGTTTCTTTTCGC 
GGATTTGTTAGGTTTGATGGATTTCAGATTTTGGTTGATTGTTTTTATTAACACAGAATG 
TTTAGAAGCTGCTATCTTTAGGTTCCCATCCTCTTGTGATTGTTGAGTATCCTTGTTAGA 
AACAAACTTACTGTTGCAAAACTCTCTTCAAAAAAGTTO 

>G1399 Amino Acid Sequence (domain in AA coordinates: 86-93) 

MEEREGTNINNNITSSFGLKQQHEAAASDGGYSMDPPPRPENPNPFLVPPTTVPAAATVA 

AAVTENAATPFSLTMPTENTSAEQLKKKRGRPRKYHPDGTLVVTLSPMPISSSVPLTSEF 

PPRKRGRGRGKSNRWLKKSQMFQFDRSPVDTNLAGVGTADFVGANFTPHVLIVNAGEDVT

MKIMTFSQQGSRAICILSANGPISNVTLRQSMTSGGTLTYEGRFEILSLTGSFMQNDSGG

TRSRAGGMSVCLAGPDGRVFGGGLAGLFLAAGPVQVMVGTFIAGQEQSQLELAKERRLRF

GAQPSSISFNISAEERKARFERLNKSVAIPAPTTSYTHVNTTNAVHSYYTNSVNHVKDPF

SSIPVGGGGGGEVGEEEGEEDDDELEGEDEEFGGDSQSDNEIPS* 

>G1415 (60..680)

CCTTATC^CTmCCAAAAGTCGTCACATAATATCACTTTCGAGTTATCAACATCCGTACA 
TGTCATCCATAGAGCCAAAAGTAATGATGGTTGGTGCTAATAAGAAACAACGAACCGTCC 
AAGCTAGTTCGAGGAAAGGTTGTATGAGAGGAAAAGGTGGACCCGATAACGCGTCTTGCA 
CTTACAAAGGTGTTAGACAACGCACTTGGGGCAAATG 

ACCGAGGAGCTCGTCTTTGGCTCGGTACCTTCGACACCTCCCGTGAAGCTGCCTTGGCTT 

ATGACTCCGCAGCTCGTAAGCTCTATGGGCCTGAGGCTCATCTCAACCTCCCTGAGTCCT 

TAAGAAGTTACCCTAAAACGGCGTCGTCTCCGGCGTCCCAGACTACACCAAGCAGCAACA 

CCGGTGGAAAAAGCAGCAGCGACTCTGAGTCGCCGTGTTCATCCAACGAGATGTCATCAT 

GTGGAAGAGTGACAGAGGAGATATCATGGGAGCATATAAACGTGGATTTGCCGGTAATGG 

ATGATTCTTC7U^TATCGGAAGAAGCTACAATGTCGTTAGGATTTCCATGGGTTCATGAAG 

GAGATAATGATATTTCTCGGTTTGATACTTGTATTTCCGGTGGCTATTCTAATTGGGATT 

CCTTTCATTCCCCACTTTGAGGTGTCACTAGACTCT 

CAAACTACATATATATAOVAATA^^ 

CCAGTTAC^TGTACTTATATATGTGCA(^TCTATATATGTGGTTTGTCTGTATAGTGTGA 
AAGCAGATTCTTACCATATCA 

>G1415 Amino Acid Sequence (domain in AA coordinates: TBD) 
MSSIEPKVMMVGANKKQRTVQASSRKGCM^ 

l^GARLWLGTFDTSREAALAYDSAARKLYGPEAHLNLPESLRSYPKTASSPASQTTPSSN
TGGKSSSDSESPCSSNEMSSCGRVTEEISWEHINVDLPVMDDSSIWEEATMSLGFPWVHE
GDNDISRFDTCISGGYSNWDSFHSPL*
>G1417 (32..1501)

TCTATCTCTATCTATCTCTCTTTGTCTGCAAATGGAAGAACATATTCAAGATCGCCGTGA 
AATTGCGTTCTTACACTCAGGAGAATTTCTCCACGGAGATTCTGACTCAAAGGATCATCA 
ACCGAACGAGTCTCCGGTGGAACGTCATCACGAGTCGTCTATCAAAGAAGTTGATTTCTT 
CGCTGCTAAAAGTCAGCCGTTTGATCTTGGTCATGTGAGAACAACGACGATCGTTGGATC
ATCTGGTTTTAATGATGGATTAGGTTTGGTAAATTCATGTCATGGAACATCAAGCAATGA
TGGCGATGACAAAACCAAAACTCAAATTAGTAGACTGAAGTTGGAGCTAGAGAGGCTTCA 
CGAGGAGAATCACAAACTGAAGCATTTATTAGATGAGGTCAGTGAGAGTTACAACGACCT 
CCAAAGAAGAGTTTTGTTAGCAAGACAAACACAAGTGGAAGGTCTTCATCATAAACAACA 
TGAGGATGTACCTCAAGCTGGTTCCTCACAAGCTCTAGAGAACAGAAGACCAAAGGATAT 
GAACCATGAAACTCCGGCCACCACCTTGAAACGACGGTCTCCAGACGACGTGGATGGTCG 
TGATATGCACCGAGGATCACCAAAAACTCCTCGAATAGACCAAAACAAGAGTACTAATCA 
TGAAGAACAACAT^AACCCTCATGATCAATTACCCTATAGAAAAGCTAGGGTTTCCGTTAG 
AGCTAGATCTGATGCCACTACGGTAAATGACGGATGTCAATGGAGAAAATACGGTCAGAA 
AATGGCGAAAGGGAATCCATGTCCTCGCGCTTATTATCGTTGCACCATGGCCGTTGGATG 
TCCTGTCCGTAAACAGGTCCAACGATGCGCGGAGGATACAACTATCTTGACAACAACGTA 
CGAAGGAAACCATAACCATCCTCTTCCCCCGTCAGCCACAGCCATGGCTGCAACCACCTC 
CGCCGCAGCAGCCATGCTCTTATCAGGCTCCTCCTCCAGCAACCTCCACCAAACACTCTC 
TAGCCCCTCCGCCACGTCATCATCATCCTTCTACCATAACTTCCCATACACCTCCACAAT 
CGCAACACTCTCTGCCTCAGCTCCTTTCCCCACCATAACCTTAGACCTCACCAACCCACC 
TCGACCGCTACAACCGCCACCGCAGTTTCTAAGCCAGTATGGTCCCGCCGCGTTTTTACC 
AAACGCTAATCAAATTAGGTCTATGAATAATAATAACCAGCAGTTATTAATACCTAATTT 
GTTTGGCCCACAAGCCCCACCACGTGAAATGGTCGATTCAGTTAGGGCTGCGATTGCGAT 
GGATCCGAACTTCACGGCGGCACTTGCGGCCGCGATCTCAAACATTATCGGAGGAGGTAA 
TAACGACAACAATAATAATACTGATATTAATGATAACAAGGTTGATGCAAAAAGTGGAGG 
GAGTAGTAACGGAGATTCGCCACAGCTTCCTCAGTCTTGCACCACTTTCTCTACAAACTA 
ATTTTACTACCATTATTATATGTTATCTTATTATATATTACACACACATATTATACATTA 
TGCGTATCTTAAGTTTTTTTTTGGGGGCCATTATATATGAATGATATGGAGATCACTGAG 



>G1417 Amino Acid Sequence (domain in AA coordinates: 239-296) 

MEEHIQDRREIAFLHSGEFLHGDSDSKDHQPNESPVERHHESSIKEVDFFAAKSQPFDLG 

HVRTTTIVGSSGFNDGLGLVNSCHGTSSNDGDDKTKTQISRLKLELERLHEENHKLKHLL

DEVSESYNDLQRRVLLARQTQVEGLHHKQHEDVPQAGSSQALENRRPKDMNHETPATTLK 

RRSPDDVDGRDMHRGSPKTPRIDQNKSTNHEEQQNPHDQLPYRKARVSVRARSDATTVND 

GCQWRKYGQKMAKGNPCPRAYYRCTMAVGCPVRKQVQRCAEDTTILTTTYEGNHNHPLPP

SATAMAATTSAAAAMLLSGSSSSNLHQTLSSPSATSSSSFYHNFPYTSTIATLSASAPFP

TITLDLTNPPRPLQPPPQFLSQYGPAAFLPNANQIRSMNNNNQQLLIPNLFGPQAPPREM

VDSVRAAIAMDPNFTAALAAAISNIIGGGNNDNNNNTDINDNKVDAKSGGSSNGDSPQLP

QSCTTFSTN* 

>G1442 (1..1293)

ATGGGAACAAGAGCAGAACGCAAGGAAGATTTTGTTGGTGGGTTTGGATTTGGTGTTGTA 
GAAAATTCGCATAAAGACGTTATGGTGCTACCTCATCAT 

TC^CCTTCCTCTTCTTCTTTGTGTTACTGTTCTGCI^GTGTTAGCGATCCC^TGTTCTCT 

GTTTCTAGCAATCAGGCTTACACTTCTTCTCACAGTGGTATGTTCACACCCGCCGGTTCT 

GGTTCTGCTGCTGTGACTGTAGCAGATCCTTTTTTCTCCTTGAGCTCTTCAGGGGAAATG 

AGAAGAAGTATGAACGAAGATGCTGGTGCAGCTTTCAGCGAAGCTCAATGGCATGAGCTT 

GAGAGGCAGAGGAA^ATATACAAGTACATGATGGCTTCTGTTCCTGTTCCTCCAGAGCTT 

CTCACACCCTTTCCCAAGAACC^CCAATCAAACACTAACCCGGATC 

GCGAC7VGGAGGCTCATTGCAGCTGGGGATTGCTTCAAGCGCAAGCAATAACACGGCTC 

CTGGAGCCATGGAGGTGCAAGAGAACAGATGGGAAGAAATGGAGGTGCTCTAGAAACGTG 

ATTCCTGATCAGAAATACTGTGAGAGACACACACACAAGAGCCGTCCTCGTTCAAGAAAG 

CATGTGGAATCATCTCACCAATCATCTCACCACAATGACATTCGTACGGCTAAGAATGAT 

ACTAGCCAGCTTGTGAGAACTTATCCTCAGTTTTACGGACAACCTATAAGCCAGATCCCT 

GTGCTTTCTACTCTTCCGTCTGCCTCCTCTCCATATGATCACCACAGAGGACTGAGGTGG 

TTTACGAAAGAAGATGATGCCATTGGAACCTTAAACCCGGAGACTCAAGAAGCTGTCCAG 

CTGAMGTTGGATCAAGCAGAGAGCTCAAACGGGGATTCGATTATGATCTGAATTTCAGG 
CAGAAAGAGCCAATAGTAGACCAGAGCTTTGGAGCATTGCAGGGTCTATTAAGTCTAAAC
CAGACACCACAACATAACCAAGAAACAAGACAGTTTGTTGTAGAAGGAAAGCAAGATGAA 
GCGATGGGAAGCTCTCTGACACTCTCAATGGCTGGAGGAGGCATGGAGGAAACAGAGGGA 
ACAAACCAGCATCAGTGGGTTAGCCATGAAGGTCCATCATGGCTCTATTCAACAACACCA 
GGTGGACCATTGGCTGAAGCACTGTGTCTCGGTGTCTCCAACAACCCAAGTTCTAGTACT 
ACTACTAGTAGCTGCAGCAGAAGCTCAAGCTAA 

>G1442 Amino Acid Sequence (domain in AA coordinates: 172-223) 

MGTRAERKEDFVGGFGFGVVENSHKDVMVLPHHHYYPSYSSPSSSSLCYCSAGVSDPMFS

VSSNQAYTSSHSGMFTPAGSGSAAVTVADPFFSLSSSGEMRRSMNEDAGAAFSEAQWHEL 

ERQRNIYKYMMASVPVPPELLTPFPKNHQSNTNPDVTVAVATGGSLQLGIASSASNNTAD 

LEPWRCKRTDGKKWRCSRNVIPDQKYCERHTHKSRPRSRKHVESSHQSSHHNDIRTAKND 

TSQLVRTYPQFYGQPISQIPVLSTLPSASSPYDHHRGLRWFTKEDDAIGTLNPETQEAVQ 

LKVGSSRELKRGFDYDLNFRQKEPIVDQSFGALQGLLSLNQTPQHNQETRQFVVEGKQDE

AMGSSLTLSMAGGGMEETEGTNQHQWVSHEGPSWLYSTTPGGPLAEALCLGVSNNPSSST 

TTSSCSRSSS* 

>G1454 (86..1180)

CTAGTAGTGATGATATGATCGCTTCTTCTCCTACAATCTCAGAAACCTCCGATCACGGTT 

TTAGATATCTTCTACAACGGATACAATGGAGAGCACCGATTCTTCCGGTGGTCCACCACC 

GCCACAACCTAACCTTCCTCCAGGCTTCCGGTTTCACCCTACCGACGAAGAGCTTGTTGT 

TCACTACCTCAAACGCAAAGCAGCCTCTGCTCCTTTACCTGTCGCCATCATCGCCGAAGT 

CGATCTCTATAAATTTGATCCATGGGAACTTCCCGCTAAAGCATCGTTTGGAGAACAAGA 

ATGGTACTTCTTTAGTCCACGAGATCGGAAGTATCCAAACGGAGCAAGACCTAACAGAGC

GGCGACTTCAGGTTATTGGAAAGCGACCGGTACAGATAAACCGGTACTTGCTTCCGACGG 

TAACCAAAAGGTGGGCGTGAAGAAGGCACTAGTCTTCTACAGTGGTAAACCACCAAAAGG 

CGTTAAAAGTGATTGGATCATGCATGAGTATCGTCTCATCGAAAACAAACCAAACAATCG 

ACCTCCTGGCTGTGATTTCGGCAACAAAAAAAACTCACTCAGACTTGATGATTGGGTGTT 

ATGTAGAATCTACAAGAAGAACAACGCAAGTCGACATGTTGATAACGATAAGGATCATGA 

TATGATCGATTACATTTTCAGGAAGATTCCTCCGTCTTTATCAATGGCGGCTGCTTCTAC 

AGGACTTCACCAACATCATCATAATGTCTCAAGATCAATGAATTTCTTCCCTGGCAAATT 

CTCCGGTGGTGGTTACGGGATTTTCTCTGACGGTGGTAACACGAGTATATACGACGGCGG 

TGGCATGATCAACAATATTGGTACTGACTCAGTAGATCACGACAATAACGCTGACGTCGT 

TGGTTTAAATCATGCTTCGTCGTCAGGTCCTATGATGATGGCGAATTTGAAACGAACTCT 

CCCGGTGCCGTATTGGCCTGTAGCAGATGAGGAGCAAGATGCATCTCCGAGCAAACGGTT 

TCACGGTGTAGGAGGAGGAGGAGGAGATTGTTCGAACATGTCTTCCTCCATGATGGAAGA

GACTCCACCATTGATGCAACAACAAGGTGGTGTGTTAGGAGATGGATTATTCAGAACGAC

ATCGTACCAATTACCCGGTTTAAATTGGTACTCTTCTTAATCAAATGTGTTTCGCCGCCG 

GTGTGAAGAATTTTCCGGTGACAGTGAAGATTTTTTTCCGATTGGTGGGGTCATTTGCAT 

GCATTATATAATTTGAGATTTGTGTATATGTTTTGGGTTAATTAATTGGTCACAGGGGC 

>G1454 Amino Acid Sequence (conserved domain in AA coordinates: 9-178)

MESTDSSGGPPPPQPNLPPGFRFHPTDEELVVHYLKRKAASAPLPVAIIAEVDLYKFDPW
ELPAKASFGEQEWYFFSPRDRKYPNGARPNRAATSGYWKATGTDKPVLASDGNQKVGVKK
ALVFYSGKPPKGVKSDWIMHEYRLIENKPNNRPPGCDFGNKKNSLRLDDWVLCRIYKKNN

ASRHVDNDKDHDMIDYIFRKIPPSLSMAAASTGLHQHHHNVSRSMNFFPGKFSGGGYGIF

SDGGNTSIYDGGGMINNIGTDSVDHDNNADVVGLNHASSSGPMMMANLKRTLPVPYWPVA

DEEQDASPSKRFHGVGGGGGDCSNMSSSMMEETPPLMQQQGGVLGDGLFRTTSYQLPGLN 
WYSS* 

>G1459 (1..1272) 

ATGATGAAAGGTCTGATTGGGTATAGATTTAGTCCGACGGGAGAGGAAGTGATCAACCAT 

TACCTAAAGAACAAACTTCTGGGTAAGTATTGGCTCGTTGATGAAGCTATTAGCGAGATC

AACATCTTGAGTCACAAACCCAGCAAGGATTTGCCrAAGTTAGCTAGGATCCAATCGGAA 

GATCTTGAATGGTATTTCTTCTCTCCGATTGAGTACACGAACCCGAATAAGATGAAAATG 

AAGAGGACGACAGGTTCTGGGTTTTGGAAACCTACTGGTGTTGATCGGGAAATTAGGGAT 

AAAAGAGGAAATGGTGTTGTGATAGGGATTAAGAAGACGCTTGTGTACCATGAAGGTAAG 

AGTCCTCATGGAGTTAGAACTCCTTGGGTTATGCACGAGTATCACATCACTTGCT

CATCATAAGAGGAAATATGTTGTCTGCCAAGTAAAGTATAAGGGTGAAGCTGCAGAAATT 

TCATATGAGCCAAGTCCCTCTTTGGTATCCGATTCGCATACCGTCATAGCGATTACCGGA 

GAACCGGAACCTGAGCTTCAGGTTGAGCAGCCAGGTAAAGAAAATCTCTTGGGTATGTCT 
GTAGATGATTTGATAGAACCAATGAACCAACAAGAGGAGCCACAAGGTCCTCACTTAGCT
CCGAATGATGATGAGTTTATACGTGGATTGAGGCATGTTGATCGAGGGACGGTTGAATAT 
TTGTTTGCCAATGAAGAAAACATGGATGGTTTGTCTATGAATGACTTGAGAATCCCAATG 
ATCGTCCAACAAGAGGATCTCTCTGAGTGGGAGGGATTTAACGCAGACACCTTTTTCAGC 
GACAACAACAATAACTATAACCTTAACGTGCATCATCAACTAACGCCTTACGGCGATGGC 
TATTTGAATGCATTTTCGGGTTATAACGAAGGGAATCCTCCCGATCACGAATTAGTGATG 
CAAGAGAACCGCAACGATCACATGCCAAGGAAACCTGTGACAGGGACCATTGATTATAGC 
AGCGATAGTGGCAGTGATGCTGGATCCATATCTACAACGGTGAAACAAGAAATCCCAAGA 
GCTGTTGATGCACCCATGAACAATGAGTCATCTTTGGTGAAAACAGAGAAGAAAGGCTTG 
TTTATTGTAGAGGACGCAATGGAGAGAAACCGCAAGAAACCACGATTTATCTATCTCATG 
AAGATGATCATAGGCAACATCATATCGGTTTTACTACCCGTCAAAAGATTGATCCCGGTG 
AAGAAGTTATGA 

>G1459 Amino Acid Sequence (conserved domain in AA coordinates: 10-152)

MMKGLIGYRFSPTGEEVINHYLKNKLLGKYWLVDEAISEINILSHKPSKDLPKLARIQSE

DLEWYFFSPIEYTNPNKMKMKRTTGSGFWKPTGVDREIRDKRGNGVVIGIKKTLVYHEGK

SPHGVRTPWVMHEYHITCLPHHKRKYVVCQVKYKGEAAEISYEPSPSLVSDSHTVIAITG

EPEPELQVEQPGKENLLGMSVDDLIEPMNQQEEPQGPHLAPNDDEFIRGLRHVDRGTVEY

LFANEENMDGLSMNDLRIPMIVQQEDLSEWEGFNADTFFSDNNNNYNLNVHHQLTPYGDG

YLNAFSGYNEGNPPDHELVMQENRNDHMPRKPVTGTIDYSSDSGSDAGSISTTVKQEIPR
AVDAPMNNESSLVKTEKKGLFIVEDAMERNRKKPRFIYLMKMIIGNIISVLLPVKRLIPV

KKL*

>G1460 (87..995)

CGTCGACCTTCACTCAAACCCTAATCCCGGGAACCCGGGAATTTTGATCATTTTGTTTCT 

TTTCGATCTGTTTCTATTTTAAAAAGATGATGAAAGATCCGACTGGGTATAGATTTAGTC 

CGACGGGAGAGGAAGTGATAAACCATTACCTAAAGAACAAAATTCTGGGTAAGACTTGGC 

TCGTTGATGAAGCCATTAGCGAGATCAACATCTTGAATCACAAACCCAGCAAGGATTTGC 

CTAAGTTAGCTAGGATCCAATCGGAAGATCTTGAGTGGTACTTTTTCTCTCCGATTGAGT 

ACACGAACCCGAATAAGATGAAAATGAAGAGGACGACAGGTTCTGGGTTTTGGAAACCTA 

GTGGTGTTGATCGGAAAATTAGGGATAAAAGAGGAAATGGTGTTGTGATAGGGATTAAGA

AGACGCTTGTGTACCATGAAGGTAAGAGTCCTCATGGAGTTAGAACTCCTTGGGTTATGC 

ACGAGTATCACATCACTTGCTTGCCTCATCATAAGAGGAAATATGTTGTCTGCCAAGTAA 

AGTATAAGGGTGAAGCTGCAGAAATTTCATATGAGCCAAGTCCCTCTTTGGTATCCGATT 

CGCATACCGTCATAGCGATTAACGGAGAACCGGAACCTGAGCTTCAGGTTGAGCAGCCAG 

GTAAAGAAAATCTCTTGGGTATGTCTGTAGATGATTTGATAGAACCAATGAACCAACAAG 

AGGAGCCACAAGGTCCTCACTTAGCTCCGAATGATGATGAGTTTATACGTGGATTGAGAC 

ATGTTGATCGAGAGCCGGTTGAATATTTGTTTGCCAATGAAGAAAACATGGATGGTTTGT 

CTATTATGAATGACTTGACAATCCCAATGATCGCCCAACAAGAGGATCTCATTCTCTCTG 

AGTGGGAGGGATTTATCGGAGCCACCTTTTTCAGCGACAACAACAATAACAATAACCT 

ACGTGCATCAACTAACGTCTTTCTTACCGGGATGATTATCAGAATGCATTTTGGGTTACA 

ACGGAGCGNCCGCT 

>G1460 Amino Acid Sequence (domain in AA coordinates: TBD) 

MMKDPTGYRFSPTGEEVINHYLKNKILGKTWLVDEAISEINILNHKPSKDLPKLARIQSE

DLEWYFFSPIEYTNPNKMKMKRTTGSGFWKPSGVDRKIRDKRGNGVVIGIKKTLVYHEGK

SPHGVRTPWVMHEYHITCLPHHKRKYVVCQVKYKGEAAEISYEPSPSLVSDSHTVIAING

EPEPELQVEQPGKENLLGMSVDDLIEPMNQQEEPQGPHLAPNDDEFIRGLRHVDREPVEY

LFANEENMDGLSIMNDLTIPMIAQQEDLILSEWE

PG* 

>G147 (37..672)

AAATCATCAGATAGAAGGAAATATTCTGATTGAGAGATGGCTCGTGGAAAGATTCAGCTT 
AAGAGGATTGAGAACCCGGTTCACAGACAAGTGACTTTTTGCAAGAGGAGAACTGGTCTT
CTCAAGAAGGCTAAGGAGCTCTCTGTGCTCTGTGATGCCGAGATCGGTGTTGTGATCTTC 
TCTCCTCAGGGCAAGCTCTTTGAGCTCGCTACTAAAGGAACAATGGAGGGAATGATTGAT 
AAGTACATGAAGTGTACTGGTGGTGGTCGTGGTTCTTCTTCTGCTACTTTTACTGCTCAA 
GAACAACTTCAACCACCAAATCTTGATCCGA 

ATTGAGATGCTTCAGT^AAGGGATAAGCTATATGTTTGGAGGAGGAGATGGGGCTATGAAT 
CTTGAAGAACTTCTTTTGCTTGAGAAGCATCTTGAGTATTGGATTTCTCAGATTCGCTCT 
GCTAAGATGGATGTTATGCTTCAAGAAATTCAGTCATTGAGGAACAAGGAAGGAGTCCTC 









AAAAACACCAACAAGTATCTCCTCGACAAGATAGAGGAAAACAACAATAGCATATTAGAT 

GCTAACTTCGCAGTCATGGAGACAAACTATTCCTATCCGCTAACAATGCCAAGTGAAATA 

TTTCAGTTCTAGACCATAGGGTATTTGAAGACTATGTCTCACGAATTTAAATAACCTTGG 

TAAGTATAATATAGTGTTGTTAAATCACACATAATTAAAATAAAGCCTGTGGAACTTCGC 

TAGGCAGTTGAAAATCTATCCGTATGTTTTATCCTCTTGTTTTACATTTGTTGGTGTGAA 

GATGAAATGACTGCAAGTGTGGTGTGTACTTATAACTCTTTCTACTTTCTATCTATGTTT 
TGAATTTATGGATT 

>G147 Amino Acid Sequence (domain in AA coordinates: 2-57) 

MARGKIQLKRIENPVHRQVTFCKRRTGLLKKAKELSVLCDAEIGVVIFSPQGKLFELATK

GTMEGMIDKYMKCTGGGRGSSSATFTAQEQLQPPNLDPKDEINVLKQEIEMLQKGISYMF 

GGGDGAMNLEELLLLEKHLEYWISQIRSAKMDVMLQEIQSLRNKEGVLKHTNKYLLDKIE 

ENNNSILDANFAVMETNYSYPLTMPSEIFQF* 

>G1471 (1..735) 

ATGGAGAACCAATCTATGTCTTCATCAAGCTCCTCCACACACAAACATGATCAAAAACTC 

AAAAGTTCCGTTGTGGCCATGGAGGTCCTGGAGGAGAAGGAGACAGTGAACAATCCGCCC 

CAGTATTATAATAAGATCTACATCTGTTACTTGTGCAAGAGAGCGTTCCCAACCCCTCAT 

GCCCTTGGCGGTCACGGAACCACCCACAAGGAGGACCGAGAATTGGAGAGGCAACAGATC 

GAGTCAAGGCTTTCTAACAAAGACAAGTCTAACTTGCTCTTTGGTGGGTCTTCACAAGAT 

GTTTTATCAAATGATAATCACCTTGGACTCTCTCTTGGTCCATTGAAGTCCATAGAAGGT 

AGCAGCAGCAGCAACAACGTTAACCCATTGCTTAATGTTGGAGTCCCTAGAGGAACCACA 

GATATGAACATGAACAACTATAGCTCACATGCTTTATCAACTGATGATATTAATCTTGAT 

CTTACTCTTGGTCCATCTAAGTCCATAGGAGATAGCAACAATATCATTAATAACAACACT 

AACTCATCCTTCGATGGGAATCTGATCATTCCCGTTCGTCCTCGTGTGTCTAGATACCAT 

TTTGTTGCTGGGAACCCCCTTGATTCAATCTCTAGAAACATTCCTCCTTCTATTACTTTT 

CCTCATCTAAACATCAATCTTTCTCATGATTCGTTTTCTTTACAAGAGAATGGTTCGGGC 
TCTAGTCACTCATAA 

>G1471 Amino Acid Sequence (domain in AA coordinates: 49-70) 
MENQSMSSSSSSTHKHDQKLKSSVVAMEVLEEKETVNNPPQYYNKIYICYLCKRAFPTPH
ALGGHGTTHKEDRELERQQIESRLSNKDKSNLLFGGSSQDVLSNDNHLGLSLGPLKSIEG
SSSSNNVNPLLNVGVPRGTTDMNMNNYSSHALSTDDINLDLTLGPSKSIGDSNNIINNNT

NSSFDGNLIIPVRPRVSRYHFVAGNPLDSISRNIPPSITFPHLNINLSHDSFSLQENGSG 
SSHS* 

>G1475 (1..645) 

ATGAAGAGAAC^CATTTGGCAAGTTTTAGTAACAGAGACAAAACCCAAGAAGAAGAAGGA 

GAAGACGGTAATGGTGACAACAGAGTCATCATGAATCACTACAAGAATTACGAAGCTGGG 

CTGATCCCATGGCCTCCCAAGAATTACACTTGCAGCTTCTGCAGGAGAGAGTTCAGATCT 

GCTCAAGCACTTGGAGGCCACATGAATGTTCATAGAAGAGACAGAGCAAAACTCAGGCAG

ATCCCTTCTTGGCTCTTCGAACCTCACCACCACACACCTATTGCAAACCCTAACCCTAAT 

TTTAGCTCTTCTTCTTCCTCTTCAACAACAACAGCTCATCTTGAGCCT^ 

CAGAGATCCAAAACCACrCCTTTTCClTCT 

AGCTATGGAGGTTTGATGATGGACAGAGAGAAGAACAAGAGCAATGTATGTAGCAGAGAG 

ATCAAGAAAAGTGCCATCGATGCATGTCATTCAGTAAGATGTGAGATAAGCCGTGGGGAT 

CTGATGAATAAGAAAGATGATCAAGTCATGGGGTTGGAGCTTGGGATGAGTTTGAGGAAT 

CCC^CCAAGTTCTTGATTTGGAGCOTCGACTAGGCTACCTCT^ 

>G1475 Amino Acid Sequence (domain in AA coordinates: 51-73) 

MKRTHLASFSNRDKTQEEEGEDGNGDNRVIMNHYKNYEAGLIPWPPKNYT^ 

AQALGGHMNVHRRDRAKLRQIPSWLFEPHH^ 

QRSKTTPFPSARFD^LDSTTSYGGLMMDREKNKSNVCSREIKKSAIDACHSVRCEISRGD 

LMNKKDDQVMGLELGMSLRNPNQVLDLELRLGYL* 

>G1477 (1..606)

ATGTTGTCCTCGGACTCGAATTACGCTAGTGATATTAGCGACGATGCCTCCGCCACCGGA 
TCGATAGAGAATCCTATATACAAATGCAAGTATTGTCCTAGGAAGTTCGATAAAACACAA 
GCATTAGGTGGTCATCAAAATGCACACAGAAAGGAGAGAGAGGTCGAAAAACAA(^\AAA 
GCATTTTTGGCGCAT1TGAACCGACCAGAACCAGATCTTTACGCGTACTCGTATTCGTAT 
CATC7VTTCATTTCCTAACCAATACGCACTCC(^CCGGGATTTGAACAGCCTCA 

AGCITTGC^GGACTACAAAGTGACCGAAGTCAAGGAATGAACC^ 
GGGATCCCATTCCTACCCCAATCTCAACCTCAACCACTATCGTCACCAATATGTTTGGAT
CTTTGCCTTGGCATTGGTAGCTCCCAAACCC7U\CCACAACCTCAAG7VACCAAATGATGCA 
ACAGAAGAGATGGATGCTGAGAAAGAAAATGATGGTTCTTCCCTTTCTCTCTCACTCAAA 
CTGTGA 

>G1477 Amino Acid Sequence (domain in AA coordinates: 29-48) 

MLSSDSNYASDISDDASATGSIENPIYKCKYCPRKFDKTQALGGHQNAHRKEREVEKQQK 

AFLAHLNRPEPDLYAYSYSYHHSFPNQYALPPGFEQPQYKVDRSYKMSMVYNQYVGSSSS 

SFAGLQSDPSQGMNQDWTFTGIPFLPQSQPQPLSSPICLDLCLGIGSSQTQPQPQEPNDA

TEEMDAEKENDGSSLSLSLKL* 
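
Each amino acid header in this listing gives the conserved domain as a residue range, for example 29-48 for G1477. As a rough illustration, the sketch below slices such a range out of a protein string in Python, assuming the coordinates are 1-based and inclusive and that stray spaces and the trailing `*` are not counted as residues. The helper name `extract_domain` is hypothetical.

```python
def extract_domain(protein: str, first: int, last: int) -> str:
    """Return residues first..last (assumed 1-based, inclusive)."""
    # Drop whitespace and the terminal '*' so that only residues are counted.
    residues = "".join(protein.split()).rstrip("*")
    if not 1 <= first <= last <= len(residues):
        raise ValueError("domain coordinates fall outside the sequence")
    return residues[first - 1:last]


# First line of the G1477 amino acid sequence listed above.
g1477_line1 = "MLSSDSNYASDISDDASATGSIENPIYKCKYCPRKFDKTQALGGHQNAHRKEREVEKQQK"
print(extract_domain(g1477_line1, 29, 48))  # CKYCPRKFDKTQALGGHQNA
```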

>G1487 (1..1020) 

ATGGAACAAGCCGCGTTGAAGAGCAGCGTCAGGAAAGAGATGGCTCTCAAAACGACTTCT 

CCGGTTTACGAAGAGTTTCTTGCCGTCACCACCGCTCAAAATGGCTTTTCCGTCGACGAT 

TTCTCTGTAGACGACTTGCTTGACTTGTCAAACGATGACGTTTTTGCCGACGAAGAAACT 

GACCTCAAGGCTCAACATGAGATGGTCCGTGTTTCCTCTGAGGAACCCAACGACGACGGA 

GACGCTCTTCGCCGGAGCAGCGATTTCTCCGGCTGTGACGACTTTGGTTCTCTCCCTACA 

AGCGAACTCTCTCTTCCGGCGGATGATTTAGCGAACCTTGAGTGGCTCTCTCATTTCGTG 

GAGGACTCCTTCACGGAATATTCGGGTCCAAACCTCACCGGAACCCCGACTGAGAAACCG 

GCGTGGTTAACGGGTGACCGGAAACATCCTGTGACTGCAGTCACGGAAGAGACCTGTTTC 

AAATCCCCTGTTCCGGCTAAAGCCCGTAGCAAACGTAACCGCAATGGCCTCAAGGTCTGG 

TCGCTTGGTTCGTCGTCCTCCTCGGGTCCTTCCTCGTCCGGTTCGACCTCCTCCTCCTCT 

TCGGGTCCTTCCAGCCCGTGGTTCTCCGGCGCTGAGCTGCTCGAGCCTGTGGTCACGTCA 

GAGAGGCCACCGTTTCCCAAGAAGCATAAGAAAAGGTCAGCCGAGTCTGTTTTCTCCGGT 

GAGCTGCAGCAGCTGCAACCTCAGCGAAAGTGCAGCCACTGCGGCGTTCAGAAAACTCCG

CAGTGGAGAGCCGGGCCAATGGGAGCCAAGACCCTGTGCAATGCGTGCGGTGTCCGGTAC 

AAGTCGGGTAGGTTGCTACCGGAATACAGACCCGCTTGTAGCCCGACATTCTCGAGTGAG 

CTGCACTCGAACCACCACCGGAAAGTCATAGAGATGAGGCGGAAGAAGGAGCCAACCAGT

GACAACGAAACCGGTTTAAACCAGCTGGTTCAGTCCCCACAAGCTGTACCAAGTTTTTGA 

>G1487 Amino Acid Sequence (domain in AA coordinates: 251-276)

MEQAALKSSVRKEMALKTTSPVYEEFLAVTTAQNGFSVDDFSVDDLLDLSNDDVFADEET 

DLKAQHEMVRVSSEEPNDDGDALRRSSDFSGCDDFGSLPTSELSLPADDLANLEWLSHFV 

EDSFTEYSGPNLTGTPTEKPAWLTGDRKHPVTAVTEETCFKSPVPAKARSKRNRNGLKVW

SLGSSSSSGPSSSGSTSSSSSGPSSPWFSGAELLEPVVTSERPPFPKKHKKRSAESVFSG 

ELQQLQPQRKCSHCGVQKTPQWRAGPMGAKTLCNACGVRYKSGRLLPEYRPACSPTFSSE 

LHSNHHRKVIEMRRKKEPTSDNETGLNQLVQSPQAVPSF* 

>G1492 (149..919)

AATCCCAACCCACACACCTCTCAAATCCTCCTCTCCTCGTTTCTCTTTCTCTCCTCTTCA 
CAGAACCAAAACATATCAAACCTTTTTTTCTCTTGGGTTTAAGTAAAAATCGAATCTTTG 
TGTCGGTTTTTAGGGTTCTTGAAACGATATGGGTAAGTCTAGTGGTAGAAATGGTAACGG 
AAGCTTTAACGGCAATAAATTTCACGGAGTTAGACCTTACGTACGGTCTCCAGTTCCACG 

TCAACACCGAGCAACACCAAAACTTGTTCTTAAGATGATGGATC 

TTC^CATGTCAAAAGCCACCTTCAGATGTATAGAGGAGGTTCAAAGCTCACTTTGGAGA^ 

ACCAGAAGAAAGCTCATCATCTTCAATAAGAAGAAGACAAGACAGTGAAGAAGATTATTA 

TCTTCATGACAACTTGTCTTTACACACAAGGAATGATTGTCTTTTGGGTTTTCACTCTTT 

TCCTCTTOCTT(^CATTCTT<^TTTAGAGGAGGAGGAGGAGGAAGAACAAAAGAGCAG^ 

GACTTCAGAGTCTGGTGGTTATGATGATGATGCTGACTTTCTTCACATCAAGAAGATGAA 

CGATACGACGACGTTTTTGT(^C!ATCATTTCCCCAAGGGAACAGAGGAGTGGCGGGAACA 

AGAACACGAAGAAGftAGAAGAAGATTTGTCGTTGTCTCTGTCGTTAAATC^TC^TCATTG 

GAGAAGCAATGGATCATCGGTGGTGAGCGAAACGAGTGAAGCAGCAGTCTCGACTTGTTC 

AGCACCATTCGTATCeAAAGATTGCTTTGGTTCTTCAAAGATTGATCTTAATCTGTCAAT 

TTCTCTCCTCGGTAGCTAAATAAGTTATGCAAGATTTAGGTTCAGAGAAACTATTCGGAT 

GTGTTTTTGAAACTAGGATATTGAATGTTAGTAGAGAAACCTAGAAAATGAAGTTTAGAT 

AAATTATCAACGCAG CGTTTTGATCG CCTTTGAACGGAAAATTAACAAA 

>G1492 Amino Acid Sequence (dpmain in AA coordinates: 34-83) 

MGKSSGRNGNGSFNGNKFHGVRPYVRSPVPRLRWTPDLHRCFVHAVEILGGQHRATPKLV 

LKMMDVKGLTISHVKSHLQMYRGGSKLTLEKPEESSSSSIRRRQDSEEDYYLHDNLSLHT 

RNDCLLGFHSFPLSSHSSFRGGGGGRTKEQQTSESGGYDDDADFLHIKKMNDTTTFLSHH 
FPKGTEEWREQEHEEEEEDLSLSLSLNHHHWRSNGSSVVSETSEAAVSTCSAPFVSKDCF 
GSSKIDLNLSISLLGS* 
>G1531 (1..666) 

ATGTGTGAGTCAAGCAACAAAGTCAGAGTATCGCCATACCCGCTTCGGTCTTCGAGGACC 
GACAAACACAAGGCGTCAGAGTCGCCTATTGAGACAGGTTGGGAGGATGTGCGTGGATGT 
CATCCTTACATGTGCGATACGAGTGTTCGTCACTCCAATTGTTTCAAGCAGTTCCGCAGA 
AAAACCATAAAAAAGCGCCTATACCCCAAGACCTTACATTGTCCTCTCTGTAGAGGTGAA 
GTATCCGAGACGACAAAGGTGACGAGCACTGCAAGAAGATTTATGAATGCTAAACCGAGG 
TCTTGCTCCGTAGAGGATTGCAAATTCTCTGGGACGTTTTCTCAGCTTACTAAGCACTTG 
AAAACTGAGCATCGCGGTATTGTGCCACCAAAGGTCGATCCACTGAGACAACAGAGATGG 
GAAATGATGGAGAGACATTCTGAATACGTTGAACTCATGACTGCAGCTGGGATTTCGCGT 
ATGGCTGAGGTGATGCAACAACAGCTTCCCCAGGATCAGAATCATCCTCATGTGTTTCAA 
GTGACCGTTAATGGAACCATATGGAATCTAATTGATCCGAGTCAGGGAAGGAATGGATTA 
GGCATCACCAACTATAGCGCAATGCAGTTTGTACCATTAAGCATAAATCACAGTAGAACT 
CTGTGA 

>G1531 Amino Acid Sequence (domain in AA coordinates: 41-77) 

MCESSNKVRVSPYPLRSSRTDKHKASESPIETGWEDVRGCHPYMCDTSVRHSNCFKQFRR 

KTIKKRLYPKTLHCPLCRGEVSETTKVTSTARRFMNAKPRSCSVEDCKFSGTFSQLTKHL 

KTEHRGIVPPKVDPLRQQRWEMMERHSEYVELMTAAGISRMAEVMQQQLPQDQNHPHVFQ 

VTVNGTIWNLIDPSQGRNGLGITNYSAMQFVPLSINHSRTL* 

>G1540 (122..997) 

atctctttactaccagcaagttgttttcttgctaacttcaaacttctctttctcttgttc 
ctctctaagtcttgatcttatttaccgttaactttgtgaacaaaagtcgaatcaaacaca 
catggagccgccacagcatcagcatcatcatcatcaagccgaccaagaaagcggcaacaa 
caacaacaagtccggctctggtggttacacgtgtcgccagaccagcacgaggtggacacc 
gacgacggagcaaatcaaaatcctcaaagaactttactacaacaatgcaatccggtcacc 
aacagccgatcagatccagaagatcactgcaaggctgagacagttcggaaagattgaggg 
caagaacgtcttttactggttccagaaccataaggctcgtgagcgtcagaagaagagatt 
caacggaacaaacatgaccacaccatcttcatcacccaactcggttatgatggcggctaa 
cgatcattatcatcctctacttcaccatcatcacggtgttcccatgcagagacctgctaa 
ttccgtcaacgttaaacttaaccaagaccatcatctctatcatcataacaagccatatcc 
cagcttcaataacgggaatttaaatcatgcaagctcaggtactgaatgtggtgttgttaa 
tgcttctaatggctacatgagtagccatgtctatggatctatggaacaagactgttctat 
gaattacaacaacgtaggtggaggatgggcaaacatggatcatcattactcatctgcacc 
ttacaacttcttcgatagagcaaagcctctgtttggtctagaaggtcatcaagacgaaga 
agaatgtggtggcgatgcttatctggaacatcgacgtacgcttcctctcttccctatgca 
cggtgaagatcacatcaacggtggtagtggtgccatctggaagtatggccaatcggaagt 
tcgcccttgcgcttctcttgagctacgtctgaactagctcttacgccggtgtcgctcggg 
attaaagctctttcctctctctctctctttcgtactcgtatgttcacaactatgcttcgc 
tagtgattaatgatgcagttgttatattagtagttaactagttatctctcgttatgtgta 
atttgtaattactagctaagtatcgtctaggtttaattgtaattgacaaccgtttatctc 
tatgatgaataagttaaatttatatat 

>G1540 Amino Acid Sequence (domain in AA coordinates: 35-98) 
MEPPQHQHHHHQADQESGNNNNKSGSGGYTCRQTSTRWTPTTEQIKILKELYYNNAIRSP 
TADQIQKITARLRQFGKIEGKNVFYWFQNHKARERQKKRFNGTNMTTPSSSPNSVMMAAN 
DHYHPLLHHHHGVPMQRPANSVNVKLNQDHHLYHHNKPYPSFNNGNLNHASSGTECGVVN 
ASNGYMSSHVYGSMEQDCSMNYNNVGGGWANMDHHYSSAPYNFFDRAKPLFGLEGHQDEE 
ECGGDAYLEHRRTLPLFPMHGEDHINGGSGAIWKYGQSEVRPCASLELRLN* 
>G1544 (1..2178) 

ATGTCTCAGTCAAACATGGTACCAGTGGCTAAC^ 

AACAACAACAACAACAACAACAATGGTGGAACTGACAACACTAATGCTGGAAATGATTCT 

GGAGATCAAGATTTCGACAGTGGGAATACCTCAAGTGGCAATCATGGAGAAGGGTTGGGA 

AACAATGAAGCTCCTCGTCATAAGAAGAAAAAATACAATCGTCACACCCAACT^ 

TCGGAGATGGAAGCTTTCTTCAGAGAGTGTCCTCACCCAGATGACAAACAAAGGTACGAC 

CTTAGCGCTCAATTGGGATTGGACCCTGTTCAGATCAAATTCTGGTTCCAGAACAAACGC 

ACTCAAAACAAGAATCAACAAGAACGCTTTGAGAACT 

CACCTTAGGTCTGAAAATCAGCGGTTACGAGAAGCTATTCATCAAGCCTTATGCCCTAAG 



82 



BNSDOCID: <WO_03013227A2_IA> 



WO 03/013227 



PCT/US02/25805 



CTTACGGGGATAcSg^ 

cctgcagatgctaa?accaS 

ctcttggtgatggctcaa^SS 

ttagctttgaacttgga^S 

GGCGGGTTTCGAAcSaS^^ 
ATTGTTGAAATGCTCATGCAAGAGAA^CTGTG^ 

AGAGCCAGGACTCATCAACAGATAAT^™^ ^AACAATGTTTGCCGGAATTGTTGGT 

caaataatgagtgc^Sc^^^^ 

TTCGTCCGCTACTGTAAGCAACAtGGAP^S^ CTAGTC ACAACCCGCGAAAGCTAC 
TCTACAATATTTTCtSSS 

tctgttgc^ctactgSS^^ 

CCTTGTCAGACTGCT?GA ^^^ CTGATTCGTACAACCGTGCG GAGGATCAAAGATTTGTIT 

MS™A^gJ^^ 64-124, 
NMQAPRHKKKKYNraTQLQISE^^ 



NDPGKPPGVIICAATSFWLPAP^F^^^ 

™~ gg s L L 

>G156 (39..755) 

tctcagccaccgc^gctttccga^ 

ttgaccgatacttgcatS?ggaSSS 
aattgcaccatgagat<^ CT aS 

^cattccatgc^atgact™^^ 
ctgggatagataccScSSgSSS^^ 

AGCTraTACACTTCCTTC^^^^ 

-ttcaaaacgatcc™ 
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TTTCTTTATGGAGACAGATTCATGAACTTTTATTACCTATATTTTGATAAGCCAGTGTCT 
TCTTTTGTGTGGCTATGGAAACCTTGTTTAAAGCACAATGCACTTGAGTTCTTGGTTATA 
TAATTAATCATCATTATTACATANWAAANAANNAAAAAAAAAAAAAA 

>G156 Amino Acid Sequence (domain in AA coordinates: 2-57) 
MGRGKIEIKKIENQTARQVTFSKRRTGLIKKTRELSILCDAHIGLIVFSATGKLSBFCSE 
QNRMPQLIDRYLHTNGLRLPDHHDDQEQLHHEMELLRRETCNLELRLRPFHGHDLASIPP 
NELDGLERQLEHSVLKVRERKRRMLEEDNNNMYRWLHEHRAAM 

IEQLQCYKPGEYQQFLEQQQQQPNSVLQLATLPSEIDPTYNLQLAQPNLQNDPTAQND* 
>G1584 (160..1281) 

ATTCACATTTTTATTTATCTTTCCATTTAGCCATTCTGTTCCCTGTCTCTTCCTCCTCTC 

TTTTTGACACATCACATCATCATCACATCATCATTCAACATCAATCATCATCATATGCAT 

ACACATACATCTGTGTTCTGGGGATCGAGTTAATTAGTTATGGCTTCTTCGAATAGACAC 

TGGCCAAGCATGTTCAAGTCCAAACCTCATCCCCATCAATGGCAACATGACATCAACTCT 

CCTCTCTTGCCTTCTGCTTCTCACCGATCTTCTCCTTTCTCTTCAGGATGTGAAGTGGAG 

AGGAGTCCAGAGCCAAAACCAAGATGGAATCCAAAGCCAGAGCAGATTCGGATACTTGAA 

GCAATCTTTAACTCCGGGATGGTGAATCCTCCAAGAGAGGAGATCAGGCTTCAAGAATAC 

GGCCAAGTCGGTGATGCTAACGTCTTCTACTGGTTCCAAAACCGTAAGTCCCGTAGTAAA 

CACAAACTCCGCCTCCTCCACAACCACTCCAAACACTCTCTCCCTCAAACGCAACCGCAG 

CCGCAGCCGCAACCTTCGGCTTCCTCTTCCTCTTCCTCCTCCTCTTCCTCCTCCAAATCC 

ACCAAACCCCGAAAAAGCAAGAACAAGAACAACACTAATCTCTCTTTGGGTGGTAGTCAA 

ATGATGGGGATGTTTCCACCGGAACCGGCGTTTCTCTTCCCGGTCTCCACTGTCGGAGGG 

TTTGAAGGTATCACCGTCTCATCCCAATTAGGGTTTCTCTCCGGTGATATGATTGAGCAA 

CAAAAACCGGCTCCAACGTGTACCGGACTCCTGCTGAGTGAGATCATGAACGGTAGTGTG 

AGTTATGGAACTCATCATCAACAACACTTGAGTGAGAAAGAAGTTGAAGAAATGAGGATG 

AAGATGTTGCAACAGCCACAGACTCAGATTTGTTACGCTACCACTAATCATCAAATAGCT 

TCTTACAACAACAACAACAACAACAATAACAT 

ACTGCCACCACTATTACTACTTCGCATTCTCTCGCTACTGTCCCATCAACTTCGGACCAG 
CTTCAAGTTCAAGCGGACGCACGAATAAGAGTTTTCATCAATGAAATGGAGCTTGAAGTG 
AGCTCAGGACCGTTCAATGTGAGGGATGCATTTGGGGAAGAGGTTGTTCTGATTAATTCC 
GCGGGTCAGCCCATTGTCACCGATGAATATGGCGTCGCTCTTCACCCTCTTCAACACGGA 
GCCTCGTACTATCTGATCTAGTCGTGTGGGAGATTTGAGTTTGAAGAAGAAATTAAGACC 
TGTCTCTTTCTTTCACCATCTCTCGTACGTAGGCTTAAATGTTAAGATTTTATAAAGTAT 
TGGTTTCAGTTACCTGTTGTGACGGTGTTTATGTATGAGTTTCGGACAACATTCACAAAA 
CTCTCTCGTTAAATTGTTGACCAATAATATATGATGTGTGTTTCATTATTATCTAAAAAA 

AA 

>G1584 Amino Acid Sequence (domain in AA coordinates: TBD) 
MASSNRHWPSMFKSKPHPHQWQHDINSPLLPSASHRSSPFSSGCEVERSPEPKPRWNPKP 
EQIRILEAIFNSGMVNPPREEIRLQEYGQVGDANVFYWFQNRKSRSKHKLRLLHNHSKHS 
LPQTQPQPQPQPSASSSSSSSSSSSKSTKPRKSKNKNNTNLSLGGSQMMGMFPPEPAFLF 
PVSTVGGFEGITVSSQLGFLSGDMIEQQKPAPTCTGLLLSEIMNGSVSYGTHHQQHLSEK 

EVEEMRMKMLQQPQTQICYATTNHQIASY*^^ 

VPSTSDQLQVQADARIRVFINEMELEVSSGPFNVRDAFGEEVVLINSAGQPIVTDEYGVA 
LHPLQHGASYYLI* 
>G1587 (1..816) 

ATGGGCTACATCTCCAACAACAACCTCATCAACTATTTGCCCCTCTCTACTACTCAACCT 
CCTCTTCTTCTCACCCACTGTGATATTAACGGCAATGATCACCATCAGCTCATAACCGC^ 
TCATCAGGAGAACACGATATTGATGAACGGAAAAACAACATTCCTGCGGCGGCGACTTTG 
AGATGGAATCCGACSCCAGAGCAGATCACGACGCTAGAAGAGCTTTACAGAAGCGGAACA 
CGGACGCCGACGACGGAACAGATCCAACAGATAGCATCTAAGCTTCGTAAATATGGGAGA 
ATCGAAGGGAAGAACGTTTTCTATTGGTTTCAGAATCATAAGGCTAGAGAGAGACTAAAA 
CGCCGCCGTCGTGAAGGTGGTGCTATTATCAAACCACATAAAGACGTCAAGGATTCATCA 
TCAGGTGGTCATCGAGTTGATCAGACAAAGCTCTGCCCATC^^ 

CCACAGCCACAGCATGAATTAGATCCTGCGAGTTACAATAAAGACAACAATGCTAATAAT 
GAAGATCATGGGACGACTGAAGAATCTGATCAGAGGGCATCAGAGGTTGGTAAATACGCC 
ACATGGAGAAATCTTGTTACTTGGTCGATAACTCAACAACCGGAAGAGATTAATATCGAC 
GAAAATGTCAACGGAGAAGAAGAAGAAACGAGGGACAACCGGACTTTAAATCTCTTTCCG 
GTTAGGGAGTACCAAGAGAAAACAGGCCGGTTGATAGAGAAGACGAAAGCATGCAACTAC 
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TGTTACTACTACGAGTTCATGCCTCTGAAGAACTGA 

>G1587 Amino Acid Sequence (conserved domain in AA coordinates: 61-121) 

MGYISNNNLINYLPLSTTQPPLLLTHCDINGNDHHQLITASSGEHDIDERKNNIPAAATL 

RWNPTPEQITTLEELYRSGTRTPTTEQIQQIASKLRKYGRIEGKNVFYWFQNHKARERLK 

RRRREGGAIIKPHKDVKDSSSGGHRVDQTKLCPSFPHTNRPQPQHELDPASYNKDNNANN 

EDHGTTEESDQRASEVGKYATWRNLVTWSITQQPEEINIDENVNGEEEETRDNRTLNLFP 

VREYQEKTGRLIEKTKACNYCYYYEFMPLKN* 

>G1588 (1..2232) 

ATGTACCATCCAAACATGTTTGAGAGCCATCATATGTTCGATATGACCCCAAAGAGTACC 

TCTGATAACGACTTGGGAATCACCGGTAGCCGAGAAGATGACTTTGAGACCAAGTCAGGT 

ACCGAAGTCACTACTGAGAATCCTTCTGGTGAAGAGCTTCAAGATCCTAGCCAACGTCCC 

AACAAAAAGAAGCGTTACCATCGCCACACGCAACGCCAAATTCAAGAGCTCGAATCATTC 

TTTAAGGAATGTCCTCATCCAGATGATAAGCAACGAAAAGAGTTGAGCCGTGATCTCAAT 

TTAGAGCCTCTTCAAGTTAAGTTTTGGTTCCAAAACAAACGCACACAGATGAAGGCACAA 

AGTGAGAGGCATGAGAACCAGATTCTAAAGTCAGACAATGACAAGCTCAGAGCAGAGAAC 

AATAGATACAAAGAAGCTCTAAGCAATGCTACATGCCCTAACTGTGGCGGTCCAGCTGCT 

ATTGGAGAAATGTCTTTTGACGAACAACATCTCAGGATCGAAAATGCTCGGCTCCGCGAA 

GAGATTGATAGGATCTCTGCTATTGCTGCGAAATACGTTGGGAAGCCGTTAGGATCGTCT 

TTCGCTCCACTAGCGATCCACGCGCCTTCTCGTTCGCTTGATCTTGAAGTTGGAAACTTT 

GGGAACCAGACAGGCTTTGTAGGAGAAATGTATGGAACAGGGGACATTTTGAGGTCAGTT 

TCGATTCCTTCTGAGACTGATAAGCCTATAATCGTGGAGCTAGCGGTTGCAGCTATGGAG 

GAACTCGTGAGAATGGCTCAAACTGGAGATCCTTTATGGCTTTCAACCGATAATTCAGTC 

GAGATTCTCAACGAAGAAGAGTATTTCAGAACGTTTCCGAGAGGAATTGGACCAAAGCCA 

TTAGGATTAAGATCAGAGGCGTCAAGACAATCTGCAGTTGTTATAATGAATCACATCAAT 

CTCGTTGAGATTCTCATGGATGTGAATCAATGGTCTTGTGTTTTCTCTGGGATTGTGTCA 

AGAGCCTTGACACTTGAAGTTCTTTCAACTGGAGTTGCTGGGAACTACAACGGTGCTTTA 

CAAGTGATGACAGCTGAGTTTCAAGTTCCATCACCCCTAGTCCCAACGCGTGAGAACTAC 

TTTGTGAGATACTGCAAACAACACAGTGACGGCTCTTGGGCTGTGGTTGATGTCTCTTTG 

GACAGCCTTAGACCAAGTACTCCAATCTTAAGAACTAGAAGAAGGCCTTCAGGTTGTCTG 

ATTCAAGAATTGCCTAATGGTTATTCTAAGGTTACATGGATAGAGCATATGGAGGTAGAT 

GATAGATCAGTTCACAACATGTATAAACCGTTGGTTCAGTCCGGTTTAGCTTTCGGTGCG 

AAACGTTGGGTGGCTACACTCGAACGACAATGCGAGCGGCTTGCTAGCTCCATGGCCAGC 

AACATTCCTGGTGATCTTTCCGTGATAACGAGTCCTGAAGGAAGGAAGAGTATGTTGAAG 

CTAGCTGAGAGAATGGTTATGAGTTTCTGCAGTGGTGTTGGCGCGTCGACTGCACACGCT 

TGGACAACAATGTCGACAACAGGATCCGATGATGTTCGGGTCATGACCCGCAAGAGTATG 

GATGATCCAGGAAGACCTCCGGGTATTGTTCTTAGTGCAGCTACTTCATTCTGGATCCCA 

GTTGCTCCCAAACGTGTTTTTGATTTCCTCCGTGACGAAAATTCAAGAAAAGAGTGGGAT 

ATTCTGTCAAATGGAGGTATGGTTCAGGAAATGGCTCATATAGCCAATGGTCATGAACCT 

GGAAACTGTGTCTCCTTGCTCCGAGTCAATAGTGGAAACTCGAGCCAGAGCAACATGTTG 

ATTCTACAAGAGAGCTGTACAGATGCATCAGGATCGTATGTGATTTACGCGCCAGTGGAT 

ATAGTGGCGATGAATGTGGTTCTAAGCGGTGGAGATCCTGATTACGTGGCGTTGTTGCCG 

TCTGGTTTTGCTATTTTACCGGATGGTTCGGTTGGAGGAGGAGATGGGAATCAGCATCAG 

GAAATGGTTTCTACTACTTCTTCTGGGAGTTGTGGTGGTTCGCTTTTAACCGTTGCGTTT 

CAGATTCTTGTTGACTCTGTTCCTACAGCTAAACTCTCACTTGGCTCGGTGGCTACGGTT 

AATAGTCTGATCAAATGTACGGTGGAGAGGATTAAAGCTGCTGTTTCTTGTGATGTTGGA 

GGAGGAGCGTAG 

>G1588 Amino Acid Sequence (domain in AA coordinates: 66-124) 

MYHPNMFESHHMFDMTPKSTSDNDLGITGSREDDFETKSGTEVTTENPSGEELQDPSQRP 

NKKKRYHRHTQRQIQELESFFKECPHPDDKQRKELSRDLNLEPLQVKFWFQNKRTQMKAQ 

SERHENQILKSDNDKLRAENNRYKEALSNATCPNCGGPAAIGEMSFDEQHLRIENARLRE 

EIDRISAIAAKYVGKPLGSSFAPLAIHAPSRSLDLEVGNFGNQTGFVGEMYGTGDILRSV 

SIPSETDKPIIVELAVAAMEELVRMAQTGDPLWLSTDNSVEILNEEEYFRTFPRGIGPKP 

LGLRSEASRQSAVVIMNHINLVEILMDVNQWSCVFSGIVSRALTLEVLSTGVAGNYNGAL 

QVMTAEFQVPSPLVPTRENYFVRYCKQHSDGSWAVVDVSLDSLRPSTPILRTRRRPSGCL 
IQELPNGYSKVTWIEHMEVDDRSVHNMYKPLVQSGLAFGAKRWVATLERQCERLASSMAS 

NIPGDLSVITSPEGRKSMLKLAERMVMSFCSGVGASTAHAWTTMSTTGSDDVRVMTRKSM 
DDPGRPPGIVLSAATSFWIPVAPKRVFDFLRDENSRKEWDILSNGGMVQEMAHIANGHEP 
GNCVSLLRVNSGNSSQSNMLILQESCTDASGSYVIYAPVDIVAMNVVLSGGDPDYVALLP 
SGFAILPDGSVGGGDGNQHQEMVSTTSSGSCGGSLLTVAFQILVDSVPTAKLSLGSVATV 

NSLIKCTVERIKAAVSCDVGGGA* 
>G1589 (179..2221) 

ACCAAACTCACATAGCAATCACACACATCTCCACAAACACAGCTTGAGATGATCATGAAA 

CACGTGCATCCTCAGATCTCTATCAATCCAGCTTGGTGAAAGAAGGTCAAGAATTGAAAG 

AGAATCAAAGAAAACGACGTCGTTTCATTCGTGTGTAACAACTACTAATTATACATAGAT 

GGCTGCTTACTTTCACGGAAACCCACCGGAGATCTCTGCCGGATCCGACGGTGGTCTTCA 

AACGTTGATCCTCATGAATCCAACTACTTACGTTCAGTACACCCAACAAGACAACGACTC 

GAACAACAACAACAACAGCAACAATAGCAACAACAACAACACAAACACAAACACAAACAA 

CAACAACAGTAGTTTCGTTTTCCTCGATTCCCACGCGCCGCAGCCAAACGCGAGCCAGCA 

GTTCGTCGGAATACCACTCTCAGGTCACGAAGCTGCTTCCATTACAGCCGCCGACAACAT 

CTCCGTACTTCACGGTTATCCTCCGCGCGTGCAGTACAGTCTCTACGGTAGCCACCAAGT 

GGATCCCACTCACCAGCAAGCCGCGTGTGAGACGCCACGCGCGCAGCAAGGCCTCTCTTT 

AACCCTCTCGTCTCAACAGCAGCAGCAACAGCAACATCATCAACAACACCAGCCTATTCA 

CGTCGGATTCGGGTCCGGACATGGAGAAGATATCCGGGTCGGGTCTGGCTCTACAGGATC 

GGGGGTAACAAACGGTATAGCTAATCTTGTTAGCTCCAAGTACTTGAAGGCAGCACAAGA 

GCTTCTTGACGAAGTAGTCAACGCTGATTCCGATGACATGAACGCTAAATCCCAACTATT 

CTCATCGAAAAAGGGTAGTTGCGGAAATGATAAACCTGTCGGAGAATCATCGGCCGGCGC 

TGGAGGAGAAGGTTCCGGTGGCGGAGCAGAAGCAGCCGGGAAACGTCCGGTGGAGCTAGG 

CACGGCAGAGAGACAAGAAATACAGATGAAGAAAGCAAAACTTAGTAACATGCTTCATGA 

GGTGGAGCAGAGATATAGACAGTACCACCAGCAGATGCAGATGGTGATCTCTTCGTTCGA 

GCAAGCGGCAGGGATAGGATCAGCGAAGTCATACACGTCGCTAGCATTGAAAACCATATC 

AAGACAGTTCCGTTGCTTGAAAGAGGCGATCGCTGGTCAGATAAAAGCGGCCAACAAGAG 

TCTTGGGGAGGAAGATTCAGTGTCTGGTGTTGGGAGGTTTGAGGGGTCGAGGCTCAAGTT 

CGTGGACCACCACTTGAGACAGCAAAGAGCTCTTCAACAACTGGGAATGATTCAACATCC 

TTCCAATAATGCTTGGAGACCTCAACGTGGTCTCCCAGAACGAGCCGTCTCAGTTCTCCG 

TGCTTGGCTCTTCGAACACTTTCTTCATCCATACCCTAAGGATTCGGACAAGCACATGCT 

AGCTAAGCAAACAGGACTCACTCGTAGGCAGGTGTCGAACTGGTTTATAAACGCGAGAGT 

TCGGTTATGGAAACCAATGGTGGAGGAGATGTACATGGAGGAAATGAAGGAGCAGGCAAA 

GAACATGGGATCCATGGAAAAGACTCCTTTGGATCAAAGCAACGAAGATTCTGCTTCAAA 

GTC^UVCAAGTAACCAAGAAAAGAGCCCAATGGCGGACACT-^TTACCATATGAATCCCAA 

TCACAACGGTGACCTAGAAGGCGTCACTGGAATGCAAGGATGCCCCAAGAGACTAAGAAC 

CAGCGACGAGACAATGATGCAGCCAATAAATGCGGATTTC^GCTCCAACGAGAAGCTCAC 

GATGAAGATTCTAGAAGAACGGCAAGGGATAAGATCAGATGGTGGCTACCCTTTCATGGG 

TAATTTCGGGCAATACCAAATGGATGAGATGTCAAGATTTGATGTAGTCTCAGACCAGGA 

GCTCATGGCGCAAAGGTACTCAGGAAACAACAATGGCGTGTCCCTCACGTTAGGTTTACC 

TCATTGTGATAGCTTGTCGTCCACGGACCATCAGGGTTTCATGCAGACCCACCATGGGAT 

TCCTATAGGGAGAAGAGTGAAAATAGGAGAAACAGAGGAATATGGACCCGCCACCATCAA 

TGGTGGTAGCTCGACCACAACCGCACATTCATCAGCGGCAGCTGCCGCGGCTTACAATGG 

GATGAACATACAGAACCAGAAGAGATATGTGGCTCAGTTATTGCCCGACTTCGTTGCATA 

AACCCATCTCTCTAGAAGGAGAAACCGAAACAGGTTATTATATACGTTTCTAGTTTTTAA 

TTAGTATATAGTTTCTCATACCATTGAACCZAAAACAAAGAAC7VAAATTTA 

TTGGTTATATATGGCCGACGGGCTACGTCAGGGCCCTGACGTAGC 

>G1589 Amino Acid Sequence (conserved domain in AA coordinates: 384-448) 
MAAYFHGNPPEISAGSDGGLQTLILMNPTTYVQYTQQDNDSNNNNNSNNSNNNNTNTNTN 
NNNSSFVFLDSHAPQPNASQQFVGIPLSGHEAASITAADNISVLHGYPPRVQYSLYGSHQ 
VDPTHQQAACETPRAQQGLSLTLSSQQQQQQQHHQQHQPIHVGFGSGHGEDIRVGSGSTG 
SGVTNGIANLVSSKYLKAAQELLDEVVNADSDDMNAKSQLFSSKKGSCGNDKPVGESSAG 

AGGEGSGGGAEAAGKRPVELGTAERQEIQMKKAKLSNMLHEVEQRYRQYHQQMQMVISSF 

EQAAGIGSAKSYTSLALKTISRQFRCLKEAIAGQIKAANKSLGEEDSVSGVGRFEGSRLK 

FVDHHLRQQRALQQLGMIQHPSNNAWRPQRGLPERAVSVLRAWLFEHFLHPYPKDSDKHM 

LAKQTGLTRSQVSNWFINARVRLWKPMVEEMYMEEMKEQAKNMGSMEKTPLDQSNEDSAS 

KSTSNQEKSPMADTNYHMNPNHNGDLEGVT^ 

TMKILEERQGIRSDGGYPFMGNFGQYQMDEMSRFDVVSDQELMAQRYSGNNNGVSLTLGL 
PHCDSLSSTDHQGFMQTHHGIPIGRRVKIGETEEYGPATINGGSSTTTAHSSAAAAAAYN 
GMNIQNQKRYVAQLLPDFVA* 
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>G160 (38. .784) 

TCAAATTTGTCATTTGTTTATTCAAATTTTTGAGAAAATGGTGAGAAGTACCAAAGGTCG 
TCAGAAAATAGAGATGAAAAAAATGGAAAACGAAAGCAACCTTCAGGTTACTTTCTCAAA 
AAGAAGATTCGGTCTTTTCAAAAAAGCTAGTGAACTTTGCACATTAAGTGGTGCAGAGAT 
TCTGTTGATTGTGTTCTCTCCTGGTGGGAAAGTGTTTTCTTTTGGCCATCCAAGTGTTCA 
AGAACTCATTCATCGCTTTTCGAATCCTAACCATAATTCTGCCATTGTCCATCATCAGAA 
CAACAATCTCCAACTTGTTGAAACCCGTCCGGATAGAAATATCCAATATCTCAACAATAT 
ACTCACTGAGGTGCTGGCAAACCAGGAAAAGGAGAAACAGAAGAGAATGGTTTTGGACCT 
ATTGAAAGAATCCAGAGAACAAGTAGGAAACTGGTATGAAAAAGATGTGAAAGATCTCGA 
CATGAATGAAACCAACCAGCTGATATCTGCTCTTCAAGATGTGAAAAAGAAACTGGTAAG 
AGAAATGTCTCAATATTCTCAAGTAAATGTTTCGCAGAATTACTTTGGTCAAAGTTCTGG 
CGTGATTGGTGGTGGTAATGTTGGCATTGATCTTTTTGATCAAAGAAGAAATGCATTCAA 
CTATAATCCAAACATGGTGTTTCCCAATCATACACCACCAATGTTTGGATACAACAATGA 
TGGAGTTCTCGTTCCGATATCCAACATGAACTACATGTCAAGTTACAACTTCAACCAGAG 
CTAGAGTCTGAAGCTAGAAGAACATCCTAATCAATATTTGCGTTATTTTGGCTATGGTTA 
CTGTTAGGATTGTTCTTGTATTGTGAGACTTAAGTTTGTTTTTTCTTTTAATTTGTTTCA 
GTTGGTTGGTTTTTCATTTTATTCGTCGTTTGTTTTCCTTTGTTTTTGGATATTTTTGTA 
TCCCAGAATAAATTTATTTATCCTTTAAAAA 

>G160 Amino Acid Sequence (domain in AA coordinates: 7-62) 

MVRSTKGRQKIEMKKMENESNLQVTFSKRRFGLFKKASELCTLSGAEILLIVFSPGGKVF 

SFGHPSVQELIHRFSNPNHNSAIVHHQNNNLQLVETRPDRNIQYLNNILTEVLANQEKEK 

QKRMVLDLLKESREQVGNWYEKDVKDLDMNETNQLISALQDVKKKLVREMSQYSQVNVSQ 

NYFGQSSGVIGGGNVGIDLFDQRRNAFNYNPNMVFPNHTPPMFGYNNDGVLVPISNMNYM 

SSYNFNQS* 

>G1636 (19..666) 

GAGTAATCATCAACGATTATGGCGTCAAGTCAGTGGACGAGGTCGGAGGATAAGATGTTT 
GAGCAAGCTTTGGTTCTTTTTCCTGAAGGATCTCCTAATCGGTGGGAGAGAATCGCTGAT 
CAGCTTCATAAATCTGCTGGTGAAGTTAGGGAGCATTACGAGGTCTTGGTTCATGATGTT 
TTCGAGATTGATTCTGGTCGAGTTGATGTCCCTGATTACATGGATGACTCGGCGGCTGCG 
GCGGCGGGTTGGGATTCCGCTGGTCAGATCTCTTTTGGGTCTAAACATGGCGAGAGTGAA 
CGGAAAAGAGGAACTCCTTGGACAGAGAACGAACACAAATTGTTTCTGATCGGATTAAAG 
AGATATGGTAAGGGAGATTGGAGGAGTATCTCGAGAAACGTTGTGGTGACGAGGACACCG 
ACGCAAGTCGCGAGTCACGCTCAGAAGTATTTTCTGAGACAGAACTCGGTGAAGAAGGAG 
AGGAAAAGGTCGAGCATCCATGATATAACTACGGTTGATGCTACTTTGGCTATGCCTGGG 
TCTAACATGGACTGGACTGGCCAACACGGGAGTCCTGTTCAGGCGCCGCAGCAGCAACAG 
ATTATGTCTGAGTTCGGTCAGCAATTGAATCCTGGTCATTTCGAGGATTTTGGGTTTCGG 
ATGTGATG 

>G1636 Amino Acid Sequence (domain in AA coordinates: 100-165) 

MASSQWTRSEDKMFEQALVLFPEGSPNRWERIADQLHKSAGEVREHYEVLVHDVFEIDSG 

RVDVPDYMDDSAAAAAGWDSAGQISFGSKHGESERKRGTPWTENEHKLFLIGLKRYGKGD 

WRSISRNVVVTRTPTQVASHAQKYFLRQNSVKKERKRSSIHDITTVDATLAMPGSNMDWT 

GQHGSPVQAPQQQQIMSEFGQQLNPGHFEDFGFRM* 

>G1642 (1..1077) 

ATGGGTCATCACTCATGCTGCAACAAGCAAAAGGTGAAGAGAGGGCTTTGGTCACCTGAA 

GAAGACGAAAAGCTCATCAACTACATCAATTCATATGGCCATGGATGTTGGAGCTCTGTT 

CCTAAACATGCAGGTTTGCAGAGATGTGGAAAGAGTTGTAGATTAAGATGGATAAATTAT 

CTAAGACCTGATCTTAAACGTGGAAGCTTCTCTCCTCAAGAAGCTGCTCTTATCATTGAG 

CTTC^CAGCATTCTTOGTAACAGATGGGCTCAAATTGCTAAACATCTACCTGGAAGAAC^ 

GATAACGAGGTGAAGAATTTCTGGAACTCGAGCATTAAAAAGAAGCTCATGTCTCACCAT 

CATCACGGTCATCATCATCATCATCTCTCTTCCATGGCGAGTTTGCTCACAAACCTTCCT 

TATC^CAATGGATTC^CCCTACTACAGTCGACGATGAAAGTTCAAGATTC^TGTCa^T 

ATCATCACAAACACTAACCCTAATTTCATCACTCCAAGCCATCTCTCTCTTCCTTCTCCT 

CATGTTATGACCCCATTGATGTTCCCAACCTCTAGAGAAGGAGATTTCAAGTTTCTAACC 

ACAAACAACCCAAACCAATCTCATCACCATGATAAT^ 

TTGTCACCCACACCAACTATAAACAATCATCM 

GATAATAATCTCCAATGGCCAGCGTTACCAGATTTCCCAGCGAGTACCATTTCTGGTTTC 
CAAGAAACCCTTCAAGATTATGATGATGCTAATAAACTCAACGTGTTTGTGACACCATTC 
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AACGATAATGCCAAAAAGTTATTATGTGGAGAAGTTCTCGAAGGCAAAGTACTATCTTCC 

TCCTCACCAATTTCACAAGATCACGGCCTTTTTCTTCCCACCACGTACAACTTTCAAATG 

ACTTCTACGAGTGATCATCAACATCATCATCGAGTGGACTCATACATCAATCACATGATC 

ATACCATCATCATCCTCATCGTCGCCAATCTCTTGTGGACAGTACGTCATAACTTAA 

>G1642 Amino Acid Sequence (domain in AA coordinates: TBD) 

MGHHSCCNKQKVKRGLWSPEEDEKLINYINSYGHGCWSSVPKHAGLQRCGKSCRLRWINY 

LRPDLKRGSFSPQEAALIIELHSILGNRWAQIAKHLPGRTDNEVKNFWNSSIKKKLMSHH 

HHGHHHHHLSSMASLLTNLPYHNGFNPTTVDDESSRFMSNIITNTNPNFITPSHLSLPSP 

HVMTPLMFPTSREGDFKFLTTNNPNQSHH^ 

DNNLQWPALPDFPASTISGFQETLQDYDDANKLNVFVTPFNDNAKKLLCGEVLEGKVLSS 

SSPISQDHGLFLPTTYNFQMTSTSDHQHHHRVDSYINHMIIPSSSSSSPISCGQYVIT* 
>G1747 (1..777) 

ATGAAAATGATGCAAGAGGAGGGAAACCGAAAAGGTCCATGGACAGAACAGGAAGACATA 

CTTCTGGTAAATTTTGTTCACTTATTTGGAGATCGACGATGGGATTTTATAGCAAAAGTA 

TCAGGTTTGAACAGAACAGGAAAGAGTTGCAGGCTAAGATGGGTTAATTACCTACATCCT 

GGTCTCAAACGTGGCAAGATGACGCCTCAAGAAGAGCGCCTCGTCCTTGAGCTTCACGCT 

AAGTGGGGAAACAGGTGGTCGAAAATAGCCCGAAAATTGCCGGGACGAACGGATAACGAG 

ATAAAGAACTACTGGAGGACTCATATGAGGAAGAAAGCTCAAGAAAAGAAGCGTCCTGTT 

TCCCCAACTTCCTCATTTTCCAACTGCAGCTCGTCATCTGTGACCACTACCACCACCAAT 

ACTCAAGATACATCGTGCCACTCGCGTAAATCTTCAGGGGAAGTGAGCTTTTACGACACT 

GGAGGTTCCCGATCCACTAGAGAGATGAATCAAGAAAACGAAGACGTGTACTCGTTGGAT 

GATATATGGAGAGAGATTGATCACTCAGCAGTAAACATAATAAAACCGGTTAAAGACATC 

TACTCAGAACAAAGCCATTGCTTAAGTTACCCAAATCTAGCTTCACCATCATGGGAAAGC 

TCATTGGATTCTATATGGAACATGGATGCAGATAAAAGTAAGATATCGTCTTACTTTGCA 

AATGATCAGTTTCCTTTCTGTTTCCAACACAGTAGATCACCATGGTCGTCAGGTTAA 

>G1747 Amino Acid Sequence (domain in AA coordinates: 11-114) 

MKMMQEEGNRKGPWTEQEDILLVNFVHLFGDRRWDFIAKVSGLNRTGKSCRLRWVNYLHP 

GLKRGKMTPQEERLVLELHAKWGNRWSKIARKLPGRTDNEIKNYWRTHMRKKAQEKKRPV 
SPTSSFSNCSSSSVTTTTTNTQDTSCHSRKSSGEVSFYDTGGSRSTREMNQENEDVYSLD 
DIWREIDHSAVNIIKPVKDIYSEQSHCLSYPNLASPSWESSLDSIWNMDADKSKISSYFA 
NDQFPFCFQHSRSPWSSG* 
>G1749 (59..535) 

CAAC7VCTTCTCAGTGACCGTGAGCAACGAATTATTTTCAGTTCAACGACTCCGCGGAAAT 

GGAAAATTC^GAAAATGTTCCCTCTTACGATC^U^CATCAATTTCACTCCTAATTTGAC 

GAGAGATCAAGAACATGTGATCATGGTCTCTGCTTTGCAACAAGTAATATCCAACGTCGG 

AGGTGACACGAACTCGAATGCATGGGAAGCTGATCTTCCACCTTTGAACGCTGGCCCTTG 

TCCTCTTTGTAGTGTCACCGGCTGCTACGGTTGCGTCTTCCCACGACACGAGGCGATAAT 

TAAGAAGGAGAAGAAGCACAAAGGAGTGAGGAAAAAACCATCAGGTAAATGGGCGGCGGA 

GATATGGGATCCGAGTTTGAAAGTAAGGAGATGGCTTGGAACGTTTCCAACAGCGGAGAT 

GGCGGCTAAGGCTTACAACGATGCGGCGGCTGAGTTTGTCGGAAGAAGATCAGCAAGACG 

TGGCACAAAGAACGGAGAGGAAGCATCTACCAAGAAGACGACTGAGAAAAATTAACGGAG 

AAGGAGCACGTATAGT^AAGGCAGGAAGAGGCATCTTACTTGCTTCACAAGTAAATCAGAA 

TTTTTTTGAAAAGTAAAAACGTTATTTTGTTTGGTAATAAAATAAAGTAAAA 

TGCTAACGCAAGACTTATCAAGTTCAGTCGTGACTGTGAGTGTGTTTTTATGTATCTTAC 

TTCATTTTTTGTCTTTCAATTGTGTGTGTGTGTGT 

>G1749 Amino Acid Sequence (conserved domain in AA coordinates: 84-155) 

MENSENVPSYDQNINFTPNLTRDQEHVIMVSALQQVISNVGGDTNSNAWEADLPPLNAGP 

CPLCSVTGCYGCVFPRHEAIIKKEKKHKGVRKKPSGKWAAEIWDPSLKVRRWLGTFPTAE 

MAAKAYNDAAAEFVGRRSARRGTKNGEEASTKKTTEKN* 

>G1751 (117..923) 

AAACACAAACAAAACTCATATTTTCAATCTCCAGGTGCTTTACACCAA 
AAAACAAAAACCAAACTCGGATTTAGTTTGACAGAAGAAGGAATCGAGAGTCGGGTATGC 
ATTATCCTAACAACAGAACCGAATTCGTCGGAGCTCCAGCCCCAACCCGGTATCAAAAGG 
AGC^GTTGTCACCGGAGCAAGAGCTTTCAGTTATTGTCT 

CAGGGGAAAACGAAACGGCGCCGTGTCAGGGTTTTTCCAGTGACAGCACAGTGATAAGCG 

CGGGAATGCCTCGGTTGGATTCAGACACnTGTCAAGTCTC 

GCIX3TAACTACTTTTTCGCGCCAAATCAGAGAATTGAAAAGAATC 
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AGATTACTAGTAGTAGTAACAGAAGAAGAGAGAGCTCTCCCGTGGCGAAGAAAGCGGAAG 

GTGGCGGGAAAATCAGGAAGAGGAAGAACAAGAAGAATGGTTACAGAGGAGTTAGGCAAA 

GACCTTGGGGAAAATTTGCAGCTGAGATCAGAGATCCTAAAAGAGCCACACGTGTTTGGC 

TTGGTACTTTCGAAACCGCCGAAGATGCGGCTCGAGCTTATGATCGAGCCGCGATTGGAT 

TCCGTGGGCCAAGGGCTAAACTCAACTTCCCCTTTGTGGATTACACGTCTTCAGTTTCAT 

CTCCTGTTGCTGCTGATGATATAGGAGCAAAGGCAAGTGCAAGCGCCAGTGTGAGCGCCA 

CAGATTCAGTTGAAGCAGAGCAATGGAACGGAGGAGGAGGGGATTGCAATATGGAGGAGT 

GGATGAATATGATGATGATGATGGATTTTGGGAATGGAGATTCTTCAGATTCAGGAAATA 

CAATTGCTGATATGTTCCAGTGATAAATGAGCTCTTTCTTGTTGGCGTTTTTTGGAGTTA 

AGTGCAAGAAGAGATTGACACTGTGGCTTGTTTAAAGTGAACAAGAACAAGAAAGCATGT 

AATTAGTAGTCTCATTCTTTTGTTTGTGGTCAATTCTATGTTTATCTCATATAAAATCTG 

AGTTAAACCTATCTGAGGAGAGAGTAAATAAAGAGGTTAAGAA 

>G1751 Amino Acid Sequence (domain in AA coordinates: TBD) 

MHYPNNRTEFVGAPAPTRYQKEQLSPEQELSVIVSALQHVISGENETAPCQGFSSDSTVI 

SAGMPRLDSDTCQVCRIEGCLGCNYFFAPNQRIEKNHQQEEEITSSSNRRRESSPVAKKA 

EGGGKIRKRKNKKNGYRGVRQRPWGKFAAEIRDPKRATRVWLGTFETAEDAARAYDRAAI 

GFRGPRAKLNFPFVDYTSSVSSPVAADDIGAKASASASVSATDSVEAEQWNGGGGDCNME 

EWMNMMMMMDFGNGDSSDSGNTIADMFQ* 

>G1752 (25..756) 

AAAAAAAAAAAAAAAAAAAAACTTATGGAATATTCCCAATCTTCCATGTATTCATCTCCA 
AGTTCTTGGAGCTCATCACAAGAATCACTCTTATGGAACGAGAGCTGTTTCTTGGATCAA 
TCATCTGAACCTCAAGCCTTCTTTTGCCCTAATTATGATTACTCCGATGACTTTTTCTCA 
TTTGAGTCACCGGAGATGATGATTAAGGAAGAAATTCAAAACGGCGACGTTTCTAACTCC 
GAAGAAGAAGAAAAGGTTGGAATTGATGAAGAAAGATCATACAGAGGAGTGAGGAAAAGG 
CCGTGGGGGAAATTTGCAGCGGAGATAAGAGATTCAACGAGGAATGGAATTAGGGTTTGG 
CTCGGGACATTTGACAAAGCCGAGGAAGCCGCTCTTGCTTATGATCAAGCGGCTTTCGCC 
ACAAAAGGATCTCTTGCAACACTTAATTTCCCGGTGGAAGTGGTTAGAGAGTCGCTAAAG 
AAAATGGAGAATGTGAATCTTCATGATGGAGGATCTCCGGTTATGGCCTTGAAGAGAAAA 
CATTCTCTTCGAAACCGGCCTAGAGGGAAAAAGCGATCCTCTTCTTCTTCTTCTTCTTCT 
TCTAATTCTTCTTCTTGCTCTTCTTCTTCGTCTACTTCTTCAACATCAAGAAGTAGTAGT 
AAGCAGAGTGTTGTGAAGCAAGAAAGTGGTACACTTGTGGTTTTTGAAGATTTAGGTGCT 
GAGTATTTAGAACAACTTCTTATGAGCTCATGTTGATCTTGTAATTGATTTCAGCAAAAG 
CCACTATTAAACTTTAATTTTGTGATAATTAATCTTGAAATTTGTTTTGTTCATTCTGCA 
ATTTCTTTGGTTCTCTTATTTTTTGTTTGTTGTATCCAAATGAAATTATTGGAAGAGATG 
GTGATGTTAAAGTGTATATATATAAAAAAAAAA 

>G1752 Amino Acid Sequence (domain in AA coordinates: TBD) 
MEYSQSSMYSSPSSWSSSQESLLWNESCFLDQSSEPQAFFCPNYDYSDDFFSFESPEMMI 
KEEIQNGDVSNSEEEEKVGIDEERSYRGVRKRPWGKFAAEIRDSTRNGIRVWLGTFDKAE 
EAALAYDQAAFATKGSLATLNFPVEVVRESLKKMENVNLHDGGSPVMALKRKHSLRNRPR 
GKKRSSSSSSSSSNSSSCSSSSSTSSTSRSSSKQSVVKQESGTLVVFEDLGAEYLEQLLM 
SSC* 

>G1763 (33..977) 

GTACATTTTTTTTTGTATTTCAGGAAACTCCGATGGCGGATCTCTTCGGTGGTGGCCACG 
GCGGCGAGCTTATGGAAGCACTTCAACCTTTTTACAAAAGTGCTTCCACGTCTGCTTCAA 
ATCCTGCGTTTGCGTCCTCAAACGATGCGTTTGCGTCTGCCCCTAACGACCCATTTTCTT 
CTTCTTCTTACTATAATCCTCATGCATCTTTCTTCCCTTCACATTCCACAACCACTTACC 
CGGATATTTATTCTGGATCCATGACCTATCCATCTTCATTCGGGTCGGATCTTCAACAAC 
CCGAAAACTACCAAffCTCAGTTCCATTACCAAAACACTATCACTTACACTC^CCAAGAC^ 
ACAACACTTGCATGCTCAACTTCATTGAGCCGAGCCAACCGGATTTTATGACCCAACCGG 
GTCCGAGTTCGGGTTCGGTTTCAAAACCGGCTAAGCTCTATAGAGGAGTGAGGCAAAGAC 
ATTGGGGAAAATGGGTCGCGGAGATCCGTTTACCCAGGAACCGAACCCGACTTTGGCTCG 
GAACATTCGACACGGCTGAAGAAGCCGCGTTGGCTTATGATCGCGCCGCGTTTAAGCTTC 
GTGGTGACTCGGCTCGGCTTAACTTCCCAGCTCTCCGATACCAAACCGGCTCGTCTCCGT 
CTGACGTTGGCGAATACGGACCTATTCAAGCTGCCGTTGACGCCAAGCTAGAAGCCATAT 
TAGCTGAGCCGAAGAATCAGCCGGGCAAAACGGAGAGGACGTCGAGGAAACGAGCTAAAG 
CCGCGGCTTCTTCAGCTCAGCAGCCGTCAGCGCCACAACAACATTCCGGGTCGGGTGAAA 
GTGATGGGTCGGGTTCACCGACTTCGGATGTTATGGTGCAGGAGATGTGCCAAGAGCCAG 
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AGATGCCATGGAATGAAAATTTCATGCTCGGCAAGTGTCCTTCTTATGAGATAGATTGGG 

CTTCAATTTTATCGTGAAAAATTAGGATTCAATTCATTTTTATTCATTTTAACTTGTTTG 

TATTTTCTTTTAAACTTTAGGGTTATTAGCTGTGCGTAAAATTTGTAATTTAGCATTTTG 

TATGAATGTAATGCAAGTGTGTAAATTATGGACAGCTCAAGCTTTTTTGTTAAAA 

>G1763 Amino Acid Sequence (conserved domain in AA coordinates: 140-209) 

MADLFGGGHGGELMEALQPFYKSASTSASNPAFASSNDAFASAPNDPFSSSSYYNPHASF 

FPSHSTTTYPDIYSGSMTYPSSFGSDLQQPENYQSQFHYQNTITYTHQDNNTCMLNFIEP 

SQPDFMTQPGPSSGSVSKPAKLYRGVRQRHWGKWVAEIRLPRNRTRLWLGTFDTAEEAAL 

AYDRAAFKLRGDSARLNFPALRYQTGSSPSDVGEYGPIQAAVDAKLEAILAEPKNQPGKT 

ERTSRKRAKAAASSAEQPSAPQQHSGSGESDGSGSPTSDVMVQEMCQEPEMPWNENFMLG 

KCPSYEIDWASILS* 

>G1766 (32..1216) 

AGGCTATTCTCGGAAAAACAAAGAATAAAGAATGAATTCGTTTTCACAAGTACCTCCTGG 
CTTCAGATTTCATCCTACTGATGAAGAACTTGTAGACTACTACTTGAGGAAAAAAGTTGC 
ATCAAAGAGAATAGAAATCGATATCATCAAGGATGTTGATCTTTACAAGATTGAGCCATG 
TGATCTTCAAGAGTTATGCAAGATAGGAAACGAAGAGCAGAGCGAATGGTACTTCTTTAG 
TCATAAAGACAAGAAGTATCCCACGGGAACTCGAACCAATAGAGCCACGAAAGCAGGATT 
TTGGAAAGCCACTGGAAGAGACAAGGCTATATATATAAGACATAGTCTTATCGGTATGAG 
GAAAACACTTGTGTTTTACAAAGGAAGAGCCCCAAATGGTCAGAAATCCGATTGGATCAT 
GCACGAATATCGCTTAGAAACAAGTGAAAATGGAACCCCTCAGGAAGAAGGATGGGTAGT 
ATGTAGGGTATTCAAGAAGAAATTGGCAGCGACAGTGAGGA7U\ATGGGAGATTACCATTC 
ATCACCATCGCAGCATTGGTACGATGATCAGCTCTCTTTTATGGCCTCCGAGATCATTTC 
TAGTTCTCCACGACAGTTTCTTCCCAATCATCATTATAACCGCCACCATCACCAGCAGAC 
ATTGCCTTGTGGCCTCAATGCATTCAACAACAACAATCCTAACTTGCAATGCAAGCAAGA 
GCTCGAGTTACATTACAATCAAATGGTACAACATCAACAACAAAACCATCATCTTCGTGA 
ATCTATGTTTCTCCAGCTTCCTCAGCTCGAAAGCCCTACCAGTAATTGCAATTCTGACAA 
CAACAATAACACAAGAAATATTAGTAACTTGCAGAAATCATCAAATATATCTCATGAGGA 
ACAATTGCAACAAGGGAATCAAAGTTTCAGCTCTCTGTATTACGATCAAGGAGTAGAGCA 
AATGACTACTGACTGGAGAGTTCTCGATAAATTTGTTGCTTCACAGCTTAGCAATGATGA 
AGAGGCTGCAGCCGTGGTTTCTTCTTCTTCTCATCAAAACAACGTCAAGATTGACACGAG 
AAACACGGGTTATCATGTGATAGATGAGGGAATAAATTTGCCGGAGAATGATTCTGAAAG 
GGTTGTTGAAATGGGAGAAGAGTATTCAAATGCTCATGCTGCTTCTACTTCTTCAAGTTG 
TCAGATTGATCTCTAGAAATAGTGATAGAGAGATGAAAAAGATGCAAGGTGAATATATAT 
GAAAATACATGCACACTAGTGTTATTTATACTTAAAGATGGAAGGGGAAAAACAAGGAGT 
TATTTCCTGGATTTATGGAGGTTTTGTACATAATAAAAACCTACAACCATATGGTATTTT 
CTTTTGAAAAAAAAAAAAAAAAAAAAAAAA 

>G1766 Amino Acid Sequence (domain in AA coordinates: 10-153) 

MNSFSQVPPGFRFHPTDEELVDYYLRKKVASKRIEIDIIKDVDLYKIEPCDLQELCKIGN 

EEQSEWYFFSHKDKKYPTGTRTNRATKAGFWKATGRDKAIYIRHSLIGMRKTLVFYKGRA 

PNGQKSDWIMHEYRLETSENGTPQEEGWVVCRVFKKKLAATVRKMGDYHSSPSQHWYDDQ 

LSFMASEIISSSPRQFLPNHHYNRHHHQQTLPCGLNAFNNNNPNLQCKQELELHYNQMVQ 

HQQQNHHLRESMFLQLPQLESPTSNCNSDNNNNTRNISNLQKSSNISHEEQLQQGNQSFS 
SLYYDQGVEQMTTDWRVLDKFVASQLSNDEEAAAVVSSSSHQNNVKIDTRNTGYHVIDEG 
INLPENDSERVVEMGEEYSNAHAASTSSSCQIDL* 
>G1767 (1..1596) 

ATGGATACTCTCTTTAGACTAGTCAGTCTCCAACy^CAACAACAATCCGATAGTATCATT 
ACAAATCAATCTTCGTTAAGCAGAACTTCCACCACCACTACTGGCTCTCCACAAACTGCT 
TATCACTACAACTTYCCACAAAACGACGTCGTCGAAGAATC 

GAAGAAGACCTTTCCTCTTCTTCTTCTCACCACAACCATCACAACCACAACAATCCTAAT 
ACTTACTACTCTCCTTTCACTACTCCC^^ 

TCCTCCACCGCCGC^GCCGCAGCTTTAGCCTCGCCTTACTCCTCCTCCGGCCACCATAAT 
GACCCTTCCGCGTTCTCC^TACCTC^^CTCCTCCGTCCTTCGACTTCTCAGCCAATGCC 
AAGTGGGCAGACTCGGTCCTTCTTGAAGCGGCACGTGCCTTCTCCGACAAAGACACTGCA 
CGTGCGCAACAAATCCTATGGACGCTCAACGAGCTCTCTTCTCCGTACGGAGACACCGAG 
CAAAAACTGGCTTCTTACTTCCTCCAAGCTCTCTTCAACCGCATGACCGGTTCAGGCGAA 
CGATGCTACCGAACCATGGTAACAGCTGCAGCCACAGAGAAGACTTGCTCCTTCGAGTCA 
ACGCGAAAAACTGTACTAAAGTTCCAAGAAGTTAGCCCCTGGGCCACGTTTGGACACGTG 
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GCGGCAAACGGAGCAATCTTGGAAGCAGTAGACGGAGAGGCAAAGATCCACATCGTTGAC 
ATAAGCTCCACGTTTTGCACTCAATGGCCGACTCTTCTAGAAGCTTTAGCCACAAGATCA 
GACGACACGCCTCACCTAAGGCTAACCACAGTTGTCGTGGCCAACAAGTTTGTCAACGAT 
CAAACGGCGTCGCATCGGATGATGAAAGAGATCGGAAACCGAATGGAGAAATTCGCTAGG 
CTTATGGGAGTTCCTTTCAAATTTAACATTATTCATCACGTTGGAGATTTATCTGAGTTT 
GATCTCAACGAACTCGACGTTAAACCAGACGAAGTCTTGGCCATTAACTGCGTAGGCGCG 
ATGCATGGGATCGCTTCACGTGGAAGCCCTAGAGACGCTGTGATATCGAGTTTCCGACGG 
TTAAGACCGAGGATTGTGACGGTCGTAGAAGAAGAAGCTGATCTTGTCGGAGAAGAAGAA 
GGTGGCTTTGATGATGAGTTCTTGAGAGGGTTTGGAGAATGTTTACGATGGTTTAGGGTT 
TGCTTCGAGTCATGGGAAGAGAGTTTTCCAAGGACGAGCAACGAGAGGTTGATGCTAGAG 
CGTGCAGCGGGACGTGCGATCGTTGATCTTGTGGCTTGTGAGCCGTCGGATTCCACGGAG 
AGGCGAGAGACAGCGAGGAAGTGGTCGAGGAGGATGAGGAATAGTGGGTTTGGAGCGGTG 
GGGTATAGTGATGAGGTGGCGGATGATGTCAGAGCTTTGTTGAGGAGATATAAAGAAGGT 
GTTTGGTCGATGGTACAGTGTCCTGATGCCGCCGGAATATTCCTTTGTTGGAGAGATCAG 
CCGGTGGTTTGGGCTAGTGCGTGGCGGCCAACGTAA 

>G1767 Amino Acid Sequence (domain in AA coordinates: 255-272) 

MDTLFRLVSLQQQQQSDSIITNQSSLSRTSTTTTGSPQTAYHYNFPQNDVVEECFNFFMD 

EEDLSSSSSHHNHHNHNNPNTYYSPFTTPTQYHPATSSTPSSTAAAAALASPYSSSGHHN 

DPSAFSIPQTPPSFDFSANAKWADSVLLEAARAFSDKDTARAQQILWTLNELSSPYGDTE 

QKLASYFLQALFNRMTGSGERCYRTMVTAAATEKTCSFESTRKTVLKFQEVSPWATFGHV 

AANGAILEAVDGEAKIHIVDISSTFCTQWPTLLEALATRSDDTPHLRLTTVVVANKFVND 

QTASHRMMKEIGNRMEKFARLMGVPFKFNIIHHVGDLSEFDLNELDVKPDEVLAINCVGA 

MHGIASRGSPRDAVISSFRRLRPRIVTVVEEEADLVGEEEGGFDDEFLRGFGECLRWFRV 

CFESWEESFPRTSNERLMLERAAGRAIVDLVACEPSDSTERRETARKWSRRMRNSGFGAV 

GYSDEVADDVRALLRRYKEGVWSMVQCPDAAGIFLCWRDQPVVWASAWRPT* 
>G1778 (1..627) 

ATGATGGGATACCAAACAAACTCTAATTTCTCCATGTTTTTTTCCTCGGAAAATGACGAC 

CAAAACCACCACAACTACGATCCTTATAATAATTTCTCTTCATCAACTTCTGTTGATTGC 

ACTCTCTCACTTGGAACACCCTCTACTCGTCTCGACGACCACCATAGATTTTCTTCTGCT 

AATTCTAACAACATCTCCGGCGACTTTTATATTCACGGAGGAAACGCTAAGACTTCTTCG 

TACAAGAAGGGTGGTGTTGCTCATAGCCTACCTCGCCGTTGTGCTAGCTGCGACACCACT 

TCAACTCCTCTATGGAGAAACGGACCAAAAGGACCTAAGTCGTTATGTAACGCGTGTGGA 

ATCCGATTCAAGAAAGAGGAGAGGCGTGCGACGGCCAGAAACTTAACGATCTCCGGTGGA 

GGTTCATCAGCGGCAGAAGTCCCAGTAGAGAATTCGTACAACGGAGGTGGAAACTATTAC 

AGTCATCATCATCATCACTATGCCTCGTCGTCGCCGTCGTGGGCTCATCAGAACACACAA 

AGAGTTCCATATTTCTCACCGGTTCCGGAGATGGAATATCCCTACGTGGATAACGTCACG 
GCTTCTTCTTTTATGTCTTGGAATTGA 

>G1778 Amino Acid Sequence (domain in AA coordinates: 94-119) 
MMGYQTNSNFSMFFSSENDDQNHHNYDPYNNFSSSTSVDCTLSLGTPSTRLDDHHRFSSA 
NSNNISGDFYIHGGNAKTSSYKKGGVAHSLPRRCASCDTTSTPLWRNGPKGPKSLCNACG 
IRFKKEERRATARNLTISGGGSSAAEVPVENSYNGGGNYYSHHHHHYASSSPSWAHQNTQ 
RVPYFSPVPEMEYPYVDNVTASSFMSWN* 
>G1789 (108..413) 

CAAGGACTCTGCGACATCTGTGCAACATATCATTTCCTCAGAATCTCTTTCTTTTCTAGG 

TTTATTACTACACAAAACCAAACATCATCAACTTTAGTTACTAAACAATGGC^ 

CAATGTCTTCTTATGGCTCTGGCTCATGGACTGTTAAGCAGAACAAAGCCTTTGAGCGTG 

CTCTAGC^GTCTATGACCAAGACACTCCGGACCGTTGGCACAATGTTGCTAGAGCTGT^ 

GTGGTAAAACACCAGAAGAAGCTAAGAGACAGTATGACCTTCTAGTTCGTGACATCGAAA 

GCATCGAGAATGGTCACGTGCCATTCCCTGACTACAAGACTACTACAGGAAACAGCAACA 

GAGGCAGGCTGCGTGATGAGGAAAAGAGGATGAGAAGCATGAAGCTGCAGTGAGACAAGA 

AGCAACAAAACCTAACTACGTATGATCGTCAAAATAAAAGAGAATCACTTCAGA 

TGTTTTTTTGAATGTCTGACGAATCAATGTTO 

TAAGAAATGGTTTTTTTTTCGAGGCAACAAAAAAAAAA 

>G1789 Amino Acid Sequence (domain in AA coordinates: 1-50) 
MASGSMSSYGSGSWTVKQNKAFERALAVYDQDTPDRWHNVARAVGGKTPEEAKRQYDLLV 
RDIESIENGHVPFPDYKTTTGNSNRGRLRDEEKRMRSMKLQ* 
>G1790 (63..1346) 
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GAAAAAGACTTCACTTTTTTTTTTTACTAATTAATTAGTTTTTTTTTCTCCTTTCCAAAA 
CAATGGAGAATTTCGTCGACGAGAATGGTTTTGCTTCTCTAAACCAAAACATCTTCACAC 
GTGATCAAGAACACATGAAAGAAGAAGATTTTCCATTCGAAGTCGTCGACCAATCAAAAC 
CTACAAGCTTTCTTCAAGATTTTCACCATCTTGATCATGATCATCAGTTTGATCATCATC 
ATCATCATGGCTCCTCATCTTCACATCCTTTGCTCAGCGTCCAAACTACGTCTTCTTGTA 
TCAATAATGCTCCTTTCGAGCATTGCTCTTACCAAGAAAACATGGTCGATTTCTATGAAA 
CTAAACCAAATTTGATGAATCATCATCATTTCCAAGCAGTGGAAAACTCATACTTCACTC 
GTAATCATCATCATCATCAAGAGATCAATTTGGTCGATGAACATGATGATCCTATGGACT 
TGGAGCAAAACAACATGATGATGATGAGGATGATCCCTTTTGATTACCCTCCTACAGAGA 
CTTTCAAACCTATGAACTTCGTAATGCCAGATGAAATTTCATGTGTTTCTGCAGATAATG 
ATTGTTATAGAGCAACGAGTTTCAACAAGACCAAACCATTTCTTACACGAAAGTTGTCTT 
CTTCTTCTTCATCATCATCATGGAAAGAAACCAAAAAGTCAACCTTAGTCAAAGGACAAT 
GGACTGCTGAAGAAGACAGGGTACTGATTCAACTCGTGGAGAAGTATGGATTGCGTAAAT 
GGTCGCATATCGCTCAAGTGTTACCGGGAAGAATCGGGAAACAATGTAGAGAGAGGTGGC 
ATAACCATTTGAGACCTGACATTAAGAAAGAAACATGGAGTGAAGAAGAGGACAGAGTGT 
TGATAGAATTTCACAAAGAGATTGGAAACAAATGGGCAGAGATTGCGAAAAGACTCCCGG 
GAAGAACAGAGAACTCGATCAAGAACCATTGGAACGCAACAAAAAGAAGACAATTCTCTA 
T^AAGAAAATGTAGATCTAAGTATCCAAGACCTTCTCTGTTGCAGGATTACATCAAGAGCT 
TGAATATGGGAGCTTTGATGGCTTCTTCTGTTCCTGCAAGAGGTAGACGCAGAGAGAGTA 
ATAACAAGAAGAAGGATGTTGTTGTTGCGGTTGAGGAGAAGAAGAAGGAAGAGGAGGTGT 
ATGGACAAGACAGGATTGTGCCTGAATGTGTGTTTACTGATGATTTTGGATTCAATGAGA 
AGCTGCTTGAGGAAGGATGTAGCATTGACTCTTTGCTTGATGACATTCCTCAGCCTGACA 
TTGATGCTTTTGTTC^TGGGCTCTGATTTGTATTTTTTATTCTGCTTGTTTCAGTTTTGT 
TGTTTTTTGTTTGTCTTTTTATACGAGACAGATTCCACCAAACTTCAATAATTTGAAAAG 
ATATAAAATATTTTGCTTTTTAAAAAAAAAAAAAAAAAAAAAAA 

>G1790 Amino Acid Sequence (conserved domain in AA coordinates : 217-316) 
MENFVDENGFASLNQNIFTRDQEHMKEEDFPFEVVDQSKPTSFLQDFHHLDHDHQFDHHH

HHGSSSSHPLLSVQTTSSCINNAPFEHCSYQENMVDFYETKPNLMNHHHFQAVENSYFTR

NHHHHQEINLVDEHDDPMDLEQNNMMMMRMIPFDYPPTETFKPMNFVMPDEISCVSADND

CYRATSFNKTKPFLTRKLSSSSSSSSWKETKKSTLVKGQWTAEEDRVLIQLVEKYGLRKW

SHIAQVLPGRIGKQCRERWHNHLRPDIKKETWSEEEDRVLIEFHKEIGNKWAEIAKRLPG

RTENSIKNHWNATKRRQFSKRKCRSKYPRPSLLQDYIKSLNMGALMASSVPARGRRRESN

NKKKDVVVAVEEKKKEEEVYGQDRIVPECVFTDDFGFNEKLLEEGCSIDSLLDDIPQPDI

DAFVHGL*

>G1791 (36.. 455) 

ATGTACATGC^UU^AACAAAAACCTTAAAAGCTTTCATGGAACGTATAGAGTCTTATAA^ 
CGAATGAGATGAAATACAGAGGCGTACGAAAGCGTCCATGGGGAAAATATGCGGCGGAGA 
TTCGCGACTCAGCTAGACACGGTGCTCGTGTTTGGCTTGGGACGTTTAACACAGCGGAAG 
ACGCGGCTCGGGCTTATGATAGAGCAGCTTTCGGCATGAGAGGCCAAAGGGCCATTCTCA 
ATTTTCCTCACGAGTATCAAATGATGAAGGACGGTCCAAATGGCAGCCACGAGAATGCAG 
TGGCTTCCTCGTCGTCGGGATATAGAGGAGGAGGTGGTGGTGATGATGGGAGGGAAGTTA 
TTGAGTTCGAGTATTTGGATGATAGTTTATTGGAGGAGCTTTTAGATTATGGTGAGAGAT 
CTAACCAAGACAATTGTAACGACGCAAACCGCTAGATC^TCACTACTTACTTACAGTGTA 
ATGTTTTTGGAGTAAAGAGTAATAATCAATATAATATACTTTAGTTTAGGAAAAAAAAAA 
AAAAAAAAA 

>G1791 Amino Acid Sequence (domain in AA coordinates: TBD)

MERIESYNTNEMKYRGVRKRPWGKYAAEIRDSARHGARVWLGTFNTAEDAARAYDRAAFG

MRGQRAILNFPHEYQMMKDGPNGSHENAVASSSSGYRGGGGGDDGREVIEFEYLDDSLLE 

ELLDYGERSNQDNCNDANR* 

>G1793 (59.. 1783) 

AGTGATTTATTGATTAACCCAAACACAAAATAAACA 

GAATTCTAACAACTGGCTTGGCTTTCCTCTTTCACCGAACAACTCTTCT^ 

TGAATACAACCTITGGCTTGGTCAGCGACCATATGGACAACCCTTTTCAAAC^ 

GAATATGATCAATCCACACGGTGGAGGAGGAGATGAAGGAGGAGAGGTTCCAAT^AGTGGC 

CGATTTTCTCGGTGTGAGC^^CCGGACGAAAACCAATCCAACC^CCTAGTAGCTTAC^ 

CGACTC^GACTACTACTTCC^TACCAATAGCrrGATGCCTAGC^^ 

CGTTGTAGCAGCTTGTGACTCC7VATACTCCTAACAACAGTAGCTATCATGAGCTTCAAGA 






GAGTGCTCACAATCTACAGTCACTTACTTTGTCCATGGGGACCACCGCTGGTAATAATGT 

TGTAGACAAAGCTTCACCATCCGAGACCACCGGGGATAACGCTAGCGGTGGAGCACTAGC 

CGTTGTTGAGACGGCCACGCCAAGACGTGCATTGGACACTTTCGGACAACGAACCTCGAT 

CTATCGTGGTGTCACAAGACATCGATGGACTGGTCGATATGAGGCTCATCTATGGGATAA 

TAGTTGTAGAAGGGAAGGCCAGTCTAGGAAAGGAAGACAAGTTTACTTGGGTGGATATGA 

CAAAGAAGATAAAGCAGCAAGATCATATGATCTAGCTGCACTTAAGTACTGGGGTCCTTC 

AACTACTACTAATTTCCCCATTACAAACTACGAGAAAGAAGTAGAGGAAATGAAGCACAT 

GACGAGACAAGAGTTCGTGGCTGCCATTAGAAGGAAAAGTAGTGGATTTTCGAGAGGCGC 

TTCGATGTATCGAGGAGTTACAAGGCATCACCAACATGGAAGATGGCAAGCAAGGATCGG 

CCGAGTCGCCGGAAACAAAGACCTCTACTTGGGAACTTTTAGCACTGAGGAAGAAGCAGC 

AGAAGCTTACGATATAGCTGCAATAAAGTTTAGAGGACTTAATGCAGTGACCAACTTCGA 

GATCAACCGGTACGACGTGAAAGCCATTCTAGAGAGTAGCACTCTTCCCATCGGAGGAGG 

CGCAGCTAAACGGCTCAAAGAAGCTCAAGCTCTTGAGTCTTCAAGGAAACGCGAGGCGGA 

GATGATAGCCCTTGGTTCAAGTTTCCAGTACGGTGGTGGCTCGAGCACAGGCTCTGGCTC 

CACCTCATCAAGACTTCAGCTTCAACCTTACCCTCTAAGCATTCAACAACCATTAGAGCC 

TTTTCTATCTCTTCAGAACAATGACATCTCTCATTACAACAACAACAATGCTCACGATTC 

CTCCTCTTTTAATCACCATAGCTATATCCAGACAC7^ACTTCATCTCCACCAACAGACCAA 

CAATTACTTGCAGCAACAGTCGAGCCAGAACTCTCAGCAGCTCTACAATGCGTATCTTCA 

TAGCAATCCGGCTCTGCTTCATGGACTTGTCTCTACCTCTATCGTTGACAACAATAATAA 

CAATGGAGGCTCTAGTGGGAGCTACAACACTGCAGCATTTCTTGGGAACCACGGTATTGG 

TATTGGGTCCAGCTCGACTGTTGGATCGACCGAGGAGTTTCCAACCGTTAAAACAGATTA 

CGATATGCCTTCCAGTGATGGAACCGGAGGGTATAGTGGTTGGACCAGTGAGTCTGTTCA 

GGGGTCAAACCCTGGTGGTGTTTTCACTATGTGGAATGAGTAAACAAGGATCTCTTTCTT 

GCGGCACAAGGAATGGGT 

>G1793 Amino Acid Sequence (conserved domain in AA coordinates : 179-255, 281- 
MNSNNWLGFPLSPNNSSLPPHEYNIXSLVSDH^NPFQTQEWNMINPHGGGGDEGGEV 
ADFLGVSKPDENQSNHLVAYNDSDYYFHTNSLMPSVQSNDVWAACDSNTPNNSSYHELQ 
ESAHNLQSLTLSMGTTAGNIW/DKASPSETTGD^ 

IYRGVTRHRWTGRYEAHLWDNSCRREGQSRKGRQVYLGGYDKEDKAARSYDLAALKYWGP

STTTNFPITNYEKEVEEMKHMTRQEFVAAIRRKSSGFSRGASMYRGVTRHHQHGRWQARI

GRVAGNKDLYLGTFSTEEEAAEAYDIAAIKFRGLNAVTNFEINRYDVKAILESSTLPIGG

GAAKRLKEAQALESSRKREAEMIALGSSFQYGGGSSTGSGSTSSRLQLQPYPLSIQQPLE

PFLSLQNNDISHYNNNNAHDSSSFNHHSYIQTQLHLHQQTNNYLQQQSSQNSQQLYNAYL

HSNPALLHGLVSTSIVDNNNNNGGSSGSYNTAAFLGNHGIGIGSSSTVGSTEEFPTVKTD

YDMPSSDGTGGYSGWTSESVQGSNPGGVFTMWNE* 
>G1795 (27..422)

ACAAACACGCAAAAAGTCATTAATATATGGATCAAGGAGGTCGAGGTGTCGGTGCCGAGC 
ATGGAAAGTACCGGGGAGTTCGGAGACGACCTTGGGGAAAATATGCAGCAGAGATACGAG 
ATTCGAGGAAGCACGGTGAACGTGTGTGGCTTGGAACGTTCGATACGGCAGAGGAAGCGG 
CTAGAGCCTATGACCAAGCTGCTTACTCCATGAGAGGCCAAGCAGCAATCCTTAACTTCC 
CTCATGAGTATAACATGGGGAGTGGTGTCTCTTCTTCCACCGCCATGGCTGGATCTTCCT 
CCGCCTCCGCCTCCGCTTCTTCTTCTTCTAGGCAAGTTTTTGAATTTGAGTACTTGGATG 
ATAGTGTTTTGGAGGAGCTCCTTGAGGAAGGAGAGAAACCTAACAAGGGCAAGAAGAAAT 
GAGCGAGATATAATTCATGATTATTTCTAA 

>G1795 Amino Acid Sequence (domain in AA coordinates: 12-80) 
MDQGGRGVGAEHGKYRGVRRRPWGKYAAEIRDSRKHGERVWLGTFDTAEEAARAYDQAAY 
SMRGQAAILNFPHEYNMGSGVSSSTAMAGSSSASASASSSSRQVFEFEYLDDSVLEELLE
EGEKPNKGKKK*
>G1800 (61..894)

CCATTATCATATCCTeTTCTTCCTTCTTC^ 

ATGGAGAAATCATCCTCAATGAAACAATGGAAGAAGGGTCCTGCTCGGGGTAAAGGCGGT 
CCACAAAACGCTCTTTGTCAGTACCGTC 

GCTGAGATCAGAGAGCCCAAGAAGAGGGCAAGACTTTGGCTTGGCTCTTTCGCTACAGCT 
GAAGAAGCAGCTATGGCTTATGATGAGGCTGCCTTGAAACTCTATGGGCACGACGCATAC 
CTCAACTTACCTCATCTTCAGCGGAATACAA 

AAATCGGTACCTTCAAGGAAGTTTATATCTATGTTTCCTTCATGTGGTATGCTAAACGTG 
AATGCTCAGCCTAGTGTTCACATAATCCAGCAAAGACTAGAAGAACTCAAGAAAACTGGA 
CTTTTATCTCAATCCTATTCTTCTAGTTCTTCCTCCACCGAATCAAAAACTAATACTAGC
TTTCTTGATGAGAAGACCAGCAAGGGAGAAACAGACAATATGTTCGAAGGTGGTGATCAG 
AAGAAACCAGAGATCGACCTGACCGAGTTTCTTCAGCAACTAGGAATCTTGAAGGATGAA 
AATGAAGCAGAACCAAGTGAGGTAGCAGAGTGTCATTCCCCTCCACCATGGAACGAGCAA
GAAGAAACTGGAAGTCCTTTCAGAACTGAGAATTTCAGCTGGGATACCCTGATCGAGATG 
CCAAGAAGTGAAACCACAACTATGCAATTTGACTCCAGCAACTTCGGAAGCTATGATTTT 
GAGGATGATGTATCCTTCCCTTCCATCTGGGACTACTACGGAAGCTTAGATTGAGTAAAA 
GCAATTTAAGGTAGATCAAGATTCAGAAGTACACAAATGGTTTTGGATTTAGTGTAGCGT 
TTTGGAAAAGAGACATAGGTAGTGAGAGTGCAGTCTTTTATTATGCAGCAATAAAGTGAG 

CGCTAAAAAAAAAAAAAAAAAAAAAAA 

>G1800 Amino Acid Sequence (domain in AA coordinates: TBD) 

MEKSSSMKQWKKGPARGKGGPQNALCQYRGVRQRTWGKWVAEIREPKKRARLWLGSFATA

EEAAMAYDEAALKLYGHDAYLNLPHLQRNTRPSLSNSQRFKSVPSRKFISMFPSCGMLNV

NAQPSVHIIQQRLEELKKTGLLSQSYSSSSSSTESKTNTSFLDEKTSKGETDNMFEGGDQ 

KKPEIDLTEFLQQLGILKDENEAEPSEVAECHSPPPWNEQEETGSPFRTENFSWDTLIEM 

PRSETTTMQFDSSNFGSYDFEDDVSFPSIWDYYGSLD* 

>G1806 (1..1356) 

ATGCAGAGCAGCTTCAAAACCGTTCCTTTCACTCCTGATTTCTACTCTCAATCCTCTTAC 
TTCTTCAGAGGAGATAGTTGTCTTGAGGAGTTTCATCAACCAGTCAATGGTTTTCACCAT 
GAAGAAGCTATCGATTTAAGTCCAAATGTCACTATTGCTTCAGCTAACTTACACTACACG 
ACGTTTGATACGGTTATGGATTGTGGTGGTGGTGGTGGTGGCTTGAGGGAGAGACTTGAA 
GGAGGAGAAGAGGAGTGTTTGGACACAGGGCAATTAGTGTACCAGAAAGGGACAAGATTA 
GTAGGAGGAGGAGTAGGAGAAGTGAACAGCAGTTGGTGTGATTCGGTTTCAGCTATGGCT 
GATAACAGTCAACATACTGACACTTCCACAGATATTGATACTGATGACAAGACTCAGTTG 
AATGGAGGTCATCAAGGGATGCTATTGGCTACAAATTGTTCAGATCAATCCAATGTGAAA 
TCTAGTGATCAAAGGACACTTCGTCGACTTGCTCAGAACCGGGAGGCTGCTAGGAAAAGT 
CGGTTGAGGAAAAAGGCCTATGTTCAGCAACTTGAGAATAGTCGAATCAGGCTTGCACAG 
CTAGAGGAAGAGCTCAAAAGAGCTCGCCAACAGGGATCTTTGGTTGAAAGAGGAGTTTCA 
GCGGATCACACGCATTTGGCAGCAGGAAATGGTGTCTTTTCATTTGAATTGGAATATACA 
CGTTGGAAGGAGGAACATCAAAGAATGATCAACGACTTAAGATCGGGTGTGAATTCGCAG 
TTAGGTGACAACGATCTACGCGTTCTAGTGGATGCTGTGATGAGTCACTATGATGAAATA 
TTCAGGCTAAAGGGAATTGGCACTAAAGTTGAAGTCTTTCATATGCTCTCAGGCATGTGG 
AAGACACCTGCCGAGAGATTTTTCATGTGGTTAGGTGGATTTAGATCATCAGAGTTACTT 
AAGATATTGGGGAACCATGTGGATCCATTGACGGACCAGCAGTTGATAGGCATTTGCAAC 
CTTCAGCAATCGTCTCAACAAGCAGAGGATGCATTGTCACAAGGCATGGAAGCTCTACAA 
CAATC^CTTCTCGAGACGCTTTCTTCTGCTTC 

GCAGATTATATGGGTCATATGGCTATGGCTATGGGCAAACTTGGCACTCTTGAAAACTTC 
CTTCGCCAGGCTGATTTATTGAGGCAACAAACTCTGCAACAGCTTCACAGAATTCTCACC 
ACACGACAAGCTGCTCGCGCCTTTTTGGTCATCCACGATTATATTTCTCGGCTTAGAGCA 
CTTAGCTCTCTATGGTTAGCCAGACCTAGAGACTAA 

>G1806 Amino Acid Sequence (domain in AA coordinates 165-225) 

MQSSFKTVPFTPDFYSQSSYFFRGDSCLEEFHQPVNGFHHEEAIDLSPNVTIASANLHYT 

TFDTVMDCGGGGGGLRERLEGGEEECLDTGQLVYQKGTRLVGGGVGEVNSSWCDSVSAMA 

DNSQHTDTSTDIDTDDKTQLNGGHQGMLLATNCSDQSNVKSSDQRTLRRLAQNREAARKS

RLRKKAYVQQLENSRIRLAQLEEELKRARQQGSLVERGVSADHTHLAAGNGVFSFELEYT 

RWKEEHQRMINDLRSGVNSQLGDNDLRVLVDAVMSHYDEIFRLKGIGTKVEVFHMLSGMW

KTPAERFFMWLGGFRSSELLKILGNHVDPLTDQQLIGICNLQQSSQQAEDALSQGMEALQ

QSLLETLSSASMGPNSSAI^ADYMGHMAMAMGKLGTLENFLRQADLLRQQTLQQLHRILT

TRQAARAFLVIHDYISRLRALSSLWLARPRD* 

>G1811 (93.. 827) 

AAAGGAGCATTGGTATCTCAAACAATATTTGCCCTTTCTCTATCTCTCTCTCA 

TTGCCATCTCTTTCTCTCTCCCTCTCT^^ 

TCCACTACCATTCTCT(^TGTGGCAACAAC^ 

TCGTGGAAGAAAAAGAAGCTCTTTTCGAGAAACCCITAACCCCAAGTGACGTCG 

tca^ccgcctcgtcatccca;uu\cagcacgccgagagatacttcccactagcggccgccg 
ccgcagacgccgtggagaaaggacrtctcctcrgctttgaggacgaggaaggtaaaccat 
GGAGATTCAGATACTCGTACTGGAACAGTAGCCAGAGTTATGTCTTGACCAAAGGCTGGA

GCAGATACGTCAAGGAGAAGCACCTTGACGCCGGAGACGTCGTTCTCTTCCATCGACACC 

GTTCAGACGGCGGAAGATTCTTCATTGGCTGGAGAAGACGCGGTGACTCTTCTTCCTCCT 

CCGACTCTTATCGCCATGTTCAATCCAATGCCTCGCTCCAATATTATCCTCATGCAGGGG 

CTCAAGCGGTGGAGAGCCAAAGAGGCAACTCGAAGACATTAAGACTGTTCGGAGTGAACA 

TGGAGTGCCAGCTAGATTCGGACTGGTCCGAGCCATCCACACCTGACGGTTCTAACACAT

ATACAACCAATCACGACCAGTTTCATTTCTACCCTCAACAACAACACTATCCTCCTCCGT 

ACTACATGGACATAAGTTTCACAGGAGATATGAACCGGACGAGCTAGAAGCCCACAAGGA 

TTAAAAAAAAGCTTCACATCTGGTCCTGTTATGTTGTCATAGATGTTGATTCCTTAATTT 

TACACAAGCTTCATTTTGCATTATTTAAAGTAAAATCGTATTTTGATTCTTCTTTAAATC 

TCTCTCAATTTTCACTCTCTTCCTTTTTCTTCTTATGTATTAGATTCTTTTACATAGCTA 

ACACTTGTATAGAGAATTCAAAGTTCTGGCTATTTTCGAAAGTTATCTTTTCTCTTAAAA 
AAAAAAAA

>G1811 Amino Acid Sequence (domain in AA coordinates: TBD) 
MSINQYSSDFHYHSLMWQQQQQQQQHQNDWEEKEALFEKPLTPSDVGICLNRLVIPKQHA 
ERYFPLAAAAADAVEKGLLLCFEDEEGKPWRFRYSYWNSSQSYVLTKGWSRYVKEKHLDA
GDVVLFHRHRSDGGRFFIGWRRRGDSSSSSDSYRHVQSNASLQYYPHAGAQAVESQRGNS

KTLRLFGVNMECQLDSDHSEPSTPDGSNTYTTNHDQFHFYPQQQHYPPPYYMDISFTGDM 
NRTS* 

>G182 (74.. 1366) 

CGTCGACGATCAGATTCTTGCGTATAGCTGTATATATACACCAAGATACACTCATCATCG 

TCATATATAGATTATGTGCAGCGTCTCTGAGCTTCTTGACATGGAAAACTTCCAAGGAGA 

CTTAACCGACGTCGTACGAGGAATCGGAGGCCACGTGTTATCACCGGAGACTCCTCCCTC 

GAACATCTGGCCTCTTCCTCTGTCACATCCAACACCATCACCGTCAGATCTTAACATAAA 

CCCCTTCGGAGATCCCTTTGTGAGCATGGACGATCCACTCCTCCAAGAACTAAACTCCAT 

CACAAACTCCGGCTATTTCTCCACCGTAGGAGATAACAACAACAACATTCACAACAACAA 

TGGTTTCTTGGTTCCAAAGGTATTTGAGGAGGATCATATAAAGAGTCAATGTAGTATCTT 

CCCAAGAATCCGGATCTCGCATAGTAACATCATCCACGATTCTTCTCCGTGTAATTCTCC 

GGCCATGTCGGCTCACGTTGTCGCAGCCGCAGCAGCCGCCTCGCCGAGAGGCATCATCAA 

CGTAGACACAAACAGTCCTAGAAACTGTCTATTGGTTGATGGTACCACGTTCTCCTCGCA 

GATTCAGATATCTTCCCCTCGGAATCTAGGCCTTAAAAGAAGGAAGAGTCAGGCAAAGAA 

GGTGGTGTGTATTCCGGCCCCGGCTGCAATGAACAGCCGATCAAGCGGAGAAGTGGTTCC 

ATCGGATCTATGGGCTTGGCGTAAATACGGTCAAAAACCTATCAAAGGCTCTCCTTTTCC 

AAGGGGTTATTATAGATGCAGCAGCTCAAAAGGTTGTTCAGCAAGAAAGCAAGTCGAAAG 

AAGCCGAACCGATCCAAACATGTTGGTGATTACATATACCTCCGAACATAACCATCCTTG 

GCCCATCCAACGCAACGCTCTCGCCGGCTCCACACGCTCCTCCACCTCCTCCTCATCTAA 

CCCTAATCCTTCCAAACCCTC^VACCGCAAACGTAAACTCCTCATCCATTGGCTCCCAAAA 

CACCATCTACTTGCCTTCCTCCACCACTCCTCCTCCTACCCTCTCATCCTCCGCCATCAA 

AGATGAACGAGGGGACGATATGGAGTTGGAAAACGTAGATGATGATGATGATAACCAGAT 

TGCTCCATACAGACCGGAGCTTCATGATCATCAGCACCAACCAGATGATTTCTTTGCAGA 

TCTTGAAGAGCTAGAAGGAGATTCTCTAAGCATGTTGCTTTCTCATGGCTGTGGCGGCGA 

CGGGAAGGATAAAACGACCGCGTCCGATGGGATCAGCAATTTCTTCGGGTGGTCGGGAGA 

TAATAATTATAATAATTACGACGACCAAGACTCAAGGTCGTTATAGTATAGTGTTAATTA 

CAGGTAAACAAATTATATTAAATTAAGTTGAGCTTGTGAAAATGAAGATCATATGGTCTG 
GTCAGGTTGGGGGC 

>G182 Amino Acid Sequence (conserved domain in AA coordinates: 217-276)

MCSVSELLDMENFQGDLTDVVRGIGGHVLSPETPPSNIWPLPLSHPTPSPSDLNINPFGD

PFVSMDDPLLQELNSITNSGYFSTVGDNNNNIHNNNGFLVPKVFEEDHIKSQCSIFPRIR

ISHSNIIHDSSPCNSPAMSAHVVAAAAAASPRGIINVDTNSPRNCLLVDGTTFSSQIQIS

SPRNLGLKRRKSQAKKVVCIPAPAAMNSRSSGEVVPSDLWAWRKYGQKPIKGSPFPRGYY

RCSSSKGCSARKQVERSRTDPNMLVITYTSEHNHPWPIQRNALAGSTRSSTSSSSNPNPS

KPSTANVNSSSIGSQNTIYLPSSTTPPPTLSSSAIKDERGDDMELENVDDDDDNQIAPYR

PELHDHQHQPDDFFADLEELEGDSLSMLLSHGCGGDGKDKTTASDGISNFFGWSGDNNYN

NYDDQDSRSL*

>G1835 (1..969) 

ATGATTGGAACAAGCTTCCCCGAGGATCTTGATTGTGGCa^CTTCTTTGACAAGATGGAT 
GATCTCATGGACTTTCCCGGTGGAGATATCGATGTCGGTTTCGGCATAGGTGACTCCGAC 
TCTTTCCCTACCATCTGGACCACTCATCACGACACGTGGCCTGCCGCTTCTGATCCTCTC
TTCTCTTCCAACACCAACTCTGATTCATCACCTGAGCTCTATGTTCCGTTTGAGGACATT 
GTTAAGGTGGAAAGACCTCCAAGCTTTGTAGAGGAAACATTGGTTGAGAAGAAGGAAGAT 
TCGTTTTCGACAAACACTGATTCATCATCTTCTCATAGCCAATTCAGGAGCTCAAGTCCA 
GTGTCGGTTCTCGAGAGCAGCTCCTCCTCGTCTCAAACCACCAACACAACCTCCCTTGTT 
CTCCCTGGAAAGCACGGTCGTCCACGCACAAAACGCCCTCGTCCACCTGTCCAGGATAAA 
GATAGAGTCAAAGACAATGTGTGCGGTGGTGACTCGCGCCTCATCATTAGAATACCGAAA 
CAGTTTCTCTCTGATCACAACAAGATGATCAACAAGAAGAAGAAGAAGAAGGCCAAGATT 
ACTTCTTCCTCTTCTTCGTCCGGGATTGATCTTGAAGTCAATGGAAACAACGTCGATTCG 
TATTCTTCAGAGCAATATCCGCTTAGGAAATGTATGCACTGTGAGGTCACCAAGACTCCA 
CAGTGGAGGCTTGGTCCAATGGGTCCAAAGACACTTTGCAATGCGTGCGGTGTACGTTAC 
AAATCAGGGAGGCTTTTCCCGGAGTACCGTCCAGCTGCTAGTCCAACATTTACTCCAGCT 
CTTCACTCAAACTCACACAAGAAAGTGGCTGAAATGAGAAACAAGAGATGCAGTGATGGT 
AGCTACATAACCGAAGAGAATGATCTGCAAGGGCTGATTCCGAACAATGCCTACATTGGC 
GTAGACTAA 

>G1835 Amino Acid Sequence (domain in AA coordinates: 224-296) 

MIGTSFPEDLDCGNFFDNMDDLMDFPGGDIDVGFGIGDSDSFPTIWTTHHDTWPAASDPL 

FSSNTNSDSSPELYVPFEDIVKVERPPSFVEETLVEKKEDSFSTNTDSSSSHSQFRSSSP

VSVLESSSSSSQTTNTTSLVLPGKHGRPRTKRPRPPVQDKDRVKDNVCGGDSRLIIRIPK 

QFLSDHNKMINKKKKKKAKITSSSSSSGIDLEVNGNNVDSYSSEQYPLRKCMHCEVTKTP

QWRLGPMGPKTLCNACGVRYKSGRLFPEYRPAASPTFTPALHSNSHKKVAEMRNKRCSDG

SYITEENDLQGLIPNNAYIGVD* 

>G1836 (47.. 610) 

ATAACAAGCCTAGAACACTAGAAACTTCAAAAAAGAAAAAAATCTTATGGAGAACAACAA 
CGGCAACAACCAGCTGCCACCGAAAGGTAACGAGCAACTGAAGAGTTTCTGGTCAAAAGA 
GATGGAAGGTAACTTAGATTTCAAAAATCACGACCTTCCTATAACTCGTATCAAGAAGAT 
TATGAAGTATGATCCGGATGTGACTATGATAGCTAGTGAGGCTCCAATCCTCCTCTCGAA 
AGCATGTGAGATGTTTATCATGGATCTCACGATGCGTTCGTGGCTCCATGCTCAGGAAAG 
CAAACGAGTCACGCTACAGAAATCTAATGTCGATGCCGCAGTGGCTCAAACTGTTATCTT 
TGATTTCTTGCTTGATGATGACATTGAGGTAAAGAGAGAGTCTGTTGCCGCCGCTGCTGA 
TCCTGTGGCCATGCCACCTATTGACGATGGAGAGCTGCCTCCAGGAATGGTAATTGGAAC
TCCTGTTTGTTGTAGTCTTGGAATCCACCAACCACAACCACAAATGCAGGCATGGCCTGG 
AGCTTGGACCTCGGTGTCTGGTGAGGAGGAAGAAGCGCGTGGGAAAAAAGGAGGTGACGA 
CGGAAACTAATAAGTGGAATACGTTTTAGGGTATTTTCAAGGGAATATGTAGTAAATAGT 
CATGGATC 

>G1836 Amino Acid Sequence (domain in AA coordinates: 30-164) 
MENNNGNNQLPPKGNEQLKSFWSKEMEGNLDFKNHDLPITRIKKIMKYDPDVTMIASEAP
ILLSKACEMFIMDLTMRSWLHAQESKRVTLQKSNVDAAVAQTVIFDFLLDDDIEVKRESV 

AAAADPVAMPPIDDGELPPGMVIGTPVCCSLGIHQPQPQMQAWPGAWTSVSGEEEEARGK 
KGGDDGN* 

>G1838 (132.. 1628) 

TTCCTTGGCATTCTCTTTAGAACTTTCGTACAAAATGCAAAACC 

AAAAAAAAGATTAGAGACTGTAACTGCTTTTATCAGATTTTCAACTAGGAAAAAAGTTAC 

AATCTTTTTTGATGGCTCCTCCAATGACGAATTGCTTAACGTTTTCTCTGTCACCAATGG 

AGATGTTGAAATCAACTGATCAGTCTCACTTCTCTTCTTCTTACGACGATTCT 

CTTATCTCATCGATAACITCTATGCTTTCAAAGAAGAAGCTGAGATAGAAGCTGCTGCTG 

CTTCAATGGCGGATTCAAC^CCTTATCTACTTTTTTCGATCAT 

CAAAGCTGGAAGATTTCCTCGGTGATTCCTTTGTCCGTTACTCTGATAACCAAACAGAGA 
CCCAAGACTCTTCTTCTCTCACTCCATTCTACGATCCACGTCACCGCACCGTTGCCGAAG 
GAGTTACAGGGTTCTTCTCTGATCATCATCAGCCAGATTTCAAGACGATAAACTCGGGAC 
CAGAAATCTTCGATGACTCAACAACTTCCAACATCGGTGGTACTCATCTCTCCAGTCACG 
TGGTGGAGTCATCAACGACGGCGAAGTTAGGGTTTAACGGTGATTGCACCACCACCGGAG 
GAGTTTTGTCTCTAGGGGTTAACAACACATCAGATC 

AGAGAGGTGGAAACAGTAACT^GAAGAAAACAGTTTCTAAGAAGGAAACATCAGATGATT 
CAAAGAAGAAGATTGTCGAAACATTGGGACAAAGAACTTCAATTTATCG 
GACATAGATGGACTGGAAGATACGAAGCGCATCTATGGGATAACAGCTGTAGGAGGGAAG 
GTCAAGCCAGAAAAGGACGTC7U\GTGTACTTAGGTGGATATGACAAGGAAGATAGAGCAG 
CTAGAGCCTATGACTTGGCAGCTTTAAAATACTGGGGTTCTACTGCTACTACAAATTTTC
CGGTCTCGAGTTATTCAAAAGAACTTGAGGAAATGAATCACATGACCAAGCAAGAGTTTA 
TTGCATCTCTTAGGAGGAAAAGTAGCGGTTTTTCGAGAGGAGCTTCAATATATAGAGGTG 
TCACAAGGCATCATCAACAAGGTCGCTGGCAAGCAAGAATCGGCCGTGTCGCAGGAAACA 
AAGATCTTTACCTCGGAACCTTTGCAACCGAAGAGGAAGCAGCAGAGGCTTATGACATTG 
CAGCCATAAAGTTCAGAGGAATCAACGCAGTAACTAACTTTGAGATGAACAGGTATGACA 
TTGAAGCTGTCATGAATAGTTCTTTACCTGTAGGAGGAGCAGCTGCGAAACGCCACAAAC 
TCAAACTCGCTCTTGAATCTCCTTCTTCATCATCCTCTGACCATAACCTCCAACAACAAC 
AGTTGCTTCCGTCCTCTTCTCCCTCGGATCAAAACCCTAACTCAATCCCATGTGGCATTC 
CATTTGAGCCTTCAGTTCTCTATTACCACCAGAACTTCTTTCAGCATTATCCTTTGGTCT 
CTGACTCTACAATTCAAGCTCCTATGAACCAAGCTGAGTTTTTCTTGTGGCCTAACCAGT 
CTTACTAAATCATTTGGTTCGTTCTTGCTTAGACTTCTATTCACCGCACTAACCGATGAC 
CCGAGGCTTATCTTCTTGATTCTGGCTATAAGGATGAATCTTTCAAGTTCCTTTTTTAAC 
TGTAGGCTAAGACAGAAGTAGAGGGGAGAAAAGTTGAAGAATCTGAAACTTTTGGGGTCA 
ATTTTGTATTAATGTTTTTCTTTTGTCAAGGGTGGATTATCGGTTTTATTACTTATTTTT 
TGAATGTAATCGGCCTATAACGGTATAACTCTGTTTCCATTTATGAATATTTTTCTCAAA 
TTGAAAAAAAAAAAAAAAAAA 

>G1838 Amino Acid Sequence (conserved domain in AA coordinates : 229-305 , 330-400) 

MAPPMTNCLTFSLSPMEMLKSTDQSHFSSSYDDSSTPYLIDNFYAFKEEAEIEAAAASMA 

DSTTLSTFFDHSQTQIPKLEDFLGDSFVRYSDNQTETQDSSSLTPFYDPRHRTVAEGVTG 

FFSDHHQPDFKTINSGPEIFDDSTTSNIGGTHLSSHVVESSTTAKLGFNGDCTTTGGVLS

LGVNNTSDQPLSCNWGERGGNSNKKKTVSKKETSDDSKKKIVETLGQRTSIYRGVTRHRW

TGRYEAHLWDNSCRREGQARKGRQVYLGGYDKEDRAARAYDLAALKYWGSTATTNFPVSS

YSKELEEMNHMTKQEFIASLRRKSSGFSRGASIYRGVTRHHQQGRWQARIGRVAGNKDLY

LGTFATEEEAAEAYDIAAIKFRGINAVTNFEMNRYDIEAVMNSSLPVGGAAAKRHKLKLA

LESPSSSSSDHNLQQQQLLPSSSPSDQNPNSIPCGIPFEPSVLYYHQNFFQHYPLVSDST

IQAPMNQAEFFLWPNQSY* 

>G1843 (51.. 653) 

CAGACATCACAATCAAATTAGGTCAGAAGAATTAGTCGGAGAAAACAGCCATGGGAAGAA 
GAAAAGTAGAGATCAAACGAATTGAGAACAAAAGCTCTCGACAAGTTACTTTCTGTAAAC 
GACGAAATGGTCTCATGGAGAAAGCTCGTCAACTCTCAATTCTTTGTGAATCCTCCGTCG 
CTCTTATCATCATCTCTGCCACCGGAAGACTCTACAGCTTCTCCTCAGGTGATAGCATGG 
CCAAGATCCTCAGTCGTTATGAATTAGAACAGGCTGATGATCTTAAAACCTTGGATCTAG 
AAGAAAAAACTCTTAATTATCTTTCGCACAAGGAGTTGCTAGAAACAATCCAATGCAAGA 
TTGAAGAAGCGAAAAGCGATAATGTAAGTATAGATTGTCTAAAGTCCCTGGAAGAGCAGC 
TCAAGACTGCTCTGTCTGTAACTAGAGCTAGGAAGACAGAACTAATGATGGAGCTTGTGA 
AGACCCATCAAGAGAAGGAGAAGCTGCTGAGAGAGGAGAACCAGAGTTTGACTAACCAGC 
TTATAAAGATGGGGAAGATGAAGAAGTCTGTGGAAGCAGAGGATGCAAGAGCAATGTCAC 
CGGAAAGTAGCTCTGACAACAAGCCACCGGAGACTCTCCTGCTTCTCAAGTAACCACCAT 
CACCAACGACTGATTCGAAAAATAAAAATTGTAAAAATTATGATTTGTAGTTCATAAGGA 
AAGCTACATACTGTATGTTAAAAATCCTCTTCTTCCCCCTGCTACGGAAAAGTCATCCAA 
GGAGATGCATCAAATAAAGTAATTGATTTTTATTGTTA 

>G1843 Amino Acid Sequence (domain in AA coordinates: 2-57) 

MGRRKVEIKRIENKSSRQVTFCKRRNGLMEKARQLSILCESSVALIIISATGRLYSFSSG

DSMAKILSRYELEQADDLKTLDLEEKTLNYLSHKELLETIQCKIEEAKSDNVSIDCLKSL

EEQLKTALSVTRARKTELMMELVKTHQEKEKLLREENQSLTNQLIKMGKMKKSVEAEDAR

AMSPESSSDNKPPETLLLLK* 

>G1853 (1. .186^) 

ATGAGAGGTTCTTGGTACAAGAGTGTTTCCTCTGTTTTTGGTCTCAGACC^CGGATCAGA 

GGGTTGTTATTCTTCATTGTTGGTGTTGTGGCTCTAGTTACTATTTTAGCACCATTGACA 

TCTAATTCGTATGATTCTTCGTCAAGTTCGACACTTGTGCCGAACATTTATAGTAACTAT 

AGGAGGATAAAGGAGCAAGCTGCTGTTGATTATCTTGATCTGAGGTCTCTTTCTTTAGGG 

GCTAGTTTAAAAGAGTTTCCTTTTTGTGGTAAAGAAAGAGA 

AACATAACTGGGAATTTGCTTGCTGGGCTTCAAGA 

GAGTTTGAAAGAGAGAAGGAAAGATGTGTAGTTCGTCCTCCGAGAGATTATAAAATACCA 
CTTAGGTGGCCACTTGGTAGAGATATCATATGGAGTGGGAACGTGAAGATTACCAAAGAC 
CAGTTTCTTTCTTCAGGAACTGTGACAACGAGGTTAATGTTGCTTGAAGAGAATCAAATA 
ACCTTTCACTCGGAGGACGGCCTGGTCTTTGATGGGGTCAAAGACTATGCTCGTrAAnTT

GCTGAGATGATAGGTTTAGGAAGTGATACTGAArTTGCTCAtG^ 

TTAGACATTGGTTGCGGATTTGGTAGCTTTGGTGCTCATTTAG^ 

g^cttcctgcaatgattggcaatttcttttc^cagcttcctStcSS 

TTTGATATGGTCCATTGTGCrCAATGTGG(^CTACTTGGGATATCAAAG 
™GGAAGTGGATCGTGTTC T GAAACCCGGGGGA™^^^ 

AACAAAGCACAGGGAAACTTACCAGATACCAAGAAAACGAGCATCTCAACMG^TGMT 

gagttatctaagaaaatctgttggagtctaacagctcagcaggSga^S 



aaagatggagatagcgttccgtattaccacccattggttccatgStSg?gS 



TG 



55S5SS5SSSS5SSS 

a^cgtcgtcccagtcaatgcacgtaatactcttcctatcatacttgatcg^ 
ggtgttctacatgactggtgtgaaccattcccgacatatcctcgaaSa^^^^ 

TTGGAGATGGACCGGATTCTTCGCCCTGAGGGATGGGTTGTTCTAAGTOACAAMTOrpa 



^IKEQAAVDYLDLRSLSLGASLKEFPFCGKERESYVPCTOITG^^GLQEGEE^R^ 



>G1855 (1..1902) 

======= 



GGTGACCGCTOAGArrCCCTGGTGGTGGTACTATG^TCCrCGTGGAS 



Si™™- S5S==2££~ 

^CAGAGGAGGATTTGAAGAAAGAGCAAGAl 



tgtataacaccattaccacsaaacaaacaatccagatgactSgca^ 
ATGAACGCGGAGAAATTTAGAGAAGACAACGAGGTTTGGAAAGAGAGAATAGCACATTAC
AAGAAGATAGTCCCTGAGCTTTCACATGGAAGATTCAGGAACATTATGGACATGAACGCT 
TTTCTCGGCGGATTCGCTGCTTCCATGCTGAAATATCCCTCATGGGTCATGAACGTTGTC 
CCGGTCGATGCAGAGAAACAAACGTTAGGTGTGATCTACGAACGTGGATTGATAGGGACG 
TATCAAGATTGGTGTGAAGGATTCTCAACGTATCCAAGAACTTATGATATGATTCATGCA 
GGAGGATTGTTCAGCTTATACGAACATAGGTGTGATTTGACGTTGATATTGTTGGAGATG 
GATCGAATTTTGAGACCAGAAGGAACAGTTGTGTTGAGAGATAATGTGGAGACGTTGAAT 
AAGGTAGAGAAGATAGTGAAGGGAATGAAGTGGAAGAGTCAAATTGTTGATCATGAGAAA 
GGTCCTTTTAATCCTGAGAAGATTCTTGTTGCTGTTAAAACTTATTGGACTGGTCAACCT 
TCTGACAAGAACAACAACAACAACAACAACAACAACAACTAG 

>G1855 Amino Acid Sequence (domain in AA coordinates: entire protein) 
MAKENSGHHHQTEARRKKLTLILGVSGLCILFYVLGAWQANTVPSSISKLGCETQSNPSS 
SSSSSSSSESAELDFKSHNQIELKETNQTIKYFEPCELSLSEYTPCEDRQRGRRFDRNMM 
KYRERHCPVKDELLYCLIPPPPNYKIPFKWPQSRDYAWYDNIPHKELSVEKAVQNWIQVE 
GDRFRFPGGGTMFPRGADAYIDDIARLIPLTDGGIRTAIDTGCGVASFGAYLLKRDIMAV
SFAPRDTHEAQVQFALERGVPAIIGIMGSRRLPYPARAFDLAHCSRCLIPWFKNDGLYLM
EVDRVLRPGGYWILSGPPIl^KQYWRGWERTEEDLKKEQDSIEDVAKSLCWKKVTEKGDL 
SIWQKPLNHIECKKLKQNNKSPPICSSDNADSAWYKDLETCITPLPETNNPDDSAGGALE 

DWPDRAFAVPPRIIRGTIPEMNAEKFREDNEVWKERIAHYKKIVPELSHGRFRNIMDMNA

FLGGFAASMLKYPSWVMNVVPVDAEKQTLGVIYERGLIGTYQDWCEGFSTYPRTYDMIHA 

GGLFSLYEHRCDLTLILLEMDRILRPEGTVVLRDNVETLNKVEKIVKGMKWKSQIVDHEK 

GPFNPEKILVAVKTYWTGQPSDKNNNNNNNNNN*
>G187 (118.. 1074) 

TAGACCTCTTAGGAAAAAAACCTAAAAACCTAATCCCCAAACCTAAAAGGCTTATCTCAT 

CTCTTCTTCTTTGTCTTCTTTACTCTTTTTTTACCTCTCTCTTCATTGTTCTTCACCATG 

TCTAATGAAACCAGAGATCTCTACAACTACCAATACCCTTCATCGTTTTCGTTGCACGAA 

ATGATGAATCTGCCTACTTCAAATCCATCTTCTTATGGAAACCTCCCATCACAAAACGGT 

TTTAATCCATCTACTTATTCCTTCACCGATTGTCTCCAAAGTTCTCCAGCAGCGTATGAA 

TCTCTACTTCAGAAAACTTTTGGTCTTTCTCCCTCTTCCTCAGAGGTTTTCAATTCTTCG 

ATCGATCAAGAACCGAACCGTGATGTTACTAATGACGTAATCAATGGTGGTGCATGCAAC 

GAGACTGAAACTAGGGTTTCTCCTTCTAATTCTTCCTCTAGTGAGGCTGATCACCCCGGT 

GAAGATTCCGGTAAGAGCCGGAGGAAACGAGAGTTAGTCGGTGAAGAAGATCAAATTTCC 

AAAAAAGTTGGGAAAACGAAAAAGACTGAGGTGAAGAAACAAAGAGAGCCACGAGTCTCG 

TTTATGACTAAAAGTGAAGTTGATCATCTTGAAGATGGTTATAGATGGAGAAAATACGGC 

CAAAAGGCTGTAAAAAATAGCCCTTATCCAAGGAGTTACTATAGATGTACAACACAAAAG 

TGCAACGTGAAGAAACGAGTGGAGAGATCGTTCCAAGATCCAACGGTTGTGATTACAACT 

TACGAGGGTCAACACAACCACCCGATTCCGACTAATCTTCGAGGAAGTTCTGCCGCGGCT 

GCTATGTTCTCCGCAGACCTCATGACTCCAAGAAGCTTTGCACATGATATGTTTAGGACG 

GCAGCTTATACTAACGGCGGTTCTGTGGCGGCGGCTTTGGATTATGGATATGGACAAAGT 

GGTTATGGTAGTGTGAATTCAAACCCTAGTTCTCACCAAGTGTATCATCAAGGGGGTGAG 

TATGAGCTCTTGAGGGAGATTTTTCCTTCAATTTTCTTTAAGCAAGAGCCTTGATCGATC 

ATTGTTATAACTACATATATTATATATATTGAGAGAGAGAGGTAGAGAAAAAAAAA 

>G187 Amino Acid Sequence (domain in AA coordinates: 172-228) 

MSNETRDLYNYQYPSSFSLHEMMNLPTSNPSSYGNLPSQNGFNPSTYSFTDCLQSSPAAY 

ESLLQKTFGLSPSSSEVFNSSIDQEPNRDVTNDVINGGACNETETRVSPSNSSSSEADHP

GEDSGKSRRKRELVGEEDQISKKVGKTKKTEVKKQREPRVSFMTKSEVDHLEDGYRWRKY

GQKAVKNSPYPRSYYRCTTQKCNVKKRVERSFQDPTVVITTYEGQHNHPIPTNLRGSSAA

AAMFSADLMTPRSFAHDMFRTAAYTNGGSVAAALDYGYGQSGYGSVNSNPSSHQVYHQGG 

EYELLREIFPSIFFKQEP* 
>G1881 (1..519)

ATGCGAATTTTGTGTGATGCTTGTGAGAGCGCCGCCGCTATCGTCTTTTGCGCCGCCGAC 
GAAGCTGCCCTCTGTTGCTCCTGCGACGAAAAAGTTCATAAGTGCAACAAGCTGGCTAGT 
CGGCATCTTCGTGTAGGCTTAGCTGATCCGAGTAATGCACCAAGCTGTGACATATGCGAA 
AATGCACCCGCATTCTTTTACTGTGAGATAGATGGTAGTTCCCTTTGTCTACAATGTGAT 
ATGGTGGTACATGTTGGTGGGAAGAGAACACATAGGCGGTTTCTATTACTGAGACAGAGA 
ATTGAGTTTCCAGGCGATAAGCCTAATCATGCTGACCAACTGGGACTACGGTGTCAAAAG 
GCTTCCTCTGGTCGTGGTCAAGAATCAAATGGGAATGGTGATCATGATCATAATATGATC 
GATCTTAACTCCAATCCTCAAAGAGTACACGAGCCTGGATCACATAACCAAGAGGAGGGT
ATTGATGTAAATAACGCAAACAATCACGAGCATGAATAG 

>G1881 Amino Acid Sequence (domain in AA coordinates : 5-28, 56-79) 

MRILCDACESAAAIVFCAADEAALCCSCDEKVHKCNKLASRHLRVGLADPSNAPSCDICE 

NAPAFFYCEIDGSSLCLQCDMVVHVGGKRTHRRFLLLRQRIEFPGDKPNHADQLGLRCQK

ASSGRGQESNGNGDHDHNMIDLNSNPQRVHEPGSHNQEEGIDVNNANNHEHE*

>G1882 (1..1200) 

ATGGTTTTTTCTTCATTTCCTACTTATCCTGATCATTCATCAAACTGGCAACAACAACAT 

CAACCAATCACAACCACCGTTGGATTCACGGGAAATAACATCAACCAACAGTTTCTTCCT 

CACCATCCCCTCCCACCGCAACAGCAACAAACGCCTCCGCAGCTTCACCACAACAACGGT 

AACGGCGGAGTCGCTGTTCCCGGTGGACCTGGCGGGTTAATCCGACCAGGTTCGATGGCG 

GAAAGAGCAAGGCTAGCCAACATACCATTACCTGAAACAGCCTTGAAGTGTCCAAGATGT 

GACTCAACTAACACCAAATTCTGTTACTTCAACAACTACAGTCTCACTCAACCTCGCCAC 

TTCTGCAAAGCATGCCGTCGTTACTGGACACGTGGCGGTGCTCTAAGGAGCGTTCCCGTC 

GGTGGCGGTTGCCGTAGAAACAAAAGAACCAAAAACAGCAGCGGTGGAGGTGGCGGTAGC 

ACCAGTAGCGGTAACAGCAAGTCACAAGACAGCGCCACGAGCAACGACCAATACCACCAC 

CGAGCCATGGCTAACAATCAGATGGGACCACCTTCTTCGTCATCGTCTCTAAGCTCGTTG 

CTGTCTTCTTACAACGCAGGGTTAATCCCCGGACATGATCATAACAGCAATAACAACAAC 

ATACTTGGACTTGGATCATCTTTGCCTCCTCTTAAGCTTATGCCTCCTTTAGACTTCACA 

GACAACTTCACCTTACAATACGGTGCCGTTTCAGCTCCTTCTTATCATATAGGCGGTGGA 

AGCAGTGGAGGAGCGGCGGCTCTTTTAAACGGTTTTGACCAGTGGAGATTCCCGGCAACA 

AACCAACTTCCTTTAGGCGGTTTAGACCCGTTTGATCAACAACATCAAATGGAGCAGCAG 

AATCCAGGTTACGGATTGGTTACCGGGTCGGGTCAGTATCGACCTAAGAACATTTTCCAT 

AACCTTATCTCCTCTTCTTCGTCTGCTTCATCAGCTATGGTTACAGCCACCGCGTCGCAA 

TTAGCTTCAGTGAAAATGGAAGATAGTAACAATCAGCTCAACTTGTCTAGACAACTTTTT 

GGAGACGAACAACAGCTCTGGAATATTCATGGCGCTGCTGCAGCATCCACCGCAGCTGCA 

ACAAGTTCGTGGAGTGAAGTCTCTAATAATTTCAGTTCTTCTTCTACTAGCAATATATAA 

>G1882 Amino Acid Sequence (domain in AA coordinates : 97-125) 

MVFSSFPTYPDHSSNWQQQHQPITTTVGFTGNNINQQFLPHHPLPPQQQQTPPQLHHNNG 

NGGVAVPGGPGGLIRPGSMAERARLANIPLPETALKCPRCDSTNTKFCYFNNYSLTQPRH

FCKACRRYWTRGGALRSVPVGGGCRRNKRTKNSSGGGGGSTSSGNSKSQDSATSNDQYHH 

RAMANNQMGPPSSSSSLSSLLSSYNAGLIPGHDHNSNNNNILGLGSSLPPLKLMPPLDFT

DNFTLQYGAVSAPSYHIGGGSSGGAAALLNGFDQWRFPATNQLPLGGLDPFDQQHQMEQQ 

NPGYGLVTGSGQYRPKNIFHNLISSSSSASSAMVTATASQLASVKMEDSNNQLNLSRQLF

GDEQQLWNIHGAAAASTAAATSSWSEVSNNFSSSSTSNI*

>G1883 (1..1110) 

ATGGACGCTACGAAGTGGACACAGGGTTTTCAAGAAATGATGAACGTTAAACCAATGGAG 
CAGATCATGATTCCTAATAACAACACACATCAACCAAACACCACATCCAATGCAAGGCCA 
AACACCATTCTCACATCTAACGGCGTCTCAACTGCTGGAGCAACCGTCTCCGGCGTAAGC 
AACAACAATAACAATACGGCGGTTGTGGCGGAGAGGAAAGCAAGACCACAAGAGAAACTA 
AATTGTCCAAGATGCAACTCAACCAAGA.CAAAG'^^ 

ACACAACCAAGATACTTCTGCAAAGGTTGTCGAAGGTATTGGACCGAAGGTGGATCTCTT 

AGGAATGTTCCTGTGGGAGGAAGCTCAAGAAAGAACAAGAGATCATCTTCATCTTCTTCA 

TCAAACATCCTTCAGACAATACCATCTTCACTTCCAGATCTAAACCCGCCAATACTCTTC 

TCAAACCAAATCCATAATAAATCGAAAGGGTCATCACAAGATCTCAACTTGTTGTCTTTC 

CCAGTCATGCAAGATCAACATCATCATCATGTCCATATGTCTCAGTTTCTTCAGATGCOT 

AAGATGGAGGGAAATGGTAACATAACTCATCAGCAGCAGCCTTCATCATCTTCTTCTGTC 

TATGGTTCCTCGTCQTCTCCTGTTTCAGCTCTTGAACTTTTAAGAACCGGAGl^AATGTT 

TCTTCAAGATCAGGGATTAACTCATCGTTCATGCCTTCCGGTTCAATGATGGAOT 

ACTGTGCTTTACACTTCTTCAGGGTTTCCAACAATGGTGGATTACAAGCC^ 

TCCTTCTCTACCGATCATCAAGGGCTTGGACACAATAGCAACAATAGGTC 

CATAGTGATCATCACCAACAAGGTAGAGTTTTGTTTCCATTTGGGGATCAAA^ 

CTTTC^TCAAGCATAACAC^GAAGTTGATCATGATGATAATCAACAACAGAAGAGTCAT 

GGAAATAATAATAATAATAATAACTCAAGCCCTAATAATGGATATTGGAGTGGGATGTTC 

AGTACTACAGGAGGAGGATCTTCATGGTGA 

>G1883 Amino Acid Sequence (domain in aa coordinates: 82-124) 
MDATKWTQGFQEMMNVKPMEQIMIPNNNTHQPNTTSNARPNTILTSNGVSTAGATVSGVS
NNNNNTAVVAERKARPQEKLNCPRCNSTNTKFCYYNNYSLTQPRYFCKGCRRYWTEGGSL

RNVPVGGSSRKNKRSSSSSSSNILQTIPSSLPDLNPPILFSNQIHNKSKGSSQDLNLLSF 

PVMQDQHHHHVHMSQFLQMPKMEGNGNITHQQQPSSSSSVYGSSSSPVSALELLRTGVNV 

SSRSGINSSFMPSGSMMDSNTVLYTSSGFPTMVDYKPSNLSFSTDHQGLGHNSNNRSEAL 

HSDHHQQGRVLFPFGDQMKELSSSITQEVDHDDNQQQKSHGNNNNNNNSSPNNGYWSGMF

STTGGGSSW* 

>G1884 (1..741) 

ATGATGACGTCATCCCATCAGAGCAACACCACCGGCTTTAAACCGCGGCGGATCAAGACG 
ACGGCGAAGCCACCACGTCAGATCAATAACAAAGAACCATCTCCGGCGACGCAGCCGGTG 
CTCAAGTGTCCGAGATGTGATTCAGTCAACACCAAATTCTGCTACTACAACAACTACAGC 
TTGTCTCAGCCACGTCACTACTGCAAGAACTGTCGTCGTTACTGGACACGTGGCGGCGCC 
CTCCGTAACGTTCCCATCGGTGGCTCCACTCGAAACAAGAACAAGCCTTGCAGCCTCCAA 
GTCATCTCTTCTCCTCCTTTGTTCTCGAACGGGACGTCATCGGCGTCTCGTGAGCTTGTA 
AGAAACCATCCATCGACGGCAATGATGATGATGAGTTCTGGTGGATTCTCCGGCTATATG 
TTTCCGTTGGATCCTAACTTCAACCTTGCCTCGTCTTCTATCGAGTCTTTGAGTTCTTTT 
AACCAAGATTTGCACCAGAAGCTTCAGCAACAAAGACTCGTCACTTCCATGTTTCTCCAA 
GATTCTCTTCCGGTTAACGAGAAAACGGTTATGTTTCAGAACGTAGAGTTGATTCCTCCT 
TCGACGGTGACGACGGATTGGGTTTTCGATAGGTTCGCCACTGGAGGAGGTGCAACAAGT 
GGCAATCATGAAGATAATGATGATGGGGAGGGTAATTTGGGAAATTGGTTCCATAATGCT 
AATAATAATGCTCTGCTCTAA 

>G1884 Amino Acid Sequence (domain in AA coordinates : 43-71) 
MMTSSHQSNTTGFKPRRIKTTAKPPRQINNKEPSPATQPVLKCPRCDSVNTKFCYYNNYS 
LSQPRHYCKNCRRYWTRGGALRNVPIGGSTRNKNKPCSLQVISSPPLFSNGTSSASRELV 
RNHPSTAMMMMSSGGFSGYMFPLDPNFNLASSSIESLSSFNQDLHQKLQQQRLVTSMFLQ

DSLPVNEKTVMFQNVELIPPSTVTTDWVFDRFATGGGATSGNHEDNDDGEGNLGNWFHNA 
NNNALL* 

>G1891 (1..750) 

ATGGATAACTTGAATGTTTTCGCAAATGAAGACAATCAAGTGAATGATGTGAAGCCCCCA 
CC^CCACCACCTCGAGTGTGTGCAAGGTGTGATTCTGATAATACTAAATTTTGTTATTAC 
AACAACTACTGTGAGTTTCAGCCACGATACTTCTGCAAGAACTGTCGTAGATACTGGACT 
CATGGTGGGGCTTTAAGAAACATACCAATTGGTGGAAGTAGTCGTGCCAAACGGGCAAGG 
GTAAATCAACCTTCGGTTGCTCGGATGGTTTCTGTTGAGACCC^CGAGGTAAC^TCAA 
CCTTTCTCTAATGTTCAAGAAAACGTTCATCTTGTTGGATCTTTTGGTGCTTCATCTTCA 
TCTTCTGTTGGTGCTGTTGGGAACCTTTTTGGTTCTTTGTATGATATTCATGGTGGTATG 
GTAACAAATTTGCATCCAACTCGAACTGTTCGACCAAATCATCGCTTAGCTTTCCATGAT 
GGATCATTTGAGCAAGACTATTACGATGTTGGGTCCGATAATCTTTTGGTCAACCAACAA 
GTTGGTGGCTACGGTTATCACATGAATCCAGTGGATCAATTC7VAGTGGAACCAGAGCTTC 
AACAACACTATGAACATGAATTATAATAACGATAGCACTAGTGGAAGTAGCAGAGGATCT 
GACATGAATGTGAACCATGATAACAAGAAGATCAGATACCGCAACTCTGTGATTATGCAT 
CCTTGTCATCTGGAGAAGGATGGTCCTTGA 

>G1891 Amino Acid Sequence (domain in aa coordinates: 27-69) 
MDNLNVFANEDNQVNDVKPPPPPPRVCARCDSDNTKFCYYNNYCEFQPRYFCKNCRRYWT

HGGALRNIPIGGSSRAKRARVNQPSVARMVSVETQRGNNQPFSNVQENVHLVGSFGASSS

SSVGAVGNLFGSLYDIHGGMVTNLHPTRTVRPNHRLAFHDGSFEQDYYDVGSDNLLVNQQ

VGGYGYHMNPVDQFKWNQSFNNTMNMNYNNDSTSGSSRGSDMNVNHDNKKIRYRNSVIMH

PCHLEKDGP* 

>G1896 (1..951)

ATGTCCTCCCATAe€AATCTCCCCTCTCCCAAACCAGTTCCTAAACCAGATCACCGTATC 
TCCGGTACATCCCAAACCAAGAAACCACCGTCTTCCTCCGTAGCTCAAGACCAAGAA7VAC 
CTAAAATGCCCTCGTTGCAACTCTCCAAACACAAAGTTCTGTTACTACAAC^ACTACAGT 
CTCTCTCAACCTCGTCACTTCTGCAAATCTTGTCGCCGTTACTGGACACGTGGCGGTGCT 
CTAAGAAACGTCCCCATCGGTGGTGGTTGCCGGAAAACCAAAAAATCTATCAAACCTAAT 
TCCTCCATGAACACACTTCCTTCGTCTTCTO 

GAAGATTCATCCAAATTCTTCCCTCCTCCGACAACAATGGATTTTCAGCTGGCCGGATTA 
TCTCTCAACAAAATGAACGATCTTCAACTTTTGAATAAC 

CCCATGATGTCCTCGGGCCGAGAAAACACACCCGTTGATGTCGGGTCGGGTTTATCCCTA 
ATGGGTTTTGGAGATTTCAACAACAACCATTCACCGACGGGGTTCACAACCGCCGGAGCA 
AGCGACGGAAACTTAGCTTCTTCTATAGAGACTTTGAGTTGTTTAAACCAAGATTTACAC

TGGAGGCTTCAGCAACAGAGGATGGCGATGCTTTTTGGTAATTCTAAGGAAGAAACTGTT 

GTCGTCGAGAGGCCACAACCTATTCTTTATCGGAATCTTGAGATCGTAAACTCATCATCG 

CCGTCGTCGCCGACGAAGAAAGGAGATAATCAGACAGAGTGGTATTTTGGTAATAACAGT 

GATAATGAAGGAGTGATTAGTAATAATGCTAATACAGGAGGAGGAGGAAGTGAATGGAAC 

AATGGAATTCAAGCTTGGACTGATCTTAATCATTATAATGCATTGCCTTGA 

>G1896 Amino Acid Sequence (domain in aa coordinates: 43-85) 

MSSHTNLPSPKPVPKPDHRISGTSQTKKPPSSSVAQDQQNLKCPRCNSPNTKFCYYNNYS 

LSQPRHFCKSCRRYWTRGGALRNVPIGGGCRKTKKSIKPNSSMNTLPSSSSSQRFFSSIM 

EDSSKFFPPPTTMDFQLAGLSLNKMNDLQLLNNQEVLDLRPMMSSGRENTPVDVGSGLSL

MGFGDFNNNHSPTGFTTAGASDGNLASSIETLSCLNQDLHWRLQQQRMAMLFGNSKEETV 

VVERPQPILYRNLEIVNSSSPSSPTKKGDNQTEWYFGNNSDNEGVISNNANTGGGGSEWN
NGIQAWTDLNHYNALP* 

>G1898 (1..630) 

ATGCCGTCGGAACCAAACCAAACCCGACCCACCAGAGTTCAGCCCTCAACGGCGGCTTAC 
CCACCGCCAAATCTGGCTGAGCCTCTTCCTTGTCCTCGCTGCAACTCCACCACCACCAAG 
TTCTGTTACTACAACAACTATAACCTCGCTCAGCCTCGCTACTACTGCAAATCTTGCCGC 
CGTTACTGGACTCAAGGTGGTACACTCCGTGACGTCCCCGTCGGTGGTGGAACTCGTCGA 
AGCTCCTCAAAACGTCACCGTTCTTTCTCCACCACTGCCACCTCCTCTTCCTCCTCTTCT 
TCCGTCATCACCACCACGACACAAGAACCAGCCACGACTGAAGCGAGTCAAACTAAGGTT 
ACTAATTTAATTTCAGGTCATGGAAGCTTTGCTTCTCTGTTAGGTTTAGGAAGTGGAAAT 
GGTGGGTTGGATTACGGGTTTGGGTACGGGTACGGGCTTGAGGAGATGAGTATTGGGTAT 
CTTGGAGATTCTTCCGTAGGAGAGATTCCGGTGGTTGATGGTTGTGGTGGTGACACGTGG 
CAGATTGGGGAGATTGAAGGTAAAAGTGGAGGAGACAGTTTGATATGGCCTGGTCTTGAG 
ATCTCAATGCAAACCAACGATGTTAAGTGA 

>G1898 Amino Acid Sequence (domain in AA coordinates: 31-59) 
MPSEPNQTRPTRVQPSTAAYPPPNLAEPLPCPRCNSTTTKFCYYNNYNLAQPRYYCKSCR
RYWTQGGTLRDVPVGGGTRRSSSKRHRSFSTTATSSSSSSSVITTTTQEPATTEASQTKV

TNLISGHGSFASLLGLGSGNGGLDYGFGYGYGLEEMSIGYLGDSSVGEIPVVDGCGGDTW

QIGEIEGKSGGDSLIWPGLEISMQTNDVK* 

>G1902 (1..615) 

ATGCAGGATCCAGCAGCATATTACCAGACGATGATGGCGAAGCAACAACAAGAACAAC^ 

CCACAGTTTGCAGAGCAAGAACAGTTAAAGTGTCCTCGTTGTGACTCACCAAACACTAAA 

TTCTGTTACTACAACAACTACAATCTCTCACAGCCTCGTCACTTTTGCAAAAGCTGTCGT 

CGTTACTGGACTAAAGGCGGCGCTCTCCGTAACGTTCCCGTCGGTGGTGGTTCTCGTAAG 

AACGCAACCAAACGATCCACTTCTTCTTCTTCTTCTGCTTCCTCTCCTTCCAACAGTAGC 

CAAAACAAGAAGACGAAAAACCCGGATCCGGATCCTGATCCACGTAATTCTCAAAAACCG 

GATTTGGATCCGACCCGGATGCTTTACGGGTTTCCGATCGGTGACCAAGACGTGAAGGGT 

ATGGAGATTGGTGGAAGCTTTAGCTCGTTGTTGGCGAATAATATGCAGCTTGGTCTTGGA 

GGAGGAGGGATCATGCTTGACGGGTCGGGTTGGGATCATCCGGGTATGGGTTTGGGTTTG 

AGGAGAACCGAACCGGGTAATAATAATAATAACCCATGGACCGATCTGGCTATGAACAGA 
GCGGAGAAAAACTGA 

>G1902 Amino Acid Sequence (domain in AA coordinates : 31-59) 

MQDPAAYYQTMMAKQQQQQQPQFAEQEQLKCPRCDSPNTKFCYYNNYNLSQPRHFCKSCR

RYWTKGGALRNVPVGGGSRKNATKRSTSSSSSASSPSNSSQNKKTKNPDPDPDPRNSQKP 

DLDPTRMLYGFPIGDQDVKGMEIGGSFSSLLANNMQLGLGGGGIMLDGSGWDHPGMGLGL

RRTEPGNNNNNPWTDLAMNRAEKN*

>G1904 (1..924)

ATGCAAGATATTCATGATTTCTCCATGAACGGAGTTGGTGGTGGGGGAGGAGGAGGAGGG 
AGGTTTTTCGGTGGAGGAATCGGCGGCGGAGGAGGTGGTGATCGAAGGATGAGAGCTCAT 
CAGAACAATATACTTAACCATCATCAATCTCTCAAGTGTCCTCGTTGTAATTCTCTTAAC 
ACAAAGTTCTGTTACTACAACAATTACAATCTTTCTCAGCCTCGTCACTTTTGCAAGAAC
TGTCGTCGTTACTGGACTAAAGGTGGTGTTCTCCGTAACGTTCCCGTCGGAGGTGGTTGC
CGGAAAGCTAAACGTTCGAAAACAAAACAGGTTCCGTCGTCGTCATCAGCCGACAAACCA 
ACGACGACGCAAGATGATCATCACGTGGAGGAGAAATCGAGTACAGGATCTCACTCTAGC 
AGCGAGAGCTCTTCTCTCACCGCTTCTAACTCTACCACCGTCGCCGCCGTCTCCGTCACC 
GCGGCGGCGGAAGTTGCTTCGTCGGTTATTCCAGGTTTTGATATGCCTAATATGAAAATT
TACGGTAACGGGATCGAGTGGTCGACGTTACTTGGACAAGGCTCATCGGCCGGTGGTGTT
TTCTCGGAGATCGGTGGTTTTCCGGCGGTTTCAGCTATTGAAACTACACCGTTTGGATTC 
GGGGGTAAATTCGTAAATCAAGATGATCATCTGAAGTTAGAAGGTGAAACTGTACAGCAG 
CAACAGTTTGGAGATCGAACGGCTCAGGTTGAGTTTCAAGGAAGATCTTCGGATCCGAAT 
ATGGGATTTGAACCGTTGGATTGGGGAAGTGGCGGTGGAGATCAAACACTGTTTGATTTA 
ACCAGTACCGTTGATCATGCATACTGGAGTCAAAGTCAATGGACGTCGTCTGACCAAGAT 
CAGAGTGGTCTCTACCTTCCTTGA 

>G1904 Amino Acid Sequence (domain in aa coordinates: 53-95) 

MQDIHDFSMNGVGGGGGGGGRFFGGGIGGGGGGDRRMRAHQNNILNHHQSLKCPRCNSLN

TKFCYYNNYNLSQPRHFCKNCRRYWTKGGVLRNVPVGGGCRKAKRSKTKQVPSSSSADKP 

TTTQDDHHVEEKSSTGSHSSSESSSLTASNSTTVAAVSVTAAAEVASSVIPGFDMPNMKI 

YGNGIEWSTLLGQGSSAGGVFSEIGGFPAVSAIETTPFGFGGKFVNQDDHLKLEGETVQQ 

QQFGDRTAQVEFQGRSSDPNMGFEPLDWGSGGGDQTLFDLTSTVDHAYWSQSQWTSSDQD 

QSGLYLP* 

>G1906 (1..795) 

ATGGTGGAACGTGCTCGGATCGCAAAAGTCCCATTGCCTGAAGCAGCTCTAAATTGCCCT 
AGATGTGACTCAACCAATACTAAGTTCTGTTACTTCAATAACTATAGCCTTACTCAACCT 
CGCCATTTCTGCAAAACATGTCGTCGCTATTGGACACGTGGCGGTTCCTTGAGGAATGTT 
CCTGTTGGAGGAGGCTTTAGGAGGAACAAGAGAAGCAAATCCAGATCGAAATCTACGGTC 
GTGGTCTCGACTGATAATACTACTAGTACTTCATCACTTACTTCTCGCCCAAGTTACTCA 
AACCCTAGCAAGTTTCATAGCTACGGTCAAATCCCGGAGTTTAATTCCAACTTGCCCATC 
TTGCCTCCTCTCCAAAGCCTTGGAGATTACAATTCAAGCAACACTGGATTAGATTTTGGT 
GGAACTCAAATAAGCAACATGATAAGTGGTATGAGTTCTAGTGGTGGGATCTTGGATGCA 
TGGAGAATACCTCCATCACAACAAGCTCAGCAATTCCCTTTCTTGATCAACACTACCGGA 
TTGGTGCAATCTTCAAACGCGTTATATCCATTACTAGAAGGCGGGGTTAGCGCCACGCAA 
ACAAGAAATGTGAAGGCGGAAGAGAATGATCAGGATCGGGGTAGGGATGGGGATGGAGTG 
AATAACTTATCAAGAAACTTTTTGGGTAATATCAACATAAACTCAGGCAGGAACGAGGAA 
TACACATCATGGGGAGGTAACAGTTCTTGGACCGGTTTCACCTCCAACAACTCAACAGGC 
CATCTCTCATTCTAA 

>G1906 Amino Acid Sequence (domain in AA coordinates : 19-47) 
MVERARIAKVPLPEAALNCPRCDSTNTKFCYFNNYSLTQPRHFCKTCRRYWTRGGSLRNV

PVGGGFRRNKRSKSRSKSTVVVSTDNTTSTSSLTSRPSYSNPSKFHSYGQIPEFNSNLPI

LPPLQSLGDYNSSNTGLDFGGTQISNMISGMSSSGGILDAWRIPPSQQAQQFPFLINTTG

LVQSSNALYPLLEGGVSATQTRNVKAEENDQDRGRDGDGVNNLSRNFLGNININSGRNEE

YTSWGGNSSWTGFTSNNSTGHLSF*

>G1913 (1..744) 

ATGGAGAGAGCAGAGGCCTTGACATCATCGTTTATATGGCGGCCAAACGCAAACGCAAAC 
GCGGAGATCACGCCGAGTTGTCCAAGATGTGGATCCTCTAACACAAAGTTCTGTTACTAC 
AACAACTATAGCCTCACTCAGCCTCGCTACTTCTGCAAAGGCTGCCGCAGATATTGGACC 
AAAGGTGGTTCCCTCCGCAATGTTCCTGTAGGCGGTGGCTGTCGAAAATCCCGCCGCCCC 
AAATCATCTTCTGGTAACAATACTAAAACTAGCCTAACCGCTAATTCTGGCAACCCCGGT 
GGTGGTTCACCAAGCATCGATCTTGCTCTTGTTTACGCCAATTTCTTGAATCCAAAGCCT 
GACGAATCTATACTACAAGAAAATTGCGACTTAGCCACTACGGATTTTTTGGTAGATAAT 
CCTACCGGCACTTCCATGGACCCTTCATGGAGTATGGACATCAATGATGGTCATCATGAT 
CATTATATTAATCCGGTGGAACACATTGTGGAGGAATGTGGTTATAATGGCTTGCCTCCA 
TTTCCTGGTGAAGAGCTTCTCTCTTTAGACACTAATGGTGTTTGGTCTGATGCTTTGTTG 
ATTGGTCATAACCATGTAGACGTTGGCGTGACTCCGGTTCAGGCTGTACACGAACCGGTG 
GTTCATTTCGCTGACGAATCCAATGATTCCACCAATCTCTTGTTTGGAAGTTGGAGCCCT 
TTTGATTTCACTGCCGATGGATGA 

>G1913 Amino Acid Sequence (domain in AA coordinates: 27-55) 

MERAEALTSSFIWRPNANANAEITPSCPRCGSSNTKFCYYNNYSLTQPRYFCKGCRRYWT

KGGSLRNVPVGGGCRKSRRPKSSSGNNTKTSLTANSGNPGGGSPSIDLALVYANFLNPKP

DESILQENCDLATTDFLVDNPTGTSMDPSWSMDINDGHHDHYINPVEHIVEECGYNGLPP

FPGEELLSLDTNGVWSDALLIGHNHVDVGVTPVQAVHEPVVHFADESNDSTNLLFGSWSP

FDFTADG* 

>G1914 (1..945) 

ATGGAGAGATACAAGTGTAGATTTTGCTTCAAGAGCTTCATCAATGGAAGAGCTTTAGGT
GGTCACATGAGATCTCACATGCTTACTCTTTCTGCAGAACGTTGTGTAATAACTGGTGAA
GCAGAAGAAGAAGTAGAGGAACGGCCGAGTCAACTCTGTGACGACGACGACGATACCGAG 
TCCGATGCTTCTTCTTCTTCTGGTGAGTTTGATAATCAAAAGATGAATCGTCTTGATGAT 
GAATTGGAGTTTGATTTCGCTGAAGACGACGACGTTGAAAGTGAAACCGAGTCGTCCAGG 
ATTAACCCAACTCGGCGACGATCTAAACGAACTCGGAAACTTGGATCGTTTGATTTCGAC 
TTTGAGAAGCTAACAACGAGCCAACCCAGTGAGTTAGTGGCCGAGCCAGAGCATCACAGC 
TCAGCTTCTGATACAACAACGGAGGAAGATCTCGCCTTTTGTCTCATTATGCTGTCCAGA 
GACAAATGGAAGCAACAGAAGAAGAAGAAGCAACGTGTAGAAGAAGATGAGACAGATCAT 
GACAGTGAAGATTACAAATCAAGCAAGAGCAGAGGGAGATTCAAGTGTGAGACTTGTGGT 
AAAGTGTTTAAATCGTATCAAGCATTAGGAGGACACAGAGCAAGCCACAAGAAGAACAAG 
GCATGCATGACGAAAACAGAGCAAGTTGAAACAGAGTACGTTCTTGGAGTAAAGGAGAAG 
AAAGTTCATGAATGTCCGATCTGTTTTAGGGTTTTTACTTCAGGGCAAGCACTTGGAGGT 
CATAAGAGATCTCACGGAAGTAACATCGGAGCAGGAAGAGGATTGTCAGTAAGTCAAATT 
GTCCAAATCGAAGAAGAAGTATCAGTGAAACAGAGGATGATTGATCTTAATCTTCCTGCA 
CCTAATGAAGAAGATGAAACTTCTTTGGTGTTTGATGAATGGTGA 

>G1914 Amino Acid Sequence (domain in AA coordinates : 195-216 , 245-266) 

MERYKCRFCFKSFINGRALGGHMRSHMLTLSAERCVITGEAEEEVEERPSQLCDDDDDTE

SDASSSSGEFDNQKMNRLDDELEFDFAEDDDVESETESSRINPTRRRSKRTRKLGSFDFD 

FEKLTTSQPSELVAEPEHHSSASDTTTEEDLAFCLIMLSRDKWKQQKKKKQRVEEDETDH 

DSEDYKSSKSRGRFKCETCGKVFKSYQALGGHRASHKKNKACMTKTEQVETEYVLGVKEK 

KVHECPICFRVFTSGQALGGHKRSHGSNIGAGRGLSVSQIVQIEEEVSVKQRMIDLNLPA 

PNEEDETSLVFDEW* 

>G1925 (1..945) 

ATGGAAGAAAATCTTCCTCCGGGGTTCAGATTTCATCCTACAGACGAGGAGCTCATAACG 

CATTATCTATGTCGGAAAGTCTCCGATATAGGATTCACCGGTAAAGCTGTCGTCGACGTT 

GATCTCAACAAGTGTGAACCTTGGGATTTGCCAGCCAAGGCTTCAATGGGAGAGAAAGAG 

TGGTATTTCTTCAGCCAAAGGGATCGGAAATATCCAACCGGTTTAAGAACAAACCGGGCA 

ACAGAAGCTGGTTACTGGAAAACCACCGGGAAAGATAAAGAAATATACCGAAGTGGAGTG 

TTGGTTGGGATGAAGAAAACCCTAGTTTTCTACAAAGGAAGAGCTCCCAAAGGTGAGAAA 

AGCAATTGGGTTATGCATGAGTACAGGCTTGAGAGCAAACAACCTTTCAACCCCACGAAT 

AAGGAGGAATGGGTAGTGTGTAGGGTTTTCGAAAAGAGCACGGCAGCAAAGAAAGCACAA 

GAACAACAACCTCAATCTTCTCAACCATCTTTTGGATCTCCATGCGATGCAAACTCATCA

ATGGCAAATGAGTTTGAAGATATTGATGAGCTTCCGAATCTGAATTCAAACTCATCAACC 

ATCGATTACAATAATCATATCCATCAATATTCGCAACGCAATGTTTACTCAGAAGACAAC 

ACAACAAGTACGGCTGGTCTCAACATGAACATGAACATGGCTAGTACTAATCTTCAGTCT 

TGGACAACAAGTCTCCTTGGTCCGCCTTTATCTCCAATCAACTCTTTGTTGCTCAAGGCT 

TTCCAAATC^GGAACTCTTATAGTTTCCCAAAAGAGATGATCCCCAGTTTCAATGA 

TCTCTTCAACAAGGAGTCTCCAATATGATCCAAAATGCTTCAAGTTCGTCT 

CCCCAACCGCAAGAGGAAGCGTTTAATATGGACTCCATATGGTGA 

>G1925 Amino Acid Sequence (conserved domain in AA coordinates : 6-150) 

MEENLPPGFRFHPTDEELITHYLCRKVSDIGFTGKAVVDVDLNKCEPWDLPAKASMGEKE

WYFFSQRDRKYPTGLRTNRATEAGYWKTTGKDKEIYRSGVLVGMKKTLVFYKGRAPKGEK

SNWVMHEYRLESKQPFNPTNKEEWVVCRVFEKSTAAKKAQEQQPQSSQPSFGSPCDANSS

MANEFEDIDELPNLNSNSSTIDYNNHIHQYSQRNVYSEDNTTSTAGLNMNMNMASTNLQS

WTTSLLGPPLSPINSLLLKAFQIRNSYSFPKEMIPSFNHSSLQQGVSNMIQNASSSSQVQ
PQPQEEAFNMDSIW*
>G1929 (1..366) 

ATGTGTAGAGGCTTGAATAATGAAGAGAGCAGAAGAAGTGACGGAGGAGGTTGCCGGAGT 
CTCTGCACGAGACCGAGTGTTCCGGTAAGGTGTGAGCTTTGCGACGGAGACGCCTCCGTG 
TTCTGTGAAGCGGACTCGGCGTTCCTCTGTAGAAAATGTGACCGGTGGGTTCATGGAGCG 
AATTTTCTAGCTTGGAGACACGTAAGGCGCGTGCTATGCACTTCTTGTCAGAAACTCACG 
CGCCGGTGCCTCGTCGGAGATCATGACTTCCACGTTGTTTTACCGTCGGTGACGACGGTC 
GGAGAAACC^CCGTGGAGAATAGAAGTGAACAAGATAATCATGAGGTTCCGTTTGTTTTT 
CTCTGA 

>G1929 Amino Acid Sequence (domain in AA coordinates : 31-53) 
MCRGLNNEESRRSDGGGCRSLCTRPSVPVRCELCDGDASVFCEADSAFLCRKCDRWVHGA

NFLAWRHVRRVLCTSCQKLTRRCLVGDHDFHVVLPS
L*

>G1930 (76.. 1077) 

ATTCACATTACTAATCTCTCAAGATTTCACAATTTTCTTGTGATTTTCTCTCAGTTTCTT 
ATTTCGTTTCATAACATGGATGCCATGAGTAGCGTAGACGAGAGCTCTACAACTACAGAT
TCCATTCCGGCGAGAAAGTCATCGTCTCCGGCGAGTTTACTATATAGAATGGGAAGCGGA 
ACAAGCGTGGTACTTGATTCAGAGAACGGTGTCGAAGTCGAAGTCGAAGCCGAATCAAGA 
AAGCTTCCTTCTTCAAGATTCAAAGGTGTTGTTCCTCAACCAAATGGAAGATGGGGAGCT 
CAGATTTACGAGAAACATCAACGCGTGTGGCTTGGTACTTTCAACGAGGAAGACGAAGCA 
GCTCGTGCTTACGACGTCGCGGCTCACCGTTTCCGTGGCCGCGATGCCGTTACTAATTTC 
AAAGACACGACGTTCGAAGAAGAGGTTGAGTTCTTAAACGCGCATTCGAAATCAGAGATC 
GTAGATATGTTGAGAAAACACACTTACAAAGAAGAGTTAGACCAAAGGAAACGTAACCGT 
GACGGTAACGGAAAAGAGACGACGGCGTTTGCTTTGGCTTCGATGGTGGTTATGACGGGG 
TTTAAAACGGCGGAGTTACTGTTTGAGAAAACGGTAACGCCAAGTGACGTCGGGAAACTA 
AACCGTTTAGTTATACCAAAACACCAAGCGGAGAAACATTTTCCGTTACCGTTAGGTAAT 
AATAACGTCTCCGTTAAAGGTATGCTGTTGAATTTCGAAGACGTTAACGGGAAAGTGTGG 
AGGTTCCGTTACTCTTATTGGAATAGTAGTCAAAGTTATGTGTTGACCAAAGGTTGGAGT 
AGATTCGTTAAAGAGAAGAGACTTTGTGCTGGTGATTTGATCAGTTTTAAAAGATCCAAC 
GATCAAGATCAAAAATTCTTTATCGGGTGGAAATCGAAATCCGGGTTGGATCTAGAGACG 
GGTCGGGTTATGAGATTGTTTGGGGTTGATATTTCTTTAAACGCCGTCGTTGTAGTGAAG 
GAAACAACGGAGGTGTTAATGTCGTCGTTAAGGTGTAAGAAGCAACGAGTTTTGTAATAA 
CAATTTAACAACTTGGGAAAGAAAAAAAAGCTTTTTGATTTTAATTTCTCTTCAACGTTA 
ATCTTGCTGAGATTA 

>G1930 Amino Acid Sequence (domain in AA coordinates: 59-124) 

MDAMSSVDESSTTTDSIPARKSSSPASLLYRMGSGTSVVLDSENGVEVEVEAESRKLPSS 

RFKGVVPQPNGRWGAQIYEKHQRVWLGTFNEEDEAARAYDVAAHRFRGRDAVTNFKDTTF

EEEVEFLNAHSKSEIVDMLRKHTYKEELDQRKRNRDGNGKETTAFALASMVVMTGFKTAE

LLFEKTVTPSDVGKLNRLVIPKHQAEKHFPLPLGNNNVSVKGMLLNFEDVNGKVWRFRYS

YWNSSQSYVLTKGWSRFVKEKRLCAGDLISFKRSNDQDQKFFIGWKSKSGLDLETGRVMR

LFGVDISLNAVVVVKETTEVLMSSLRCKKQRVL*

>G195 (51.. 1031) 

TTTTCTTTTTCTTTCTTTTTGGTTTAAGTTTTTTTCTCTTTGTTCTTCGTCATGTCTCATG
AAATC^U^GATCTTAACAACTATCACTACACTTCATCGTATAATCATTACAATATCAACA 
ACCAAAATATGATTAATCTCCCTTACGTTTCTGGTCCATCTGCTTATAATGCAAACATGA 
TCTCATCATCACAAGTAGGTTTTGATCTACCCTCGAAGAACTTGAGTCCTCAAGGAGCCT 
TCGAGTTGGGTTTCGAGCTTTCTCCATCTTCTTCTGACTTTTTTAATCCTTCCCTCGATC 
AAGAGAACGGTTTGTATAATGCTTATAATTATAATAGTAGTCAAAAGAGTCATGAAGTTG 
TCGGTGATGGTTGTGCAACCATTAAGAGTGAAGTTAGGGTTTCAGCATCTCCTTCTTCAA
GTGAGGCCGATCATCATCCAGGAGAAGATTCCGGCAAGATCCGGAAGAAAAGAGAAGTTC 
GCGATGGAGGAGAAGATGATCAACGCTCTCAGAAAGTAGTTAAAACAAAGAAGAAAGAGG 
AGAAGAAAAAAGAGCCACGAGTCTCGTTCATGACTAAGACCGAAGTTGATCATCTCGAAG 
ACGGCTATCGTTGGAGAAAGTATGGCCAAAAAGCAGTCAAAAACAGTCCTTATCCGAGGA 
GTTACTATAGATGCACGACTCAGAAGTGCAACGTGAAGAAGAGAGTGGAGAGATCTTACC 
AAGACCCAACGGTCGTCATCACAACCTACGAGAGTCAACACAACCATCCGATCCCGACCA 
ATCGTCGGACAGCAATGTTCTCTGGAACCACCGCATCTGATTATAACCCATCATCGTCTC 
CAATATTCTCCGATCTCATCATCAATACTCCAAGAAGCTTCTCAAATGATGATCTCTTCC 
GTGTGCCATACGCTAGTGTGAACGTGAACCCTAGTTATCATCAACAGCAACATGGATTTC 
ATCAACAGGAGAGTGAGTTCGAGCTCTTGAAGGAGATGTTTCCTTCGGTTTTCTTCAAAC 
AAGAGCCTTGATGATATAATATAATATAGAAACAATTTTTTTTCTGCTAAGAAATATAGA 
ACAAAACTTGGATGCATAATAAGTGATGATAGTGTTATTTATTTTTTGCATGTATATATT 
ATACATGTTTTGTTAACTAGCTATAGGATATACTGGTAGTAATTAAGCATAAATATGGAG 
CCCTTCGACTTATTAC^TAATTTTTGGTATGGAAAAANTTNGNTACATGCCTGCCTTTT 
NNNTTNNGG 

>G195 Amino Acid Sequence (domain in AA coordinates: 183-239) 
MSHEIKDLNNYHYTSSYNHYNINNQNMINLPYVSGPSAYNANMISSSQVGFDLPSKNLSP

QGAFELGFELSPSSSDFFNPSLDQENGLYNAYNYNSSQKSHEVVGDGCATIKSEVRVSAS
PSSSEADHHPGEDSGKIRKKREVRDGGEDDQRSQKVVKTKKKEEKKKEPRVSFMTKTEVD
HLEDGYRWRKYGQKAVKNSPYPRSYYRCTTQKCNVKKRVERSYQDPTVVITTYESQHNHP
IPTNRRTAMFSGTTASDYNPSSSPIFSDLIINTPRSFSNDDLFRVPYASVNVNPSYHQQQ

HGFHQQESEFELLKEMFPSVFFKQEP* 

>G1954 (196..1440)

ATTTATGACTTCTCAATACAAAAAGCTCCCCTCACTTTTTTAAGTTTTGTCTTCTCTAAT 
CCGTCTTCTTCTACTATCTTGCATGTCTTGCGTCTTTTATATACATCTCTCGTAAACCCT 
AGCAAATCATACAAGGTCAAGAAGCTTGACCTTCATTAGACTTAAGCAGTTTATAATCAA 
CTACCACGAATAGCAATGGATAAAGATTACTCGGCACCAAACTTCTTAGGTGAATCCTCA 
GGCGGTAACGATGATAACAGCTCTGGTATGATAGACTATATGTTCAATAGAAACCTTCAA 
CAACAACAAAAGCAATCGATGCCACAACAGCAGCAACATCAACTCTCTCCTTCCGGATTT 
GGAGCAACACCCTTTGATAAAATGAACTTCTCTGATGTGATGCAGTTTGCGGACTTCGGT 
TCGAAACTTGCGTTGAACCAGACCAGAAACCAAGACGATCAAGAAACCGGGATTGACCCC 
GTTTATTTCTTGAAGTTCCCTGTCTTGAACGACAAAATAGAGGACCATAACCAAACCCAA 
CATCTCATGCCTTCTCATCAGACGTCTCAAGAAGGAGGTGAGTGTGGAGGAAACATAGGC 
AATGTGTTTCTTGAAGAAAAAGAAGATCAAGACGATGACAACGACAACAACTCCGTGCAA 
CTACGTTTTATTGGAGGAGAAGAAGAAGATAGGGAGAACAAGAATGTTACGAAAAAGGAG 
GTGAAGAGCAAGAGGAAGAGAGCTAGAACGAGCAAGACCAGCGAAGAAGTGGAAAGCCAA 
CGGATGACTCATATCGCGGTCGAAAGAAACCGTAGGAAGCAAATGAATGAGCATCTTCGT 
GTCCTTAGATCTCTCATGCCTGGCTCCTACGTTCAAAGGGGAGACCAAGCGTCAATCATA 

ggaggagcaatagagtttgtgagagagctcgagcaactcctacaatgtcttgaatcacag 
aagcgtcgaagaatcttaggagaaaccggtagggacatgacaacgacaacgacttcttct 
tcttctcccataactacggtagcgaaccaagcacaaccgctcattattacgggaaatgta 
accgagctagagggcggaggagggcttcgggaggagactgcggagaacaagtcgtgcttg 
gctgacgtggaggtgaagctgctagggtttgacgccatgatcaagatactttcaagaaga 
aggccgggacagctgattaagactatagctgctttggaggatcttcatctctctattctt 
cacactaacatcactaccatggaacaaaccgtcctctactcctttaatgtcaagataa(^ 
agtgaaacgaggtttacggcagaagacatagcaagttccatccaacagatatttagtttc 
a:tt(^tgc^u^taccaacatatctggaagctctaacctgggaaatattgtgtttacttga 
aaatcatcacacggcgacaactttgtacactggtgaagattacagtacgtaataatctct 
acatattgggttttattctccaagcatttggaagagtgtttaagttaaagggagtgctta 

CTTTATTTTTTTGGGGCTTTTTTC^TGCAATTTAAATTTTAGTGATGATTGTGTCGCTTG 

TAATGTTAGAACTCGTTGTTGTGATTTCTGCTGCTTTGATTTGTAGGTTTTGAACAAGCG 

GTTTAGAATGCTAAACCACTTATTTACTTGAAATAACTTTTTTCACAAAAAAAAAAAAAA 
AAGAAAAAAA 

>G1954 Amino Acid Sequence (domain in AA coordinates : 187-259) 
MDKDYSAPNFLGESSGGNDDNSSGMIDYMFNRNLQQQQKQSMPQQQQHQLSPSGFGATPF 
DKMNFSDVMQFADFGSKLALNQTRNQDDQETGIDPVYFLKFPVLNDKIEDHNQTQHLMPS
HQTSQEGGECGGNIGNVFLEEKEDQDDDNDNNSVQLRFIGGEEEDRENKNVTKKEVKSKR

KRARTSKTSEEVESQRMTHIAVERNRRKQMNEHLRVLRSLMPGSYVQRGDQASIIGGAIE
FVRELEQLLQCLESQKRRRILGETGRDMTTTTTSSSSPITTVANQAQPLIITGNVTELEG 
GGGLREETAENKSCLADVEVKLLGFDAMIKILSRRRPGQLIKTIAALEDLHLSILHTNIT 
TMEQTVLYSFNVKITSETRFTAEDIASSIQQIFSFIHANTNISGSSNLGNIVFT* 
>G1958 (107.. 1336) 

GTACCGTCGACCGATTATCCCCAAGAGGAGAATCCTCATAATCATTTTCTCCGATTCGAT 

TCGTCTTCCTTGGTCCTGGATTGCTTCATGAATTTCTAGGACAACAATGGAGGCTCGTCC 

AGTTCATAGATCAGGTTCGAGAGACCTCACACGCACTTCTTC^TCCCATCTACACAAAA 

ACCTTCACCAGTAGAAGATAGTTTCATGAGATCAGATAACAACAGTCAGTTAATGTCTAG 

ACCATTAGGACAAACCTACCATTTACTTTCATCTAGTAACGGTGGAGCTGTTGGAC^ 

ATGTTCTTCTTCA1?€ATCTGGTTTTGC^^ 

TGAGAAACAACAACACTACACAGGAAGCAGCAGTAATAATGCTGTGCAGACACCAAGCAA 
C^CGATAGTGCTTGGTGTCATGATTCATTC^ 

CAACCCGGCGATTCAAAACAACTGTCAGATTGAGGATGGTGG^ 

TGACATTCAAAAACGAAGTGATTGGCATGAATGGGCTGACCATTTGATCACTGATGATGA 
TCCTTTGATGTCTACTAACTGGAATGATCTCTTGCTTGAAACAAATTCCAATTCAGATTC 
AAAGGACCAGAAGACT^CTGCAAATTCCGCAACCTCAGA 

GTCTGTGGAATTGCGACCTGTTAGCAC^(^TCTTCAAACAGCAATAACGGAACGGGC^ 
GGCACGAATGCGTTGGACGCCAGAGCTT(^CGAGGCTTTTGTTGAGGCTGTCAACAGTCT 
TGGCGGTAGTGAAAGAGCTACrCCTAAAGGGGTACTGAAGATTATGAAAGTTGAAGGCTT
GACTATATATCATGTTAAAAGCCATTTACAGAAATATAGGACAGCTAGATATCGGCCAGA
ACCATCAGAAACTGGTTCGCCAGAAAGGAAGTTGACACCGCTTGAACATATAACATCTCT 
TGATTTGAAAGGTGGGATAGGTATTACAGAGGCTCTACGACTTCAGATGGAAGTACAGAA 
GCAACTCCATGAGCAGCTCGAGATTCAAAGAAACCTGCAACTCCGAATAGAAGAACAAGG 
CAAGTACCTGCAAATGATGTTCGAGAAGCAAAACTCTGGTCTTACCAAAGGGACAGCCTC 
AACATCAGATTCCGCAGCCAAATCTGAACAAGAAGACAAGAAGACTGCTGATTCGAAGGA 
GGTTCCAGAAGAAGAAACCAGGAAATGTGAGGAACTAGAATCTCCACAGCCAAAGCGTCC 
CAAAATCGATAATTGAAAGTATTGGTCTTTTGCTGGATAATCTCGGAGTTTCAGAGTTAA 
CAGTGATAGAGAGAACGAGCTCTTATCTTGAGGTTCTTCAGGACTTCTCTCGCGGCCGCT 

CTAG 

>G1958 Amino Acid Sequence (domain in AA coordinates: 230-278) 

MEARPVHRSGSRDLTRTSSIPSTQKPSPVEDSFMRSDNNSQLMSRPLGQTYHLLSSSNGG 

AVGHICSSSSSGFATNLHYSTMVSHEKQQHYTGSSSNNAVQTPSNNDSAWCHDSLPGGFL 

DFHETNPAIQNNCQIEDGGIAAAFDDIQKRSDWHEWADHLITDDDPLMSTNWNDLLLETN 

SNSDSKDQKTLQIPQPQIVQQQPSPSVELRPVSTTSSNSNNGTGKARMRWTPELHEAFVE 

AVNSLGGSERATPKGVLKIMKVEGLTIYHVKSHLQKYRTARYRPEPSETGSPERKLTPLE 

HITSLDLKGGIGITEALRLQMEVQKQLHEQLEIQRNLQLRIEEQGKYLQMMFEKQNSGLT 

KGTASTSDSAAKSEQEDKKTADSKEVPEEETRKCEELESPQPKRPKIDN* 

>G196 (111.. 1421) 

TCGACATCAGATTTCTCTCACGGATTCCTAATCATTTTTATTATATTTGGATATTTGCTA 
ATTCTTCCCGTGTATAAATCTCATATAAACACGCATCATACATATATATTATGTGCAGCG 
TCTTTGAGTTTCAAGACATGGACAACTTCCAAGGAGATCTAACAGACGTCGTACGAGGAA 
TAGGATCAGGCCACGTGTCACCATCTCCTGGACCACCGGAAGGTCCATCTCCGAGCAGCA 
TGTCTCCGCCGCCAACATCAGATCTCCACGTGGAATTCCCCTCCGCCGCTACTTCTGCCA 
GCTGTCTCGCAAATCCCTTCGGAGACCCGTTCGTAAGCATGAAGGATCCTCTCATCCACC 
TCCCGGCCAGCTACATCTCCGGCGCCGGTGATAATAAAAGCAACAAAAGTTTTGCAATCT 
TTCCAAAGATTTTTGAGGATGATCATATTAAGAGTCAATGCAGTGTCTTCCCAAGAATTA 
AGATCTCGCAAAGTAACAATATCCACGATGCCTCCACGTGTAATTCTCCGGCCATAACCG 
TCTCCTCTGCCGCCGTAGCAGCTTCGCCGTGGGGCATGATCAACGTTAATACCACTAACA 
GTCCAAGAAACTGTTTACTTGTCGATAATAATAACAACACGTCATCATGCTCACAGGTTC 
AGATCTCTTCTTCCCCTCGGAATCTCGGAATTAAGAGAAGGAAGAGCCAGGCAAAGAAAG 
TGGTGTGCATACCGGCTCCAGCCGCTATGAACAGCCGGTCCAGTGGAGAAGTTGTTCCGT 
CTGATCTATGGGCTTGGCGAAAGTACGGTCAAAAACCTATCAAAGGTTCTCCTTATCCAA 
GGGGTTACTACAGATGTAGCAGCTCAAAAGGTTGTTCAGCTAGGAAACAAGTCGAACGTA 
GCCGCACTGATCCAAACATGTTAGTCATTACTTACACCTCTGAGCATAACCACCCATGGC 
CTACTCAACGCAACGCTCTCGCAGGTTCCACTCGTTCCTCTTCCTCCTCCTCTTTAAACC 
CTTCTTCCAAATCCTCAACCGCAGCCGCCACTACTTCTCCCTCATCCAGAGTTTTCCAAA 
ACAACAGCAGCAAAGACGAACCCAATAACTCCAACTTGCCTTCCTCTTCCACTCATCCTC 
CTTTTGACGCCGCCGCAATTAAGGAGGAGAACGTGGAAGAGCGTCAGGAAAAGATGGAGT 
TCGATTATAATGACGTTGAAAATACCTATAGACCGGAGTTGTTGCAAGAGTTTCAACATC 
AGCCGGAGGATTTCTTTGCCGATCTCGACGAGCTTGAGGGAGATTCTTTGACTATGTTGC 
TCTCTCACAGTAGCGGCGGAGGCAACATGGAAAACAA^ 

GTGATTTCTTTGACGACGACGAGTCCTCAAGGTCGTTATAAATATTGTTGTTAATGTATA 
A 

>G196 Amino Acid Sequence (conserved domain in AA coordinates: 223 -283) 
MCSVFEFQDMDNFQGDLTDVVRGIGSGHVSPSPGPPEGPSPSSMSPPPTSDLHVEFPSAA
TSASCLANPFGDPFVSMKDPLIHLPASYISGAGDNKSNKSFAIFPKIFEDDHIKSQCSVF
PRIKISQSNNIHDASTCNSPAITVSSAAVAASPWGMINVNTTNSPRNCLLVDNNNNTSSC

SQVQISSSPRNLGIKRRKSQAKKVVCIPAPAAMNSRSSGEVVPSDLWAWRKYGQKPIKGS

PYPRGYYRCSSSKGCSARKQVERSRTDPNMLVITYTSEHNHPWPTQRNALAGSTRSSSSS

SLNPSSKSSTAAATTSPSSRVFQNNSSKDEPNNSNLPSSSTHPPFDAAAIKEENVEERQE

KMEFDYNDVENTYRPELLQEFQHQPEDFFADLDELEGDSLTMLLSHSSGGGNMEN

DVFSDFFDDDESSRSL* 

>G1965 (1..609) 

ATGGATAACTTCAATGTTGTTGCCAATGAAGACAATCAAGTGAATGATGTGAAGCCTCCA 
CCACCCCCACCGCGAGTGTGTGCAAGATGTGATTCTGATAACACAAAATTTTGTTACTAC
^^ GGGC ^ TAAGAAACGTACC ^ TTGGTGG GAGTAGTCGTGC^^
jGTTCTTTGTCTCATATTCATGGTGGTATGGTAACAAATGTGCATCCAACT^ 

gggtctoataatcttttggtaaaccaacaagttggtggatatgttS 

TATCACATGAATCAAGTGGATCAATACAACTGGAACCAGAGCTTcS^S^^ 

RpZf^^f S ^ TRINQPSVWSVG1 ^^ 
>G1976 (1..1152) 



™*™ GGA ^^^ 



X3CAGTCCGAGGGGAC 

>G2057 (27.. 1289) 



TTATCAAAAAGGCTAAGACTTCCA1TG— ^^^^^CGTTGATTCGC 



FGACGAGCTCGCTGAGCTTCCTCCCTGGAATCCCG 
VCACTATAAAGTCGTTTTTTCCGGTGATTG
CCGCCTCCGAGCCTACTCTGTTCTACGGACAGAGCAATCCGTTAGGGTTTGACACATCGA

GTTGGGAGCAGCAGTCGTCGGAATTCGGAAGGATTCAGAGACTAGTGGCTTGGAACAGCG 

GCGGTGGCGGCGGAGCAACCGATACAGGAAACGGAGGAGGGTTTCTGTTCGCTCCTCCTA 

CTCCTTCAACGACGTCGTTTCAGCCAGTTCTTGGCCAAAGCCAACAGCTTTATTCTCAGA 

GGGGTCCCCTTCAGTCCAGTTACAGTCCCATGATCCGTGCTTGGTTTGATCCTCACCATC 

ATCACCAATCCATCTCCACCGACGATCTCAACCACCACCATCACCTTCCTCCACCGGTTC 

ACCAATCAGCAATCCCCGGAATCGGATTCGCCTCAGGTGAATTCTCTTCGGGTTTTCGCA 

TACCAGCACGGTTTCAGGGCCAAGAAGAGGAGCAGCACGACGGTCTCACTCACAAGCCGT 

CCTCTGCTTCCTCTATTTCTCGCCATTGACAATCGAAACTAATCCTC 

>G2057 Amino Acid Sequence (domain in AA coordinates: TBD) 

MSDDQFHHPPPPSSMRHRSTSDAADGGCGEIVEVQGGHIVRSTGRKDRHSKVCTAKGPRD 

RRWLSAHTAIQFYDVQDRLGFDRPSKAVDWLIKKAKTSIDELAELPPWNPADAIRLAAA 

NAKPRRTTAKTQISPSPPPPQQQQQQQQLQFGVGFNGGGAEHPSNWESSFLPPSMDSDSI 

ADTIKSFFPVIGSSTEAPSNHNLMHNYHHQHPPDLLSRTNSQNQDLRLSLQSFPDGPPSL 

LHHQHHHHTSASASEPTLFYGQSNPLGFDTSSWEQQSSEFGRIQRLVAWNSGGGGGATDT 

GNGGGFLFAPPTPSTTSFQPVLGQSQQLYSQRGPLQSSYSPMIRAWFDPHHHHQSISTDD 

LNHHHHLPPPVHQSAIPGIGFASGEFSSGFRIPARFQGQEEEQHDGLTHKPSSASSISRH 
* 

>G2107 (79.. 624) 

ACCACAAAACAGAGCAACACACAACACAAAGCTTCATTTCAATTCTGTTTCGAGAACCCT 
TTGAGAACCAGATCGGAGATGGAAAACGACGATATCACCGTGGCGGAGATGAAGCCAAAG 
AAGCGTGCTGGACGGAGGATTTTCAAGGAGACACGTCACCCAATCTACAGAGGCGTGCGG 
CGTAGGGACGGCGACAAATGGGTATGCGAAGTCCGTGAACCGATTCATCAGCGTCGAGTC 
TGGCTCGGAACTTATCCGACGGCAGATATGGCCGCACGTGCTCACGACGTGGCGGTTCTT 
GCTCTGCGCGGGAGATCCGCGTGTTTGAATTTCTCCGATTCTGCTTGGAGGTTGCCGGTG 
CCGGCATCCACTGATCCGGACACGATCAGGCGCACGGCGGCCGAAGCAGCGGAGATGTTC 
AGGCCGCCGGAGTTTAGTACAGGAATTACGGTTTTACCCTCAGCCAGTGAGTTTGACACG 
TCGGATGAAGGAGTCGCTGGAATGATGATGAGGCTCGCGGAGGAGCCGTTGATGTCGCCG 
CCAAGATCGTACATTGATATGAATACGAGTGTGTACGTGGACGAAGAAATGTGTTACGAA 
GATTTGTCACTTTGGAGTTACTAAAATACGTATGTGTTAAAAAACCAAAGATCGTATGTG 
TATGTATGCATAATAAATGGGCTTAATGATGGGCATAGATATGATAGGTCCAGCCTATAT 
GTTAAATGTGTTTTATTTTTTGGTTTATCTAGTTTCCTAGGTATTTACCAAATTGTATTA 
GTATAAGTTTTATTAAGAAATAATCAAAAATGTTGTTGCCAAAAAAAAAAAAAAAAAAAA 
AAAAA 

>G2107 Amino Acid Sequence (domain in AA coordinates: TBD) 

MENDDITVAEMKPKKRAGRRIFKETRHPIYRGVRRRDGDKWVCEVREPIHQRRVWLGTYP

TADMAARAHDVAVLALRGRSACLNFSDSAWRLPVPASTDPDTIRRTAAEAAEMFRPPEFS

TGITVLPSASEFDTSDEGVAGMMMRLAEEPLMSPPRSYIDMNTSVYVDEEMCYEDLSLWS

Y* 

>G211 (1..750) 

ATGATGTCATGTGGTGGGAAGAAGCCAGTGTCTAAGAAAACAACGCCGTGTTGCACGAAG 
ATGGGGATGAAGAGAGGACCATGGACGGTGGAGGAAGACGAGATTCTTGTGAGCTTCATT 
AAGAAAGAAGGTGAAGGACGGTGGCGATCGCTTCCTAAGAGAGCTGGTTTACTCAGATGT 
GGAAAGAGCTGTCGTCTACGGTGGATGAACTATCTCCGACCCTCGGTTAAACGTGGAGGA 
ATTACGTCGGACGAGGAAGATCTCATCCTCCGTCTTCACCGCCTCCTCGGCAACAGGTGG 
TCATTGATCGCGGGAAGGATACCGGGAAGGACTGATAATGAAATTAAGAACTATTGGAAC 
ACTCATCTTCGTAAGAAACTTTTAAGGCAAGGAATTGATCCTCAAACCCACAAGCCTCTT 
GATGGAAACAACATCC^TAAACCAGAAGAAGAAGTTT^ 

GAGCCTATTTCTAGTTCTCATACTGATGATACCACTGTTAATGGCGGGGATGGAGATAGC 
AAGAACAGTATCAATGTCTTTGGTGGTGAACACGGCTACGAAGACTTTGGTTTCTGCTAC 

AATATTATCCC^TATCTG^CCTTTGCyVGATGGATGATTGTAAGGATGGGATTGTTGGA 
GCGTCGTCTTCTAGCTTAGGACATGACTAG 

>G211 Amino Acid Sequence (conserved domain in AA coordinates : 24 -137) 
MMSCGGKKPVSKKTTPCCTKMGMKRGPWTVEEDEILVSFIKKEGEGRWRSLPKRAGLLRC
GKSCRLRWMNYLRPSVKRGGITSDEEDLILRLHRLLGNRWSLIAGRIPGRTDNEIKNYWN
THLRKKLLRQGIDPQTHKPLDANNIHKPEEEVSGGQKYPLEPISSSHTDDTTVNGGDGDS
>G2133 (26.. 457)



===== 



>G2134 (36..644)



>G2151 (236.. 1321) 



\GACTTGTAATTTTGGAGTTTTTAATGCATGGA 

:tcgcattctcagtactatcttcaaagac 



mtcatcaccctcaggctaacaatccaggacc 
^catgggagtggtctcctctgcttctgatgc 



GTGC^CCCCOTrAarcn^™ ^^GGCACTGGAACCATTTCTTCAGTCACTCT
TCATAATAACAATAACAAGACCATCAGACAAGAAAAGGAACCAAATGAAGAGGACAACAA

TAGTGAAATGGAGACCACACCGGGTAGTGCAGCTGAACCAGCAGCATCTGCGGGTCAGCA 

GACGCCACAGAACTTCTCTTCTCAGGGAATAAGGGGGTGGCCCGGTTCAGGCTCAGGCTC 

TGGCAGATCACTTGACATTTGCAGAAACCCACTCACTGATTTTGATTTGACTCGTGGATG 

ATATACACTATTAGTCTTTGAAGCAGCAGCATACAAAATGTGATTGCTGTACATATGTTA 

TTGTAGATTTCTCTCTGGGAATGTTGAAATCAGACATTTAAGGATTGATACTAGATCTCT 

CAGCTCCTTCTAACATTGTTAATGTAACAGAACCCTCCCACTTTCATGCTATTTGC 

>G2151 Amino Acid Sequence (domain in AA coordinates: 93-113, 124-144)

MDGREAMAFPGSHSQYYLQRGAFTNLAPSQVASGLHAPPPHTGLRPMSNPNIHHPQANNP 

GPPFSDFGHTIHMGWSSASDADVQPPPPPPPPEEPMVICRKRGRPRKYGEPMVSNKSRDS 

SPMSDPNEPKRARGRPPGTGRKQRLANLGEWMNTSAGLAFAPHVISIGAGEDIAAKVLSF 

SQQRPRALCIMSGTGTISSVTLCKPGSTDRHLTYEGPFEIISFGGSYLVNEEGGSRSRTG 

GLSVSLSRPDGSIIAGGVDMLIAANLVQWACSFVYGARAKTHNNNNKTIRQEKEPNEED 

NWSEMETTPGSAAEPAASAGQQTPQNFSSQGIRGWPGSGSGSGRSLDICRNPLTDFDLTR 
G* 

>G2154 (82.. 1317) 

GCAAAAAGAAAAAATGAAAAAAAATCCCTAACTCTCTCTCTCTAGAAATTCTTATTTTTG 

TGCGTATCTCTCTAAAAAGGAATGGATCCTAACGAAAGCCACCATCACCACCAACAACAA 

CAGCTCCATCACCTCCACCAACAGCAACAGCAACAGCAGCAGCAGCAACGACTCACTTCT 

CCTTACTTCCACCACCAACTACAGCACCATCACCACCTTCCAACCACCGTAGCAACCACC 

GCTTCTACCGGAAACGCCGTTCCATCTTCCAACAATGGGCTTTTCCCTCCGCAGCCTCAG 

CCACAGCACCAGCCTAATGATGGGTCATCTTCTCTCGCGGTGTACCCTCATTCAGTTCCG 

TCCTCGGCTGTGACGGCGCCGATGGAGCCGGTAAAGAGGAAGAGGGGTCGACCAAGAAAG 

TATGTGACGCCGGAACAAGCCCTAGCGGCTAAGAAATTGGCGTCTTCTGCGAGTAGTTCG 

TCTGCTAAACAGAGGCGAGAGCTTGCTGCTGTTACCGGTGGTACGGTATCGACTAATTCC 

GGGTCATCCAAGAAATCTCAGCTTGGTTCTGTCGGGAAAACTGGACAATGTTTTACTCCG 

CATATTGTTAATATAGCTCCTGGCGAGGATGTGGTCCAGAAAATTATGATGTTCGCAAAC 

CAAAGCAAGCATGAACTATGCGTTCTTTCTGCATCAGGCACTATCTCTAATGCATCCTTG 

CGCCAACCGGCTCCATCAGGAGGCAACTTACCATATGAGGGTCAATACGAGATTCTCTCA 

CTATCTGGATCCTATATCCGAACTGAACAAGGTGGTAAATCCGGCGGCCTTAGCGTTTCT 

TTATCTGCTTCAGATGGTCAGATCATCGGTGGAGCGATTGGTAGCCATCTCACAGCTGCT 

GGCCCGGTTCAGGTGATTCTTGGTACGTTTCAGCTTGATAGAAAGAAGGATGCCGCCGGG 

AGTGGTGGGAAAGGGGATGCTTCAAACAGTGGAAGTCGGTTAACTTCTCCTGTAAGCTCT 

GGACAGTTGCTTGGCATGGGTTTCCCTCCTGGTATGGAATCTACGGGAAGAAATCCAATG 

AGGGGAAACGACGAGCAACATGATCATCATCATCATCAAGCCGGTTTGGGTGGACCTCAT 

CATTTCATGATGCAAGCGCCGCAGGGGATACACATGACACATTCCAGGCCATCTGAATGG 

CGCGGAGGAGGCAACAGCGGTCATGATGGCAGAGGCGGTGGCGGGTATGATTTGTCAGGA 

AGGATAGGACATGAGTCGTCGGAGAATGGAGATTACGAGCAGCAAATACCGGATTAGCAG 

AGCTTCCAGGAGAAGTGTGTAGAGTTTAGATCCCAAGTAGAGAAACAGAAGGCGAGCAAA 

GAATCTGAACTGAGAGAGGACTTATTAGACAGAGACTCGTCTGAAGGGTCTTTAATCATA 

GAAAGAAGTTGCTGAGTGATTGCTTTTGTTCTTCTTCTTGGTACGGTGTATTATATTAAC 

TCCACAACCTTTTTTTTATACTTTCAGTAACGATTCTCCTTCACTTTCAATTTCATTCCT 

TTTTTTTATACTCTTTTTCTTTTCTTATAATATTTTTTTTGGTTTTTCT^ 

CTAAAAAAGGAAATGCTCTTTTTGTGAAATATATACACTTCGTTTG 

>G2154 Amino Acid Sequence (domain in AA coordinates : 97-119) 

MDPNESHHHHQQQQLHHLHQQQQQQQQQQRLTSPYFHHQLQHHHHLPTTVATTASTGNAV 

PSSNNGLFPPQPQPQHQPNDGSSSLAVYPHSVPSSAVTAPMEPVKRKRGRPRKYVTPEQA 

LAAKKLASSASSSSKKQRRELAAVTGGTVSTNSGSSKKSQLGSVGKTGQCFTPHIVNIAP

GEDVVQKIMMFANQSKHELCVLSASGTISNASLRQPAPSGGNLPYEGQYEILSLSGSYIR

TEQGGKSGGLSVSLSASDGQIIGGAIGSHLTAAGPVQVILGTFQLDRKKDAAGSGGKGDA 

SNSGSRLTSPVSSGQLLGMGFPPGMESTGRNPMRGNDEQHDHHHHQAGLGGPHHFMMQAP 

QGIHMTHSRPSEWRGGGNSGHDGRGGGGYDLSGRIGHESSENGDYEQQIPD*
>G2157 (306.. 1238) 

TCTTTTGATTTTAACCTTTTTTCAGTAGCAAGCCAAAAAAAAAAAACAGACAAAGAAGTT 
CCTTTTATGATAAAGGTATGATGATAGCAAACAAATGATACCCCCATGTCTTGTGTGTCT 
GCTTCATGCAACATGTTGGTTTGGATTTGGTTAATCTAAAAGTTTAAGATAAGGTTTTCG 
GATTCTCTTCCTGTCTTGTAATAGTTTCTTGTCGGAGAGCCATCAACACaiACTTCAACA
AAAAAAACAAGAAAAAGAAAAAGATTCTCTTTCTCGTTTTATTTCCATTAGAGAAGAAAA

AAAGAATGGCGAATCCTTGGTGGGTAGGGAATGTTGCGATCGGTGGAGTTGAGAGTCCAG 

TGACGTCATCAGCTCCTTCTTTGCACCACAGAAACAGTAACAACAACAACCCACCGACTA 

TGACTCGTTCGGATCCAAGATTGGACCATGACTTCACCACCAACAACAGTGGAAGCCCTA 

ATACCCAGACTCAGAGCCAAGAAGAACAGAACAGCAGAGACGAGCAACCAGCTGTTGAAC 

CCGGATCCGGATCCGGGTCTACGGGTCGTCGTCCTAGAGGTAGACCTCCTGGTTCCAAGA 

ACAAACCAAAGAGTCCAGTTGTTGTTACCAAAGAAAGCCCTAACTCTCTCCAGAGCCATG 

TTCTTGAGATTGCTACGGGAGCTGACGTGGCGGAAAGCTTAAACGCCTTTGCTCGTAGAC 

GCGGCCGGGGCGTTTCGGTGCTGAGCGGTAGTGGTTTGGTTACTAATGTTACTCTGCGTC 

AGCCTGCTGCATCCGGTGGAGTTGTTAGTTTACGTGGTCAGTTTGAGATCTTGTCTATGT 

GTGGGGCTTTTCTTCCTACGTCTGGCTCTCCTGCTGCAGCCGCTGGTTTAACCATTTACT 

TAGCTGGAGCTCAAGGTCAAGTTGTGGGAGGTGGAGTTGCTGGCCCGCTTATTGCCTCTG 

GACCCGTTATTGTGATAGCTGCTACGTTTTGCAATGCCACTTATGAGAGGTTACCGATTG 

AGGAAGAACAACAGCAAGAGCAGCCGCTTCAACTAGAAGATGGGAAGAAGCAGAAAGAAG 

AGAATGATGATAACGAGAGTGGGAATAACGGAAACGAAGGATCGATGCAGCCGCCGATGT 

ATAATATGCCTCCTAATTTTATCCCAAATGGTCATCAAATGGCTCAACACGACGTGTATT 

GGGGTGGTCCTCCGCCTCGTGCTCCTCCTTCGTATTGATTAGTTAGATAGGCGGTGGTTG 

GTGCGTTCTTTTTACTGGAATGATTATATTTTCCATTAGGATGGTTAGGCTTTTGTTTAT 

TAAAGCTATCAAGTTTCTTTTTTTTTTACGGATAATTCGGATGACAATTAGCTAGTGTTT 

GTTTGTTTGTTTTGTGGCGGCTTTTCTGACTTGACTATTTTGATCGCGGATAGCTTTGTA 

TGAAAGTGAATTGATTGTAGAATCGTCTTTTGAATTTTGATGTTGGAAAAAACCAA 

>G2157 Amino Acid Sequence (domain in AA coordinates: 82-102, 164-107) 

MANPWWVGNVAIGGVESPVTSSAPSLHHRNSNNNNPPTMTRSDPRLDHDFTTNNSGSPNT

QTQSQEEQNSRDEQPAVEPGSGSGSTGRRPRGRPPGSKNKPKSPVVVTKESPNSLQSHVL

EIATGADVAESLNAFARRRGRGVSVLSGSGLVTNVTLRQPAASGGVVSLRGQFEILSMCG

AFLPTSGSPAAAAGLTIYLAGAQGQVVGGGVAGPLIASGPVIVIAATFCNATYERLPIEE

EQQQEQPLQLEDGKKQKEENDDNESGNNGNEGSMQPPMYNMPPNFIPNGHQMAQHDVYWG

GPPPRAPPSY* 

>G2181 (1..1005) 

ATGATGCTTGCGGTGGAAGATGTGTTAAGCGAACTCGCCGGAGAAGAAAGGAACGAGAGA 
GGATTGCCACCTGGCTTCCGGTTTCACCCGACGGACGAAGAGCTCATTACCTTCTACTTA 
GCTTCCAAAATCTTCCATGGTGGTCTCTCCGGCATTCACATTTCCGAAGTTGATCTCAAC 
CGCTGTGAACCTTGGGAGCTACCAGAAATGGCGAAGATGGGAGAGAGAGAGTGGTACTTT 
TATAGTCTAAGGGACAGGAAATATCCGACAGGTTTGAGGACTAACAGAGCAACTACTGCT 
GGATACTGGAAAGCTACCGGCAAAGATAAGGAAGTCTTCTCCGGCGGAGGAGGACAGCTT 
GTTGGGATGAAGAAGACGTTGGTGTTCTACAAAGGTAGGGCTCCACGTGGCCTCAAGACT 
AAGTGGGTCATGCATGAGTATCGCCTCGAAAACGACCATTCACACCGCCACACGTGTAAG 
GAGGAATGGGTGATTTGCAGAGTGTTCAATAAAACAGGAGACAGAAAAAATGTTGGATTA 
ATCCATAACCAAATCAGCTACCTTCATAACCATTCACTCTCAACAACACATCATCATCAT 
CATGAAGCCTTACCTTTGCTTATAGAACCTTCCAACAAAACCCTAACCAACTTCCCATCA 
CTACTCTACGATGATCCACACC^UVAACTACAATAATAACAACTTCCTTCATGGATCATCA 
GGCCACAACATCGACGAGCTCAAAGCCTTAATCAACCCTGTCGTCTCTCAGCTCAACGGT 
ATCATCTTTCCTTCAGGGAACT^CAACAACGACGAAGACGACTTCGACTTTAACCTCGGC 
GTGAAAACAGAGCAGTCTTCGAACGGTAACGAAATTGACGTACGAGATTACTTGGAGAAC 
CCTCTGTTTCAGGAAGCGAGTTATGGTCTGTTGGGTTTTTCGTCTTCTCCTGGACCTCTT 
CACATGCTACTAGATTCTCCATGTCCTTTAGGATTCCAGCTGTAG 

>G2181 Amino Acid Sequence (conserved domain in AA coordinates : 22 -169) 
MMLAVEDVLSELAGEERNERGLPPGFRFHPTDEELITFYLASKIFHGGLSGIHISEVDLN
RCEPWELPEMAKMGEREWYFYSLRDRKYPTGLRTNRATTAGYWKATGKDKEVFSGGGGQL
VGMKKTLVFYKGRAPRGLKTKWVMHEYRLENDHSHRHTCKEEWVICRVFNKTGDRKNVGL

IHNQISYLHNHSLSTTHHHHHEALPLLIEPSNKTLTNFPSLLYDD

GHNIDELKALINPVVSQLNGIIFPSGNNNNDEDDFDFNLGVKTEQSSNGNEIDVRDYLEN
PLFQEASYGLLGFSSSPGPLHMLLDSPCPLGFQL*
>G221 (115.. 795) 

CTCTCTTATTCTCTCACTCTTTTTTTTTTATATTCCTCTCTCTCTAAATCTATAAAATAT
ATTTAAAAACTTGATCGTATATAATAAAGTAAATAAAGAATAATAACAAAAAAAATGGAG 
AAAAGAGGAGGAGGAAGTAGTGGAGGTTCGGGATCATCAGCAGAAGCAGAAGTGAGAAAA
MEKRGGGSSGGSGSSAEAEVRKGPWTMEEDLILINYIANHGDGVWNSLAKSAGLKRTGKS

CRLRWLNYLRPDVRRGNITPEEQLIIMELHAKWGNRWSKIAKHLPGRTDNEIKNFCRTRI 

QKYIKQSDVTTTSSVGSHHSSEINDQAASTSSHNVFCTQDQAMETYSPTPTSYOHTNMEP 

NYGNYSAAAVTATVDYPVPMTVDDQTGENYWGMDDIWSSMHLLNGN* 
>G2290 (119.. 982) 



>G2290 Amino Acid Sequence (conserved domain in AA coordinates: 147-[illegible])

TQTTGVKPTTVTSSCS^SAAVSVAVTSIMNNPSATSSSSEDPAENSTASAEKTPPPETOV 
KEKKKAQKRIRQPRFAFMTKSDVDNLEDGYRWRKYGQKAVKNSPFPRSYYRCTNSRCTVK 

kr^s S d D psivhtyegqh CT qtigfpr TO ilta^phsftshhSp^^?5yS 

LLHQLHRDNNAPS PRLPRPTTEDTPAVSTPSEEGLLGD I VPQTMRNP * 
>G2299 (231.. 941) 



GCCAAAATrTTACCAACATTTTTCTCTTCTCATATCAAAGTTTCTCTCTCATTTCTTCAT 

C^CACTTCACTGCCCTGTTTTTTTTCCTCATTTTGAATAGTTCTCAAAOTATATATm 

TCCCCCTGAAGCCTAGCTATTTCTTTTTATTTGCATTAATCTCX3GGATCCGAATC 

AAGCAATCAGAATAATAGACTTGTACGATACTTGTGCCTAAGCTAACACAATGGCAGAGG 

AATACTACAGCCTCCGCTCGGAGAGAGTAACTCAGCTTCTTGTCCCTAACTCGGAGTCTG 

ACTCAGTGAGTGACAAAAGCAAAGCTGAGCAAAGCGAGAAGAAGACTAAACGTGGGAGAG 

ACTCCGGTAAACACCCTGTTTATCGCGGAGTAAGGATGAGGAACTGGGGAAAATGGGTGT 




CGGAGATTCGTGAGCCGAGGAAGAAATCACGTATTTGGCTGGGAACTTTCCCGACGCCGG 
AGATGGCGGCGCGTGCACACGACGTGGCGGCTCTGAGCATTAAAGGAACGGCCGCTATAC 
TAAACTTCCCTGAACTCGCTGACTCATTCCCTCGACCCGTTTCATTAAGCCCTCGAGACA 
TTCAGACAGCAGCTCTTAAAGCAGCTCACATGGAACCGACGACGTCGTTTTCATCTTCCA 
CGTCTTCGTCGTCGTCTTTGTCTTCTACGTCTTCGCTCGAGTCTCTTGTGTTGGTGATGG 
ACCTCTCGAGGACTGAGTCGGAGGAGCTCGGTGAGATTGTGGAGCTTCCAAGTCTCGGGG 
CGAGTTACGACGTCGACTCGGCTAACCTTGGGAACGAGTTTGTCTTCTATGACTCAGTTG 
ACTACTGTTTATATCCGCCGCCGTGGGGACAGTCGTCCGAAGATAACTATGGTCACGGAA 
TTAGCCCTAATTTTGGCCATGGCTTGTCATGGGATCTCTAACAGTTTATTTTGTATCATT 
ACCATAATGTTTTGTTTAAAACAGTTTATTTTGTATCATTGCCATAATGTTTTGTTTAAT 
CACGTTTTTAAAACCCTTTGCTGTTTTTGTTTTTTTTTTGAGTTTTT 

>G2299 Amino Acid Sequence (conserved domain in AA coordinates : 48-115) 

MAEEYYSLRSERVTQLLVPNSESDSVSDKSKAEQSEKKTKRGRDSGKHPVYRGVRMRNWG 

KWVSEIREPRKKSRIWLGTFPTPEMAARAHDVAALSIKGTAAILNFPELADSFPRPVSLS

PRDIQTAALKAAHMEPTTSFSSSTSSSSSLSSTSSLESLVLVMDLSRTESEELGEIVELP 

SLGASYDVDSANLGNEFVFYDSVDYCLYPPPWGQSSEDNYGHGISPNFGHGLSWDL* 

>G2340 (274.. 1275) 

ATACAAAACTCCCTCTTCTCTATCTTCTTCATCTTAAAGAAAAAATAAGAGATATTCGTA
AAGAGAGAACACAAAATTTCAGTTTACGAAAAGCTAGCAAAGTCGAGTATCGAGGAATAA 
CAGAATAAGACGTATCTATCCTTGCCTTAATGTTCTTACCAAAAGATCTAGTCCTTTCTT 
TGTATGATCGATCCATCACAAGCCCACAACAACAACAACTACATCTCTTTCTCTATCTCT 
AGCTTCTATTTTTAATACATTCAAGAATCAAGAATGGTACGGACGCCGTGTTGTAGAGCA 
GAAGGGTTGAAGAAAGGAGCATGGACTCAAGAAGAAGACCAAAAGCTTATCGCCTATGTT 
CAACGACATGGTGAAGGCGGTTGGCGAACCCTTCCGGACAAAGCTGGACTCAAAAGATGT 
GGCAAAAGCTGCAGATTGAGATGGGCGAATTACTTAAGACCTGACATTAAACGTGGAGAG 
TTTAGCCAAGACGAGGAAGATTCCATCATCAACCTCCACGCCATTCATGGCAACAAATGG 
TCGGCCATAGCTCGTAAAATACCAAGAAGAACAGACAATGAGATCAAGAACCATTGGAAC 
ACTCACATCAAGAAATGTCTGGTCAAGAAAGGTATTGATCCGTTGACCCACAAATCCCTT 
CTCGATGGAGCCGGTAAATCATCTGACCATTCCGCGCATCCCGAGAAAAGCAGCGTTCAT 
GACGACAAAGATGATCAGtfy^TTCAAATAACAAAAAGTTGT 

TTTTTGAACAGAGTAGCAAACAGATTCGGTCATAGAATCAACCACAATGTTCTGTCTGAT 

ATTATTGGAAGTAATGGCCTACTTACTAGTCACACTACTCCAACTACAAGTGTTTCAGAA 

GGTGAGAGGTCAACGAGTTCTTCCTCCACACATACCTCTTCGAATCTCCCCATCAACCGT 

AGCATAACCGTTGATGCAACATCTCTATCCTCATCCACGTTCTCTGACTCCCCCGACCCG 

TGTTTATACGAGGAAATAGTCGGTGACATTGAAGATATGACGAGATTTTCATCAAGATGT 

TTGAGTCATGTTTTATCTCATGAAGATTTATTGATGTCCGTTGAGTCTTGTTTGGAGAAT 

ACTTCATTCATGAGGGAAATTACAATGATCTTTCAAGAGGATAAAATCGAGACGACGTCG 

TTTAATGATAGCTACGTGACGCCGATCAATGAAGTTGATGACTCCTGTGAAGGGATTGAC 

AATTATTTTGGATGAGTTATATTGATGATGATGAAAATTTGCATTTGGCATGTAAATC7^ 
TTAGAGTTTGATTTGCTATGGTGTT^ 

AAAAAAAAAAAAAAAAAAAAAAAAAA 

>G2340 Amino Acid Sequence (domain in AA coordinates : 14-120) 
MVRTPCCRAEGLKKGAWTQEEDQKLIAYVQRHGEGGWRTLPDKAGLKRCGKSCRLRWANY
LRPDIKRGEFSQDEEDSIINLHAIHGNKWSAIARKIPRRTDNEIKNHWNTHIKKCLVKKG
IDPLTHKSLLDGAGKSSDHSAHPEKSSVHDDKDDQN^ 

RINHNVLSDIIGSNGLLTSHTTPTTSVSEGERSTSSSSTHTSSNLPINRSITVDATSLSS

STFSDSPDPCLYEEIVGDIEDMTRFSSRCLSHVLSHEDLLMSVESCLENTSFMREITMIF 

QEDKIETTSFNDSYVTPINEVDDSCEGIDNYFG* 

>G2346 (1..1011) 

ATGGAGTTGTTAATGTGTTCGGGTCAGGCCGAGTCAGGTGGTTCTTCTTCCACCGAGTCT 
TCTTCACTCAGTGGTGGACTCAGGTTTGGTC^ 

TCCAGAAGCAAGAACCGGGTCAATACCGTTCGTAAGTCGTCTACCACGGCGAGGTGCCAA 
GTGGAAGGTTGTAGAATGGATCTAAGCAATGTTAAAGCTTATTACTCGAGACACAAAGTT 
TGTTGCATT(^CTCTAAATCATCTAAAGTCATTC 

CAACAATGTAGCAGGTTTCACCAGCTTTCTGAGTTTGACTTGGAGAAAAGAAGTTGTCGC 
AGAAGACTCGCTTGTCATAACGAACGACGAAGAAAACCACAACCCACAACGGCTCTTTTC 
ACTTCTCAITACTCTCGAATCGCTCCATCTCTTTACGGAAACCCCAATGCTGC^WITGATT' 
AAAAGCGTTTTGGGAGATCCTACTGCGTGGTCAACCGCAAGATCAGTGATGCAGCGGCCT

GGACCGTGGCAGATTAATCCAGTTAGGGAAACCCATCCACACATGAATGTTTTATCACAT 

GGAAGCTCAAGCTTTACTACATGTCCAGAGATGATAAACAACAATAGCACAGATTCAAGC 

TGTGCTCTCTCTCTTCTGTCAAACTCATACCCAATTCATCAGCAGCAACTTCAGACACCA 

ACAAATACATGGCGACCATCTTCTGGTTTCGACTCGATGATCTCATTCTCCGATAAGGTT 

ACAATGGCTCAGCCACCGCCCATTTCAACCCATCAGCCGCCCATCTCAACACATCAGCAG 

TACCTCAGCCAAACTTGGGAAGTCATCGCGGGCGAAAAGAGCAATTCACATTATATGTCT 

CCTGTGAGTCAAATCTCGGAGCCAGCAGATTTCCAGATAAGCAATGGCAGTGTGTCGCCC 

TATTCTCCTCCGTCCTTACTATCTCTTGTGTGCTACTTGCGGCCGCTATAG 

>G2346 Amino Acid Sequence (domain in AA coordinates: 59-135) 

MELLMCSGQAESGGSSSTESSSLSGGLRFGQKIYFEDGSGSRSKNRVNTVRKSSTTARCQ 

VEGCRMDLSNVKAYYSRHKVCCIHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCR 

RRLACHNERRRKPQPTTALFTSHYSRIAPSLYGNPNAAMIKSVLGDPTAWSTARSVMQRP 

GPWQINPVRETHPHMNVLSHGSSSFTTCPEMINNNSTDSSCALSLLSNSYPIHQQQLQTP 

TNTWRPSSGFDSMISFSDKVTMAQPPPISTHQPPISTHQQYLSQTWEVIAGEKSNSHYMS 

PVSQISEPADFQISNGSVSPYSPPSLLSLVCYLRPL* 

>G237 (1..852) 

ATGGCGAAGACGAAATATGGAGAGAGACATAGGAAAGGGTTATGGTCACCTGAAGAAGAC 

GAGAAGCTAAGGAGCTTCATCCTCTCTTATGGCCATTCTTGCTGGACCACTGTTCCCATC 

AAAGCTGGGTTACAAAGGAATGGGAAGAGCTGCAGATTAAGATGGATTAATTACCTAAGA 

CCAGGGTTAAAGAGGGATATGATTAGTGCAGAAGAAGAAGAGACTATCTTGACGTTTCAT 

TCTCCCTTGGGTAACAAGTGGTCGCAAATAGCTAAATTCTTACCGGGAAGAACAGACAAT 

GAGATAAAGAACTATTGGCACTCTCATTTGAAAAAGAAATGGCTCAAGTCTCAGAGCTTA 

CAAGATGCAAAATCTATTTCCCCTCCTTCGTCTTCATCATCATCACTTGTTGCTTGTGGA 

GAAAGAAATCCGGAAACCTTGATCTCGAATCACGTGTTCTCCCTCCAGAGACTTCTAGAG 

AACAAATCTTCATCTCCCTCACAAGAAAGCAACGGAAATAACAGCCATCAATGTTCTTCT 

GCTCCTGAGATTCCAAGGCTTTTCTTCTCTGAATGGCTTTCTTCTTCATATCCCCACACC 

GATTATTCCTCTGAGTTTACCGACTCTAAGCACAGTCAAGCTCCAAATGTCGAAGAGACT 

CTCTCAGCTTATGAAGAAATGGGTGATGTTGATCAGTTCCATTACAACGAAATGATGATC 

AACAACAGCAACTGGACTCTTAACGACATTGTGTTTGGTTCCAAATGTAAGAAGCAGGAG 

CATCATATTTATAGAGAGGCTTCAGATTGTAATTCTTCTGCTGAATTCTTTTCTCCACCA 

ACAACGACGTAAATTGCGTTTATTGTAATGTAAATCAAATTTCTAAGGCAAAACCGGAAA 
AAAAAAAAAAAAAAAAAAAA 

>G237 Amino Acid Sequence (domain in AA coordinates: 11-113) 

MAKTKYGERHRKGLWSPEEDEKLRSFILSYGHSCWTTVPIKAGLQRNGKSCRLRWINYLR 

PGLKRDMISAEEEETILTFHSPLGNKWSQIAKFLPGRTDNEIKNYWHSHLKKKWLKSQSL

QDAKSISPPSSSSSSLVACGERNPETLISNHVFSLQRLLENKSSSPSQESNGNNSHQCSS

APEIPRLFFSEWLSSSYPHTDYSSEFTDSKHSQAPNVEETLSAYEEMGDVDQFHYNEMMI 

NNSNWTLNDIVFGSKCKKQEHHIYREASDCNSSAEFFSPPTTT*

>G2373 (48..1199)

GCAAAATCCTCAGATCGTCTTACCTTCTCCGAATCGATCGATTTTTCATGGAGGACGACG 

ACGAGATTCAGTCAATTCCATCTCCGGGAGATTCTTCCCTTTCACCACAAGCTCCTCCTT 

CTCCGCCGATTTTGCCAACAAACGACGTGACGGTGGCCGTCGTGAAGAAACCACAACCGG 

GGCTTTCTTCTCAATCTCCGTCCATGAACGCTTTAGCGTTAGTGGTTCATACTCCTTCTG 

TAACCGGTGGTGGTGGTAGCGGAAACAGAAACGGACGAGGAGGAGGAGGAGGAAGCGGTG 

GTGGTGGAGGAGGAAGAGATGATTGTTGGAGCGAAGAAGCTACAAAGGTTCTAATCGAAG 

CTTGGGGAGATCGATTCTCTGAACCAGGTAAAGGAACTTTGAAGCAACAACATTGGAAAG 

AAGTAGCTGAGATTGTGAACAAGAGTCGTCAATGCAAATACCCTAAAACTGATATTCAGT 

GTAAGAACAGAATTGATACGGTGAAGAAGAAGTATAAGCAAGAGAAAGCTAAGATTGCTT 

CTGGTGATGGACCTAGTAAATGGGTTTTCTTCAAGAAGCTTGAGAGTTTGATTGGTGGTA 

CTACAACATTCATTGCTTCTTCAAAAGCTTCAGAGAAGGCTCCTATGGGAGGAGCTCTTG 

GGAATAGCCG1TCGAGTATGTTTAAACGGCAAACTAAAGGTAATCAGATTGTGCAGCAAC 

AACAAGAGAAGAGAGGCTCTGA1TCGATGCGGTGGCATTTTAGGAAACGTAGTGCTTCT 

AGACTGAGTCTGAGTCTGATCCTGAACCTGAGGCTTCTCCTGAGGAATCTGCTGAGAGTC 

TCCCACCTTTGCAACCGATTCAACCGCTTTCGTTTCATATGCCAAAGCGGTTGAAGGTGG 

ATAAGAGTGGAGGTGGAGGGAGTGGAGTTGGAGATGTGGCGAGGGCGATACTTGGATTTA 

CGGAAGCTTATGAGAAGGCGGAAACTGCTAAGCTTAAGTTAATGGCGGAACTGGAAAAGG
AGAGGATGAAATTTGCTAAAGAGATGGAGTTGCAGAGAATGCAGTTCTTGAAAACTCAAT

TGGAGATAACACAGAACAATCAAGAAGAGGAAGAGAGGAGCAGGCAGCGAGGAGAAAGGA 

GGATCGTTGATGATGATGATGATCGCAATGGCAAGAATAACGGCAATGTAAGTAGCTGAC 

AATTGAACACACAAATGTTCCTATGATATTTGCTATGATAAGCTGGATTTTAGGTTTTGA 
TGG 

>G2373 Amino Acid Sequence (domain in AA coordinates: 290-350)
MEDDDEIQSIPSPGDSSLSPQAPPSPPILPTNDVTVAVVKKPQPGLSSQSPSMNALALVV

HTPSVTGGGGSGNRNGRGGGGGSGGGGGGRDDCWSEEATKVLIEAWGDRFSEPGKGTLKQ 
QHWKEVAEIVNKSRQCKYPKTDIQCKNRIDTVKKKYKQEKAKIASGDGPSKWVFFKKLES

LIGGTTTFIASSKASEKAPMGGALGNSRSSMFKRQTKGNQIVQQQQEKRGSDSMRWHFRK 
RSASETESESDPEPEASPEESAESLPPLQPIQPLSFHMPKRLKVDKSGGGGSGVGDVARA

ILGFTEAYEKAETAKLKLMAELEKERMKFAKEMELQRMQFLKTQLEITQNNQEEEERSRQ 
RGERRIVDDDDDRNGKNNGNVSS*
>G2376 (39..1370)

CACGAGCTTCTGACTCAGATCCGGCGATATCGAATTCCATGGAGGACGATGAAGACATCC 
GATCTCAGGGTTCCGATTCACCTGATCCGTCTTCCTCCCCGCCGGCGGGACGAATCACGG 
TTACGGTGGCTTCGGCAGGTCCGCCTTCTTATTCTCTGACTCCTCCGGGTAATTCGTCGC 
AGAAGGATCCGGATGCGTTGGCTCTGGCGCTGCTTCCGATTCAGGCCAGCGGTGGAGGGA 
ATAACAGCAGTGGGAGACCAACCGGCGGCGGCGGGAGGGAGGATTGTTGGAGCGAAGCAG 
CTACGGCTGTGTTGATTGATGCGTGGGGTGAGAGATACTTGGAGCTTAGCAGAGGGAATC 
TGAAGCAGAAGCACTGGAAAGAGGTGGCTGAGATTGTGAGCAGCAGAGAGGATTACGGTA 
AAATTCCCAAAACTGATATACAGTGTAAGAATAGGATCGATACGGTGAAGAAGAAGTATA 
AACAAGAGAAGGTGAGAATCGCTAACGGCGGTGGCCGTAGCAGATGGGTGTTCTTCGACA 
AGCTTGACCGTCTGATTGGATCAACGGCGAAGATCCCGACGGCAACTTCTGGAGTCAGCG 
GTCCTGTCGGAGGATTGCATAAGATTCCTATGGGTATTCCAATGGGAAGTCGTTCGAATC 
TGTACCATCAGCAAGCTAAGGCTGCAACACCGCCTTTCAATAATCTTGACCGGTTAATTG 
GAGCTACGGCTAGAGTCTCAGCTGCTTCTTTCGGTGGCAGTGGTGGAGGAGGCGGAGGAG 
GATCTGTCAATGTACCTATGGGAATTCCGATGAGTAGCCGTTCAGCTCCGTTTGGACAGC 
AAGGGAGGACTCTGCCACAGCAAGGTAGGACACTGCCACAGCAACAGCAGCAAGGGATGA 
TGGTGAAAAGGTGTAGTGAGTCGAAACGCTGGCGTTTCAGGAAGAGGAACGCTTCTGATT 
CAGACTCGGAATCTGAAGCAGCAATGTCAGATGATTCCGGTGACAGTTTACCACCTCCTC 
CTCTGTCGAAGAGGATGAAGACGGAGGAGAAGAAGAAGCAAGATGGTGATGGAGTGGGGA 
ACAAATGGAGGGAGCTGACTCGGGCAATCATGAGATTCGGTGAAGCTTATGAGCAAACAG 
AGAATGCGAAACTGCAACAGGTGGTTGAGATGGAGAAAGAGAGGATGAAGTTCTTGAAGG 
AGCTTGAGTTGCAGAGAATGCAGTTCTTTGTGAAGACTCAATTGGAGATATCACAACTTA 
AGCAGCAACATGGGAGGAGAATGGGAAACACCAGTAATGATCATCATCACAGCCGCAAGA 

ACAACATCAATGCGATTGTCAACAACAACAACGATTTGGGTAATAACTAGAATTTAGTGA 
TGCAGTGTCGTAATTGATATATTTTAGATTTGAG 

>G2376 Amino Acid Sequence (domain in AA coordinates: 79-178, 336-408)

MEDDEDIRSQGSDSPDPSSSPPAGRITVTVASAGPPSYSLTPPGNSSQKDPDALALALLP 

IQASGGGNNSSGRPTGGGGREDCWSEAATAVLIDAWGERYLELSRGNLKQKHWKEVAEIV 

SSREDYGKIPKTDIQCKNRIDTVKKKYKQEKVRIANGGGRSRWVFFDKLDRLIGSTAKIP 

TATSGVSGPVGGLHKIPMGIPMGSRSNLYHQQAKAATPPFNNLDRLIGATARVSAASFGG

SGGGGGGGSVNVPMGIPMSSRSAPFGQQGRTLPQQGRTLPQQQQQGMMVKRCSESKRWRF

RKRNASDSDSESEAAMSDDSGDSLPPPPLSKRMKTEEKKKQDGDGVGNKWRELTRAIMRF 

GEAYEQTENAKLQQVVEMEKERMKFLKELELQRMQFFVKTQLEISQLKQQHGRRMGNTSN
DHHHSRKNNINAIVNNNNDLGNN* 

>G24 (194..724)

CGGACGCGTGGGCAAATATTAAAATAAAAAGTGTCGGTGAATTCTCAATCTTTGTCTTCT 

TTCGTCGTCTCTTTAAAACTCCTCCGTCCCTCCTTATTATGTAACCGTCTCGCCGTCAAA 

TTTTGAAAATCTCTCCCTCCGTTCATAAACCCAGATCGAAATTTATGGTTTTGTAATTTO 

TTTACCGGCGGTTATGGAGACGGAAGCGGCGGTGACAGCGACGGTTACGGCGGCGACGAT 

GGGGATTGGGACGAGGAAGAGAGATCTGAAACCGTATAAAGGAATACGAATGAGGAAATG 

GGGGAAATGGGTGGCGGAGATACGGGAACCGAATAAGAGATCAAGGATCTGGTTAGGTTC 

TTATGCGACGCCTGAAGCGGCGGCGAGAGCTTACGACACTGCT

TCCTTCAGCGAGGCTTAATTITCCGGAGCTTTTGGCTGG 

AGGAAGAGGTGGTGATTTATCGGCGGCGTATATTAGGAGAAAAGCGGCGGAGGTTGGTGC 
TCAGGTTGATGCGCTTGGAGCGACGGTGGTTGTGAATACCGGCGGCGAGAATCGCGGTGA

TTACGAGAAGATTGAGAATTGTCGTAAGAGCGGTAACGGGTCATTGGAACGGGTCGATTT 

GAATAAATTACCCGACCCGGAAAATTCGGATGGTGATGATGACGAATGTGTGAAAAGAAG 

ATAGAAAAAATAAAAAGTAGTTGTAGAAGGAGAGACGAGAATGTTTGTCTTTAAGATGCG 

CTGTTGCCGCTAACATGCGCTTTCGATTTTAGTGTTAAACATGCGCCTCCATTGTTTTTG 

GGTTTTGTTTTCGTCGTCGATAATCAAAGATTTTAAAACACAATTCTCAAATTTTTCACT 

TGTTACAAACTAGATTTGCATGATCTTTGTATTAACGAATAACGATTAAGTCCTAAA 

>G24 Amino Acid Sequence (domain in AA coordinates: 25-93) 

METEAAVTATVTAATMGIGTRKKDLKPYKGIRMRKWGKWVAEIREPNKRSRIWLGSYATP

EAAARAYDTAVFYLRGPSARLNFPELLAGLTVSNGGGRGGDLSAAYIRRKAAEVGAQVDA 

LGATVVVNTGGENRGDYEKIENCRKSGNGSLERVDLNKLPDPENSDGDDDECVKRR*

>G2424 (1..999) 

ATGAGGATGGAGATGGTGCATGCTGACGTGGCGTCTCTCTCCATAACACCTTGCTTCCCG 
TCTTCTTTGTCTTCGTCCTCACATCATCACTATAACCAACAACAACATTGTATCATGTCG 
GAAGATCAACACCATTCGATGGATCAGACCACTTCATCGGACTACTTCTCTTTAAATATC 
GACAATGCTCAACATCTCCGTAGCTACTACACAAGTCATAGAGAAGAAGACATGAACCCT 
AATCTAAGTGATTACAGTAATTGCAACAAGAAAGACACAACAGTCTATAGAAGCTGTGGA 
CACTCGTCAAAAGCTTCGGTGTCTAGAGGACATTGGAGACCAGCTGAAGATACTAAGCTC 
AAAGAACTAGTCGCCGTCTACGGTCCACAAAACTGGAACCTCATAGCTGAGAAGCTCCAA 
GGAAGATCCGGGAAAAGCTGTAGGCTTCGATGGTTTAACCAACTAGACCCAAGGATAAAT 
AGAAGAGCCTTCACTGAGGAAGAAGAAGAGAGGCTAATGCAAGCTCATAGGCTTTATGGT 
AACAAATGGGCGATGATAGCGAGGCTTTTCCCTGGTAGGACTGATAATTCTGTGAAGAAC 
CATTGGCATGTTATAATGGCTCGCAAGTTTAGGGAACAATCTTCTTCTTACCGTAGGAGG 
AAGACGATGGTTTCTCTTAAGCCACTCATTAAC^ 

GACCCTACCCGGTTAGCTTTGACCCACCTTGCTAGTAGTGACCATAAGCAGCTTATGTTA 
CCAGTTCCTTGCTTCCCAGGTTATGATCATGAAAATGAGAGTCCATTAATGGTGGATATG 
TTCGAAACCCAAATGATGGTTGGCGATTACATTGCATGGACACAAGAGGCAACTACATTC 
GATTTCTTAAACCAAACCGGGAAGAGTGAGATATTTGAAAGAATCAATGAGGAGAAGAAA 
CCACCATTTTTCGATTTTCTTGGGTTGGGGACGGTGTGA 

>G2424 Amino Acid Sequence (conserved domain in AA coordinates: 107-219)

MRMEMVHADVASLSITPCFPSSLSSSSHHHYNQQQHCIMSEDQHHSMDQTTSSDYFSLNI 

DNAQHLRSYYTSHREEDMNPNLSDYSNCNKKDTTVYRSCGHSSKASVSRGHWRPAEDTKL

KELVAVYGPQNWNLIAEKLQGRSGKSCRLRWFNQLDPRINRRAFTEEEEERLMQAHRLYG 

NKWAMIARLFPGRTDNSVKNHWHVIMAR

DPTRLALTHLASSDHKQLMLPVPCFPGYDHENESPLMVDMFETQMMVGDYIAWTQEATTF

DFLNQTGKSEIFERINEEKKPPFFDFLGLGTV* 

>G2505 (1..1026) 

ATGGGTTCTTCGTCGAACGGAGGAGTGCCACCTGGTTTCCGGTTTCATCCGACGGACGAA 

GAGCTTCTCCATTACTACTTGAAGAAGAAAATCTCTTACCAAAAGTTTGAGATGGAAGTC 

ATCAGAGAGGTTGACTTAAACAAGCTTGAGCCTTGGGATTTGCAAGAGAGATGCAAGATA 

GGATCAACACCACAAAACGAATGGTACTTCTTCAGCCACAAGGACAGGAAATATCCGACG 

GGGTCAAGGACCAACCGTGCTACTCATGCAGGGTTCTGGAAGGCGACGGGACGTGACAAG 

TGCATAAGGAACTCTTACAAAAAGATAGGAATGAGGAAGACACTTGTGTTCTACAAAGGT 

AGAGCTCCTCATGGCCAAAAGACTGATTGGATCATGCATGAGTACCGTCTTGAAGACGCT 

GATGATCCTCAAGCCAACCCTAGTGAAGATGGATGGGTGGTATGTAGAGTGTTTATGAAG 

AAAAATTTGTTCAAGGTAGTAAATGAAGGTAGCTCAAGCATTAACTCATTGGACCAACAC 

AACCATGACGCATCTAACAACAACCATGCACTTCAAGCTCGTAGCTTTATGCACGGAGAC 

AGTCCATACCAGCTAGTACGTAACCACGGAGCCATGACATTCGAACTTAACAAGCCTGAC 

CTTGCTCTTCATCAATACCC^CCAATCTTCCACAAGCCACCTTCACTTGGATTTGACTAC 

TCTTCAGGACTTGCAAGGGACAGTGAGAGTGCGGCTAGTGAAGGGTTACAATACCAGCAA 

GCGTGTGAGCCGGGTTTAGACGTTGGTACATGTGAGACAGTGGCTAGTCATAATCATCAA 

CAAGGTCTAGGTGAATGGGCAATGATGGATAGACTTGTGACTTGTCACATGGGAAATGAA 

GATTCCTCTAGAGGGATTACGTATGAGGATGGTAACAACAATTCGTCCTCTGTGGTTCA 

CCAGTTCCCGCGACGAACCAGCTAACATTGCGTAGTGAGATGGATTTCTGGGGTTATTCT 

AAATAG 

>G2505 Amino Acid Sequence (domain in AA coordinates: 10-159) 
MGSSSNGGVPPGFRFHPTDEELLHYYLKKKISYQKFEMEVIREVDLNKLEPWDLQERCKI
GSTPQNEWYFFSHKDRKYPTGSRTNRATHAGFWKATGRDKCIRNSYKKIGMRKTLVFYKG

RAPHGQKTDWIMHEYRLEDADDPQANPSEDGWVVCRVFMKKNLFKVVNEGSSSINSLDQH

NHDASNNNHALQARSFMHRDSPYQLVRNHGAMTFELNKPDLALHQYPPIFHKPPSLGFDY

SSGLARDSESAASEGLQYQQACEPGLDVGTCETVASHNHQQGLGEWAMMDRLVTCHMGNE 

DSSRGITYEDGNNNSSSVVQPVPATNQLTLRSEMDFWGYSK*

>G2512 (64..798)

AACTTAGTGCCACTTAGACACAATAAGAAAACCGTTAACAAGAAGAAAAAAAAAAGATCG 
AAAATGGAATATCAAACTAACTTCTTAAGTGGAGAGTTTTCCCCGGAGAACTCTTCTTCA 
AGCTCATGGAGCTCACAAGAATCATTCTTGTGGGAAGAGAGTTTCTTACATCAATCATTT 
GACCAATCCTTCCTTTTATCTAGCCCTACTGATAACTACTGTGATGACTTCTTTGCATTT 
GAATCATCAATCATAAAAGAAGAAGGAAAAGAAGCCACCGTGGCGGCCGAGGAGGAGGAG 
AAGTCATACAGAGGAGTGAGGAAACGGCCGTGGGGGAAATTCGCGGCCGAGATAAGAGAC 
TCAACGAGGAAAGGGATAAGAGTGTGGCTTGGGACATTCGACACCGCGGAGGCGGCGGCT 
CTCGCTTATGATCAGGCGGCTTTCGCTTTGAAAGGCAGCCTCGCAGTACTCAATTTCCCC 
GCGGATGTCGTTGAAGAATCTCTCCGGAAGATGGAGAATGTGAATCTCAATGATGGAGAG 
TCTCCGGTGATAGCCTTGAAGAGAAAACACTCCATGAGAAACCGTCCTAGAGGAAAGAAG 
AAATCTTCTTCTTCTTCGACGTTGACATCTTCTCCTTCTTCCTCCTCCTCCTATTCATCT 
TCTTCGTCTTCTTCTTCTTTGTCGTCAAGAAGTAGAAAACAGAGTGTTGTTATGACGCAA 
GAAAGTAATACAACACTTGTGGTTCTTGAGGATTTAGGTGCTGAATACTTAGAAGAGCTT 
ATGAGATCATGTTCTTGATAATCTCTGCTTCTACAATTTTTATGTAATTTGA 

>G2512 Amino Acid Sequence (conserved domain in AA coordinates: 79-139)
MEYQTNFLSGEFSPENSSSSSWSSQESFLWEESFLHQSFDQSFLLSSPTDNYCDDFFAFE 
SSIIKEEGKEATVAAEEEEKSYRGVRKRPWGKFAAEIRDSTRKGIRVWLGTFDTAEAAAL
AYDQAAFALKGSLAVLNFPADVVEESLRKMENVNLNDGESPVIALKRKHSMRNRPRGKKK

SSSSSTLTSSPSSSSSYSSSSSSSSLSSRSRKQSVVMTQESNTTLVVLEDLGAEYLEELM
RSCS* 

>G2513 (69..698)

TTTCAACAGTAATTTAAGTTAACCGGAGTCTCTTTTTGTTTTCCGGCGAATTTTTGGTAC 
TTTGAGTTATGAATAATGATGATATTATTCTGGCGGAGATGAGGCCTAAGAAGCGTGCGG 
GAAGGAGAGTGTTTAAGGAGACACGTCACCCAGTTTACAGAGGCATAAGGCGGAGGAACG 
GTGACAAATGGGTCTGCGAAGTCAGAGAACCGACGCACCAACGCCGCATTTGGCTCGGGA 
CTTATCCCACAGCAGATATGGCAGCGCGTGCACACGACGTGGCGGTTTTAGCTCTGCGTG 
GGAGATCCGCATGTTTGAATTTCGCCGACTCCGCTTGGCGGCTTCCGGTGCCGGAATCCA 
ATGATCCGGATGTGATAAGAAGAGTTGCGGCGGAAGCTGCGGAGATGTTTAGGCCGGTGG 
ATTTAGAAAGTGGAATTACGGTTTTGCCTTGTGCGGGAGATGATGTGGATTTGGGTTTTG 
GTTCGGGTTCCGGCTCTGGTTCGGGATCGGAGGAGAGGAATTCTTCTTCGTATGGATTTG 
GAGACTACGAAGAAGTCTCAACGACGATGATGAGACTCGCGGAGGGGCCACTAATGTCGC 
CGCCGCGATCGTATATGGAAGACATGACTCCTACTAATGTTTACACGGAAGAAGAGATGT 
GTTATGAAGATATGTCATTGTGGAGTTACAGATATTAAGTGGGACTCACATATCTACTAT 
ACATAATATTTAGCTTTTATGTAAGAGGTATTTATGTGAGTTTTAAGATTGTAGATGTGT 
CCCAGGCGTTAGAAGTTTCCTTGATGGTATGGAATCTTTGTACCTATAAAATTATAAAAT 
T 

>G2513 Amino Acid Sequence (domain in AA coordinates: TBD) 

MNNDDIILAEMRPKKRAGRRVFKETRHPVYRGIRRRNGDKWVCEVREPTHQRRIWLGTYP

TADMAARAHDVAVLALRGRSACLNFADSAWRLPVPESNDPDVIRRVAAEAAEMFRPVDLE

SGITVLPCAGDDVDLGFGSGSGSGSGSEERNSSSYGFGDYEEVSTTMMRLAEGPLMSPPR

SYMEDMTPTNVYTEEEMCYEDMSLWSYRY* 

>G2519 (83..691)

CAAAGTGAAAACATAAGATCATCTTCTTCGTTGATAGATCAATATAGGAACTCCAGAAGA 

GAATCTTGATC^TTAAGTATCATGTCTCACATCGCTGTTGAAAGGAATCGAAGAAGGCA 

AATGAACGAGCATCTTAAATCCCTTCGTTCTTTGACTCCTTGTTTCTACATCAAAAGGGG 

AGATCAAGCTTCGATCATCGGAGGAGTGATAGAGTTCATC^^ 

TGAAGTTCTTGAGTCCAAGAAACGTCGAAAGACCCTA 

TCACC^GACAATCGAGCCATCCAGTTTAGGAGCCGCCACTACCCGAGTACCGTTTAGTCG 
AATCGAAAATGTGATGAC(^CAAGTACTTTCAAGGAAGTAGGAGC^TGCTGTAACTCCCC 
TCATGCTAACGTAGAAGCAAAGATTTCAGGTTCTAATGTTGTATTGAGAGTTGTCTCTAG 
TCTTCACCTCAATATTAGTAGCATGGAGGAGACTGTCTTATACTTTTTCGTTGTTAAGAT
AGGATTGGAGTGTCACTTAAGCTTGGAGGAGCTAACTCTTGAAGTTCAGAAAAGCTTTGT 
GTCTGATGAAGTGATCGTCTCTACCAATTAAAAACAAAATTCTACATGTACTAGAGCGTG 
TATCGTTTTTTGGGATTAATAATCATATAATCGTTACATGAGCCTTGATACTTTGCTAGA 
AATAAGCTCCTCTAAACAAAACCTTCTTTTTAAAAAAACACACTTATGTTTTACTTAGTT 
TGTTGTTGTATCCGAAGTTGATCAACGTTGTAATTTCCCACAATAAATCATGACATTTTA 

TATGCTCT 

>G2519 Amino Acid Sequence (domain in AA coordinates: 1-65)
MSHIAVERNRRRQMNEHLKSLRSLTPCFYIKRGDQASIIGGVIEFIKELQQLVQVLESKK 
RRKTLNRPSFPYDHQTIEPSSLGAATTRVPFSRIENVMTTSTFKEVGACCNSPHANVEAK 
ISGSNVVLRVVSRRIVGQLVKIISVLEKLSFQVLHLNISSMEETVLYFFVVKIGLECHLS

LEELTLEVQKSFVSDEVIVSTN* 
>G2520 (133..1197)

AAGGAGTTTTGCATACTCACCAAGCCACAATCATTTCTCTCTTCTCTATCTCTCTGGTTT 

TGAATCGGCGACGACTGAGTCAACTCGGTGTTGTTACTGGTTTCGTCGTATGTGTTGTAA 

CTGATTAAGTTGATGGATCCGAGTGGGATGATGAACGAAGGAGGACCGTTTAATCTAGCG 

GAGATCTGGCAGTTTCCGTTGAACGGAGTTTCAACCGCCGGAGATTCTTCTAGAAGAAGC 

TTCGTTGGACCGAATCAGTTCGGTGATGCTGATCTAACCACAGCTGCTAACGGTGATCCA 

GCGCGTATGAGTCACGCGTTGTCTCAGGCGGTTATTGAAGGTATCTCCGGCGCTTGGAAA 

CGGAGGGAAGATGAGTCTAAGTCGGCGAAGATCGTCTCCACCATTGGCGCTAGTGAAGGT 

GAGAACAAAAGACAGAAGATAGATGAAGTGTGTGATGGGAAAGCAGAAGCAGAATCGCTA 

GGAACAGAGACGGAACAAAAGAAGCAACAGATGGAACCAACGAAAGATTATATTCATGTT 

CGAGCTAGAAGAGGTCAAGCTACTGATAGTCACAGTTTAGCTGAAAGAGCGAGAAGAGAG 

AAAATAAGTGAGCGGATGAAAATCTTGCAAGATCTTGTTCCGGGATGTAACAAGGTTATT

GGAAAAGCACTTGTTCTAGATGAGATAATTAACTATATACAATCATTGCAACGTCAAGTT 

GAGTTCTTATCGATGAAGCTTGAAGC^GTCAACTC^lAGAATGAACCCTGGTATCGAGGTT 

TTTCCACCCAAAGAGGTGATGATTCTCATGATCATCAACTCAATCTTCTCCATTTTTTTC 

AOU^C^TACATGTTTCTATCGAGGTATTCTCGGGGTAGGAGTCTCGATGTTTATGCG 

GTTCGGTCATTTAAGCATTGCAATAAACGGAGTGACCTCTGTTTTTGCTCCTGCTCCCCA 

AAAACAGAACTTAAGACAACTATATTTTCACAAAACATGACATGTTTCTGTCGATATTCT 

CGAGTAGGAGTCGCTATTAGTTCATCTAAGCATTGCAATGAACCGTTTGGTCAGCAAGCG 

TTTGAGAATCCGGAGATACAGTTCGGGTCGCAGTCTACGAGGGAATACAGTAGAGGAGCA 

TCACCAGAGTGGTTGCACATGCAGATAGGATCAGGTGGTTTCGAAAGAACGTCTTGA 

>G2520 Amino Acid Sequence (domain in AA coordinates: 135-206) 

MDPSGMMNEGGPFNLAEIWQFPLNGVSTAGDSSRRSFVGPNQFGDADLTTAANGDPARMS

HALSQAVIEGISGAWKRREDESKSAKIVSTIGASEGENKRQKIDEVCDGKAEAESLGTET 

EQKKQQMEPTKDYIHVRARRGQATDSHSLAERARREKISERMKILQDLVPGCNKVIGKAL 

VLDEIINYIQSLQRQVEFLSMKLEAVNSRMNPGIEVFPPKEVMILMIINSIFSIFFTKQY 

MFLSRYSRGRSLDVYAVRSFKHCNKRSDLCFCSCSPKTELKTTIFSQNMTCFCRYSRVGV 

AISSSKHCNEPFGQQAFENPEIQFGSQSTREYSRGASPEWLHMQIGSGGFERTS* 

>G2533 (1..1080) 

ATGATAAGCAAGGATCCAATATCGAGTTTACCTCCAGGGTTTCGATTTCATCCAACAGAT 

GAAGAACTCATTCTCCATTACCTAAGGAAGAAAGTTTCCTCTTCCCCAGTCCCGCTTTCG 

ATTATCGCCGATGTCGATATCTACAAATCCGATCCATGGGATTTACCAGCTAAGGCTCCA 

TTTGGGGAGAAAGAGTGGTATTTTTTCAGTCCGAGGGATAGGAAATATCCAAACGGAGCA 

AGACCAAACAGAGCAGCTGCGTCTGGATATTGGAAAGCAACCGGAACAGATAAATTGATT 

GCGGTACCAAATGGTGAAGGGTTTCATGAAAACATTGGTATAAAAAAAGCTCTTGTGTTT 

TATAGAGGAAAGCCTCCAAAAGGTGTTAAAACCAATTGGATCATGCATGAATATCGTCTT

GCCGATTCATTATCTCCCAAAAGAATTAACTCTTCTAGGAGCGGTGGTAGCGAAGTTAAT 

AATAATTTTGGAGATAGGAATTCTAAAGAATATTCGATGAGACTGGATGATTGGGTTCTT 

TGCCGGATTTACAAGAAATCACACGCTTCATTGTC^ 

AGCAATCAAGAGCATGAGGAAAATGACAACGAAC(^TTCG 

CCAAATTTGCAAAATGATCAACCCCTTAAACGCC^^ 

TTACTAGACGCTACAGATTTGACGTTTCTCGCAAATTTTCTAAACGAAACCCCGGAAAAT 
CGTTCTGAATCAGATTTTTCTTTCATGATTGG 

AACC^TTACTTGGATC^GAAGTTACCGCAGTTGAGCTCTCCC^CTTCAGAGACAAGCGGC 
ATCGGAAGCAAAAGAGAGAGAGTGGATTTTGCGGAAGAAACGATAAACGCTTCGAAGAAG 
ATGATGAACACATATAGTTACAATAATAGTATAGATCAAATGGATCATAGTATGATGCAA

CAACCTAGTTTCCTGAACCAGGAACTCATGATGAGTTCTCACCTTCAATATCAAGGCTAG

>G2533 Amino Acid Sequence (conserved domain in AA coordinates: 11-186)

MISKDPISSLPPGFRFHPTDEELILHYLRKKVSSSPVPLSIIADVDIYKSDPWDLPAKAP 

FGEKEWYFFSPRDRKYPNGARPNRAAASGYWKATGTDKLIAVPNGEGFHENIGIKKALVF

YRGKPPKGVKTNWIMHEYRLADSLSPKRINSSRSGGSEVNNNFGDRNSKEYSMRLDDWVL

CRIYKKSHASLSSPDVALVTSNQEHEENDNEPFVDRGTFLPNLQNDQPLKRQKSSCSFSN 

LLDATDLTFLANFLNETPENRSESDFSFMIGNFSNPDIYGNHYLDQKLPQLSSPTSETSG 

IGSKRERVDFAEETINASKKMMNTYSYNNSIDQMDHSMMQQPSFLNQELMMSSHLQYQG*

>G2534 (1..975)

ATGGATAATATAATGCAATCGTCAATGCCACCGGGATTCCGATTTCATCCGACAGAGGAA 
GAGCTTGTGGGTTATTACCTAGATAGGAAGATCAATTCAATGAAGAGTGCTTTAGATGTC 
ATTGTAGAGATTGATCTCTACAAAATGGAGCCATGGGATATACAAGCGAGGTGTAAACTA 
GGGTATGAAGAGCAAAACGAGTGGTACTTCTTTAGTCATAAGGACAGGAAGTACCCTACC 
GGGACTAGGACCAACCGAGCCACTGCGGCTGGGTTCTGGAAAGCCACGGGTAGAGACAAG 
GCGGTACTATCAAAAAACAGTGTCATCGGAATGCGGAAGACACTTGTCTACTACAAGGGT 
CGAGCTCCTAATGGAAGAAAGTCCGATTGGATCATGCACGAATACCGTCTCCAAAACTCC 
GAGCTTGCCCCGGTTCAGGAGGAAGGCTGGGTGGTGTGTCGAGCATTTAGGAAGCCAATT 
CCAAACCAGAGGCCATTAGGGTACGAGCCATGGCAGAACCAGCTCTACCACGTCGAAAGT 
AGTAACAACTACTCATCTTCAGTGACAATGAACACGAGTCATCATATCGGTGCATCTTCA 
TCAAGTCATAACCTTAATCAAATGCTCATGAGCAATAACCACTACAATCCTAATAATACA 
TCCTCATCGATGCATCAATATGGCAACATTGAGCTCCCGCAGTTGGACAGCCCGAGCTTG 
TCGCCTAGTTTAGGGACGAATAAAGATCAGAACGAGAGTTTCGAGCAAGAAGAAGAGAAG 
AGCTTTAACTGTGTGGATTGGAGAACACTAGATACCTTGCTTGAGACACAAGTCATACAT 
CCGCATAACCCTAATATTCTTATGTTCGAAACGCAGTCGTATAATCCGGCGCCAAGCTTC 
CCTTCCATGCATCAAAGCTATAATGAGGTCGAAGCTAATATTCATCATTCTCTTGGATGC 
TTCCCTGACTCGTAA 

>G2534 Amino Acid Sequence (conserved domain in AA coordinates: 10-157)

MDNIMQSSMPPGFRFHPTEEELVGYYLDRKINSMKSALDVIVEIDLYKMEPWDIQARCKL

GYEEQNEWYFFSHKDRKYPTGTRTNRATAAGFWKATGRDKAVLSKNSVIGMRKTLVYYKG

RAPNGRKSDWIMHEYRLQNSELAPVQEEGWVVCRAFRKPIPNQRPLGYEPWQNQLYHVES
SNNYSSSVTMNTSHHIGASSSSHNLNQMLMSNNHYNPNNTSSSMHQYGNIELPQLDSPSL

SPSLGTNKDQNESFEQEEEKSFNCVDWRTLDTLLETQVIHPHNPNILMFETQSYNPAPSF
PSMHQSYNEVEANIHHSLGCFPDS*
>G2573 (34..957)

CCAGATTTAATTTGAGACTCTCAAAGAAACACCATGGAAGAAGAGCAACCTCCGGCCAAG 

AAACGAAACATGGGGAGATCTAGAAAAGGTTGCATGAAAGGTAAAGGCGGTCCAGAGAAC 

GCCACGTGTACTTTCCGTGGAGTTAGGCAACGGACTTGGGGTAAATGGGTGGCTGAGATC 

CGTGAGCCTAACCGTGGGACTCGTCTCTGGCTCGGCACGTTTAATACCTCGGTCGAGGCC 

GCCATGGCTTACGATGAAGCCGCTAAGAAACTCTATGGACACGAGGCTAAACTCAACTTG 

GTGCACCCACAACAACAACAACAAGTAGTAGTGAACAGAAACTTGTCTTTTTCTGGCCAC 

GGGTCGGGTTCTTGGGCTTATAATAAGAAGCTCGATATGGTTCATGGGTTGGACCTTGGT 

CTCGGCCAGGCAAGTTGTTCACGAGGTTCTTGCTCAGAGAGATCGAGTTTTCTACAAGAA

GATGATGATCATAGTCATAATCGATGTTCGTCTTCAAGTGGTTCGAATCTTTGTTGGTTA 

TTACCTAAACAAAGTGATTCACAAGATCAAGAGACCGTTAATGCTACGACTAGTTATGGC 

GGTGAAGGCGGTGGTGGCTCTACGTTAACGTTTTCGACCAATTTGAAACCAAAGAATTTG 

ATGAGTCAGAATTATGGATTATACAATGGAGCTTGGTCTAGGTTTCTTGTGGGGCAAGAA 

AAGAAGACGGAACA¥GACGTGTCATCGTCGTGTGGATCGTCGGACAACAAGGAGAGTATG 

TTGGTTCCTAGTTGCGGCGGAGAGAGGATGCATAGGCCGGAGTTGGAAGAGCGAACAGGA 

TATTTGGAAATGGATGATCTTTTGGAGATTGATGATTTAGGTTTGTTGATTGGCAAAAAT 

GGAGATTTCAAGAATTGGTGTTGTGAAGAGTTTCAACATCCATGGAATTGGTTCTGAGAG 

TTTTTATTTATTACTATTATTTATCATACATATTTCTTATATTTC 

>G2573 Amino Acid Sequence (domain in AA coordinates: TBD) 

MEEEQPPAKKRNMGRSRKGCMKGKGGPENATCTFRGVRQRTWGKWVAEIREPNRGTRLWL

GTFNTSVEAAMAYDEAAKKLYGHEAKLNLVHPQQQQQVVVNRNLSFSGHGSGSWAYNKKL

DMVHGLDLGLGQASCSRGSCSERSSFLQEDDDHSHNRCSSSSGSNLCWLLPKQSDSQDQE 
TVNATTSYGGEGGGGSTLTFSTNLKPKNLMSQNYGLY
GSSDNKESMLVPSCGGERMHRPELEERTGYLEMDDLLEIDDLGLLIGKNGDFKNWCCEEF
QHPWNWF* 

>G2589 (23..1354)

AAAGAAAAGAAAAATAAAGATAATGAGGACGAAGACTAAGTTAGTACTCATACCTGATAG 

ACACTTTCGGAGAGCCACATTCAGGAAGAGGAATGCAGGGATAAGGAAGAAACTCCACGA 

GCTGACAACTCTCTGTGACATCAAAGCATGTGCGGTAATCTACAGTCCGTTCGAGAATCC 

AACGGTGTGGCCGTCAACCGAAGGTGTTCAAGAGGTGATTTCGGAGTTCATGGAGAAGCC 

GGCGACAGAACGGTCCAAGACGATGATGAGTCATGAGACTTTCTTGCGGGACCAAATCAC 

CAAAGAACAAAACAAACTAGAGAGTCTACGTCGTGAAAACCGAGAAACTCAGCTTAAGCA 

TTTTATGTTTGATTGCGTTGGAGGCAAGATGAGTGAGCAACAGTATGGTGCAAGGGACCT

TCAAGATTTAAGTCTTTTTACTGATCAATATCTTAATCAGCTTAATGCCAGGAAGAAGTT 

CCTTACAGAATATGGTGAGTCTTCTTCTTCTGTTCCTCCTCTGTTTGATGTTGCGGGTGC 

CAATCCTCCTGTTGTTGCAGATCAAGCTGCGGTAACTGTTCCTCCTTTGTTTGCTGTTGC 

GGGTGCCAATCTTCCTGTTGTTGCTGATCAAGCTGCGGTAACTGTTCCTCCTCTGTTTGC 

TGTTGCGGGTGCCAATCTTCCTGTTGTTGCAGATCAAGCTGCGGTTAATGTTCCTACTGG 

ATTTCATAACATGAATGTGAACCAGAATCAGTATGAGCCGGTTCAGCCCTATGTCCCTAC 

TGGTTTTAGTGATCATATTCAATATCAGAATATGAACTTCAATCAAAACCAACAAGAGCC 

GGTTCATTACCAGGCTCTTGCTGTTGCGGGTGCCGGTCTTCCTATGACTCAGAATCAGTA 

TGAGCCCGTTCACTACCAGAGTCTTGCTGTCGCGGGTGGCGGTCTTCCTATGAGTCAGTT 

GCAGTATGAGCCGGTTCAGCCTTATATCCCTACTGTTTTTAGTGATAATGTTCAATATCA 

GCATATGAATTTGTATCAAAATCAACAAGAGCCGGTTCACTACCAAGCTCTTGGTGTTGC 

AGGTGCCGGTCTTCCTATGAATCAGAATCAGTATGAGCCGGTTCAGCCCTATGTCCCTAC 

TGGTTTTAGTGATCATTTTCAGTTTGAGAATATGAATTTGAATCAAAATCAACAGGAGCC

GGTTCAATACCAAGCTCCTGTTGATTTTAATCATCAGATTCAACAAGGAAACTATGATAT 

GAATTTGAACCAGAATATGAGTTTGGATCCAAATCAGTATCCGTTTCAAAATGATCCATT 

CATGAATATGTTGACAGAATATCCTTATGAATAAGCGGGTTATGTTGGAGAGCATGCAC 

>G2589 Amino Acid Sequence (domain in AA coordinates: TBD) 

MRTKTKLVLIPDRHFRRATFRKRNAGIRKKLHELTTLCDIKACAVIYSPFENPTVWPSTE 

GVQEVISEFMEKPATERSKTMMSHETFLRDQITKEQNKLESLRRENRETQLKHFMFDCVG 

GKMSEQQYGARDLQDLSLFTDQYLNQLNARKKFLTEYGESSSSVPPLFDVAGANPPVVAD

QAAVTVPPLFAVAGANLPVVADQAAVTVPPLFAVAGANLPVVADQAAVNVPTGFHNMNVN

QNQYEPVQPYVPTGFSDHIQYQNMNFNQNQQEPVHYQALAVAGAGLPMTQNQYEPVHYQS 

LAVAGGGLPMSQLQYEPVQPYIPTVFSDNVQYQHMNLYQNQQEPVHYQALGVAGAGLPMN

QNQYEPVQPYVPTGFSDHFQFENMNLNQNQQEPVQYQAPVDFNHQIQQGNYDMNLNQNMS

LDPNQYPFQNDPFMNMLTEYPYE* 

>G2687 (45..1139)

CTCTGTCTCTCGTATCTTTCTACTACTCTGTTTCTTGAATTCTAATGAACAACATCGACG 

ACGCAAAGACGGAGACTTCAGTGTCTTC^GGTTCAAGCGACTCTTTCTTGCCTCTCAAGA 

AACGCATGAGACTTGATGACGAACCAGAAAACGCCCTAGTGGTTTCGTCTTCACCAAAGA 

CGGTTGTGGCTTCTGGCAATGTCAAGTACAAAGGAGTCGTTCAGCAACAGAACGGTCATT 

GGGGTGCCCAGATTTACGCAGACCACAAAAGGATTTGGCTTGGAACTTTCAAATCCGCTG 

ATGAAGCCGCCACGGCTTACGATAGTGCATCTATCAAACTCCGAAGCTTTGACGCTAACT 

CGCACCGGAACTTCCCTTGGTCTACAATCACTCTCAACGAACCAGACTTTCAAAACT 

ACACAACAGAGACTGTGTTGAACATGATCAGAGACGGTTCGTACCAACACAAATTCAGAG 

ATTTTCTCAGAATCAGATCTCAGATTGTTGCGAGTATCAACATCGGGGGACCAAAACAAG 

CCCGAGGAGAAGTGAATCAAGAATCAGACAAGTGTTTTTCTTGCACACAGCTTTTTCAGA 

AGGAATTGACACCGAGCGATGTAGGGAAACTAAATAGGCTTGTGATACCTAAAAAGTATG 

CAGTGAAGTATATGGCTTTCATAAGCGCTGATCAAAGCGAGAAAGAAGAGGGTGAAATAG 

TTAGGTATTGTTACTGGAAAAGTAGCCAGAGCTTTGTCTTCACCAGAGGATGGAATAGTT 
TCGTGAAGGAGAAGAATCTCAAGGAGAAGGATGTTATTGCCTTCTACACTTGCGATGTCC 
CGAAC^TGTGAAGAC^TTAGAAGGTCAAAGAAAGAACTTCTTGATGATCGATGTTC^OT 
GCTTTTC^GACAACGGTTCCGTGGTAGCTGAGGAAGTAAGTATGACGGTTCATGACAGTT 
CAGTGCAAGTAAAGAAAACAGAAAACTTGGTTAGCTCCATGTTAGAAGATAAAGAAACCA 
AATCAGAGGAGAACAAAGGAGGGTTTATGCTGTTTGGTGTAAGGATCGAATGTCCTTAGG 
GAATTTTTCTTTAAAAGTTTCTTACTTCAACT 

>G2687 Amino Acid Sequence (domain in AA coordinates: TBD) 

MNNIDDAKTETSVSSGSSDSFLPLKKRMRLDDEPENALVVSSSPKTVVASGNVKYKGVVQ

QQNGHWGAQIYADHKRIWLGTFKSADEAATAYDSASIKLRSFDANSHRNFPWSTITLNEP 

DFQNCYTTETVLNMIRDGSYQHKFRDFLRIRSQIVASINIGGPKQARGEVNQESDKCFSC 

TQLFQKELTPSDVGKLNRLVIPKKYAVKYMPFISADQSEKEEGEIVGSVEDVEWFYDRA 

MRQWKFRYCYWKSSQSFVFTRGWNSFVKEKNLKEKDVIAFYTCDVPNNVKTLEGQRKNFL 

MIDVHCFSDNGSVVAEEVSMTVHDSSVQVKKTENLVSSMLEDKETKSEENKGGFMLFGVR
IECP* 

>G27 (83..622)

CAAAATACCAAAAACAAAACATTTTTTTTAATCTTCCCACCAATTTTTTTCTCTTTCTCT 
CGTTACATTAAATTATCTTTAGATGCAAGACTCTTCCTCTCACGAATCGCAACGTAACCT 
CCGGTCACCGGTGCCGGAGAAAACCGGAAAGAGTTCTAAGACTAAAAATGAGCAAAAAGG 
TGTTTCTAAACAACCAAATTTTCGTGGGGTCAGAATGAGACAATGGGGAAAATGGGTGTC 
TGAAATTAGAGAACCAAGAAAGAAATCAAGAATATGGCTCGGTACTTTCTCTACGCCGGA 
GATGGCGGCGCGTGCACACGACGTGGCGGCTTTAGCCATCAAAGGTGGCTCTGCCCACCT 
TAATTTCCCGGAGCTAGCTTACCATTTGCCGAGACCGGCTAGCGCGGACCCTAAAGACAT 
TCAAGAAGCCGCCGCCGCAGCAGCTGCCGTTGACTGGAAAGCACCGGAGTCTCCGTCTAG 
CACCGTGACGTCATCTCCAGTCGCCGACGACGCTTTCTCCGATCTTCCTGATCTTTTGCT 
TGACGTGAATGATCACAACAAAAACGATGGATTCTGGGACTCGTTTCCGTACGAAGATCC 
TTTCTTCTTGGAAAATTACTAGAAGGCAAATTCTTGCCGGCGAACGGATTTTCCGGTGGT 
TTCCCGGTAAATAAGAAGACGATGTCGTTTTGTACCTTTTTTGTCTACGATGGGAAATTT 
CTTTTTTTTTTACGTGTGAGTAAAAGTTTCCGAATGTGTGATGTGTAAGTAAGTACAGGT 
TATTTAATTTCTTTTTTTTGTACAAATACGTACGTCATTACCAAAAAGTTTTCATTTATT 
GTGCTTTTATCTTCCAAATTCATTAAAAAAAAAAAAAAAAA 

>G27 Amino Acid Sequence (domain in AA coordinates: 37-104) 

MQDSSSHESQRNLRSPVPEKTGKSSKTKNEQKGVSKQPNFRGVRMRQWGKWVSEIREPRK 

KSRIWLGTFSTPEMAARAHDVAALAIKGGSAHLNFPELAYHLPRPASADPKDIQEAAAAA 

AAVDWKAPESPSSTVTSSPVADDAFSDLPDLLLDVNDHNKNDGFWDSFPYEDPFFLENY* 
>G2720 (1..894) 

ATGGAAGCGAAGAAGGAAGAGATAAAGAAAGGTCCATGGAAAGCCGAAGAAGACGAAGTA 

CTCATCAACCATGTCAAGAGATACGGTCCTCGTGATTGGAGCTCCATTCGATCCAAAGGT 

CTTCTTCAACGCACCGGCAAATCCTGTCGTCTTCGTTGGGTCAATAAACTCCGTCCCAAT 

CTCAAAAATGGATGCAAGTTCTCGGCTGACGAAGAGAGGACTGTGATTGAGTTACAATCT 

GAGTTTGGTAACAAATGGGCGAGAATCGCTACGTATCTACCGGGAAGAACTGATAACGAT 

GTGAAGAATTTCTGGAGTAGCAGACAAAAGAGACTCGCTAGGATTCTTCATAACTCCTCT 

GATGCATCGAGTTCGAGTTTCAATCCCAAATCTTCTTCTTCTCATCGACTCAAGGGCAAA 

AACGTCAAACCAATCCGTCAATCCTCTCAGGGTTTTGGTTTGGTTGAGGAAGAGGTTACA 

GTTTCTTCTTCATGTTCCCAGATGGTTCCTTATTCATCTGATCAAGTTGGTGATGAAGTC 

TTGAGGTTGCCGGATTTGGGTGTTAAGTTAGAGCATCAGCCTTTCGCTTTTGGCACTGAT 

CTTGTCCTAGCAGAGTACTCTGACTCACAGAATGATGCAAATCAGCAAGCAATCAGCCCT 

TTCTCTCCAGAAAGCAGAGAGCTTTTGGCTAGACTTGACGACCCTTTTTACTATGATATA 

CTTGGACCAGCTGATTCTTCTGAGCCATTGTTCGCTCTCCCTCAGCCGTTCTTCGAGCCT 

TCGCCTGTGCCGAGAAGATGCAGACATGTTTCAAAGGATGAAGAAGCTGATGTTTTCTTA 

GACGATTTCCCAGCTGACATGTTTGATCAGGTTGATCCAATCCCAAGTCCTTAG 

>G2720 Amino Acid Sequence (domain in AA coordinates: 10-114) 

MEAKKEEIKKGPWKAEEDEVLINHVKRYGPRDWSSIRSKGLLQRTGKSCRLRWVNKLRPN

LKNGCKFSADEERTVIELQSEFGNKWARIATYLPGRTDNDVKNFWSSRQKRLARILHNSS
DASSSSFNPKSSSSHRLKGKNVKPIRQSSQGFGLVEEEVTVSSSCSQMVPYSSDQVGDEV

LRLPDLGVKLEHQPFAFGTDLVLAEYSDSQNDANQQAISPFSPESRELLARLDDPFYYDI
LGPADSSEPLFALPQPFFEPSPVPRRCRHVSKDEEADVFLDDFPADMFDQVDPIPSP* 
>G2787 (142..1584)

TCTCAGAGCAAAAAACAAAAAAAAAGAAAAAAAAACCCTA 
CCTCTGTCTTTTTTTTTTTTGTTCTTTTTTTTT 
CTCHXSCAAAAATCTCAC^TCCATGGATCCA 
TTC^CCCCTTTTCCTCAT^ 

AATAACCATGTCGTCTTCCAACCGCAGCCGCAAACGCAAACGCAAATCCCGC^ 
ATGTTTCAGT^ATCTCCACATGTTTCAATGC^ 

GCTGCGA1TGCGGCGTTAAACGAACCGGATGGTTCGAGCAAGATGGCAATTTCGAGATAC 
ATCGAGAGATGTTACACCGGTTTAACTTCTGCTCATGCTGCTTTGTTGACTCACCATCTC

AAGACTTTGAAGACCAGTGGTGTTCTTTCTATGGTTAAGAAATCTTACAAAATTGCTGGT 

TCTTCTACTCCTCCTGCTAGTGTAGCTGTTGCTGCTGCTGCCGCCGCTCAAGGTCTCGAT 

GTTCCCAGATCTGAGATTCTCCATTCAAGTAACAACGATCCCATGGCTTCTGGCTCTGCT 

TCTCAGCCTCTGAAACGAGGTCGTGGTCGTCCTCCTAAGCCTAAACCTGAATCTCAACCA 

CAACCACTACAGCAACTTCCACCGACCAATCAAGTCCAGGCTAACGGACAGCCAATCTGG 

GAACAGCAGCAAGTTCAATCACCTGTTCCGGTTCCGACTCCGGTTACAGAGTCGGCGAAG 

AGAGGACCTGGTCGTCCAAGGAAGAACGGTTCTGCTGCTCCTGCTACTGCACCAATCGTT 

CAAGCTTCGGTTATGGCTGGAATTATGAAACGTAGAGGTAGACCACCGGGTCGTCGAGCT 

GCTGGGAGACAGAGGAAGCCCAAATCCGTTTCTTCTACTGCCTCTGTGTATCCTTATGTT 

GCTAATGGTGCTAGACGCAGAGGAAGGCCTAGGAGAGTTGTTGACCCTAGCAGTATTGTT 

AGTGTTGCTCCAGTAGGTGGTGAAAATGTGGCAGCGGTTGCGCCAGGGATGAAGCGTGGA 

CGTGGACGACCACCTAAGATTGGTGGTGTTATCAGTAGGCTTATTATGAAGCCTAAGAGA 

GGACGAGGACGTCCTGTAGGTAGACCCAGAAAGATTGGAACATCAGTCACGACTGGGACA 

CAAGATTCTGGAGAACTCAAGAAGAAGTTTGATATTTTTCAAGAGAAAGTGAAAGAAATT 

GTGAAGGTGTTGAAGGATGGAGTTACAAGTGAGAATCAAGCAGTGGTGCAAGCCATAAAA 

GATCTGGAAGCACTAACAGTGACGGAGACCGTTGAGCCACAAGTTATGGAAGAAGTGCAG 

CCAGAGGAGACTGCAGCACCACAGACTGAAGCTCAACAAACTGAAGCTGCTGAGACACAA 

GGAGGACAAGAAGAAGGACAAGAAAGAGAAGGAGAAACACAGACCCAGACAGAAGCAGAG 

GCAATGCAAGAAGCTCTGTTCTGAAGAATAATAATGATCTAGAAAACAACCTAGACATAA 

TAGCCTTGGTGTTTGGCGTTAGGAGTGTTTTTTTTTAGTTGTTTTAGGTGTTGGAATCGC 

ATCTTAAATTATATAAAAATCTATAAGGAATTTTAATTTTTCTAGGTTTTGTTGTCTGCA 

GAAGAAGAAATAGTAGACTCGTTAATGGTGTTGTTGTCGGTGTGTCTTTAACCAAACCAT 

AAGACGTGGCTGTAAATTAGCGATGTTTCTAGTCTTCCATCTTTAATAATCTCTTATTGC 

GTCTGTGCCTTTGTTTTT 

>G2787 Amino Acid Sequence (domain in AA coordinates: 172-192, 226-247, 256-276, 
290-311, 245-366) 

MDPSLGDPHHPPQFTPFPHFPTSNHHPLGPNPYNNHVVFQPQPQTQTQIPQPQMFQLSPH 
VSMPHPPYSEMICAAIAALNEPDGSSKMAISRYIERCYTGLTSAHAALLTHHLKTLKTSG

VLSMVKKSYKIAGSSTPPASVAVAAAAAAQGLDVPRSEILHSSNNDPMASGSASQPLKRG 
RGRPPKPKPESQPQPLQQLPPTNQVQANGQPIWEQQQVQSPVPVPTPVTESAKRGPGRPR 
KNGSAAPATAPIVQASVMAGIMKRRGRPPGRRAAGRQRKPKSVSSTASVYPYVANGARRR
GRPRRVVDPSSIVSVAPVGGENVAAVAPGMKRGRGRPPKIGGVISRLIMKPKRGRGRPVG
RPRKIGTSVTTGTQDSGELKKKFDIFQEKVKEIVKVLKDGVTSENQAVVQAIKDLEALTV
TETVEPQVMEEVQPEETAAPQTEAQQTEAAETQGGQEEGQEREGETQTQTEAEAMQEALF 
* 

>G2789 (82..879)

CTTTAGGGACACCAAATCTATTCAACCTAAAAGCCTTCTTTTCCCCTATATTGACCAACT 

TTTTAGCGAATCAGAAGAGGAATGGATGAGGTATCTCGTTCTCATACACCGCAATTTCTA 

TCAAGTGATCATCAGCACTATCACCATCAAAACGCTGGACGACAAAAACGCGGCAGAGAA 

GAAGAAGGAGTTGAACCCAACAATATAGGGGAAGACCTAGCCACCTTTCCTTCCGGAGAA

GAGAATATCAAGAAGAGAAGGCCACGTGGCAGACCTGCTGGTTCCAAGAACAAACCCAAA 

GCACCAATCATAGTCACTCGCGACTCCGCGAACGCCTTCAGATGTCACGTCATGGAGATA 

ACCAACGCCTGCGATGTAATGGAAAGCCTAGCCGTCTTCGCTAGACGCCGTCAGCGTGGC 

GTTTGCGTCTTGACCGGAAACGGGGCCGTTACAAACGTCACCGTTAGACAACCTGGGGGA 

GGCGTCGTCAGTTTACACGGACGGTTTGAGATTCTTTCTCTCTCGGGTTCGTTTCTTCCT 

CCACCGGCACCACCAGCTGCGTCTGGTTTAAAGGTTTACTTAGCCGGTGGTCAAGGTCAA 

GTGATCGGAGGCAG5GTGGTGGGACCGCTTACGGCATCAAGTCCGGTGGTCGTTATGGCA 

GCTTCATTTGGAAACGCATCTTACGAGAGGCTGCCACTAGAGGAGGAGGAGGAAACTGAA 

AGAGAAATAGATGGAAACGCGGCTAGGGCGATTGGAACGCAAACGCAGAAACAGTTAATG 

CAAGATGCGACATCGTTTATTGGGTCGCCGTCGAATTTAATTAACTCTGTTTCGTTGCCA 

GGTGAAGCTTATTGGGGAACGCAACGACCGTCTTTCTAAGATAATATCATTGATAATATA 

AGTTTCGTCTTCTTATTCTTTTTCACTTTTTACCTTTTTCACTTTCTTA 

AACGTTTGATTAATACCTGAAGGTTTTTGGAAAATTTTCGATCGGATAAAAGGATTTATG 

TTGCGAGCCGAAACGCGGCC 

>G2789 Amino Acid Sequence (domain in AA coordinates: 53-73, 121-165) 
MDEVSRSHTPQFLSSDHQHYHHQNAGRQKRGREEEGVEPNNIGEDLATFPSGEENIKKRR 
PRGRPAGSKNKPKAPIIVTRDSANAFRCHVMEITNACDVMESLAVFARRRQRGVCVLTGN
GAVTNVTVRQPGGGVVSLHGRFEILSLSGSFLPPPAPPAASGLKVYLAGGQGQVIGGSVV

GPLTASSPVVVMAASFGNASYERLPLEEEEETEREIDGNAARAIGTQTQKQLMQDATSFI

GSPSNLINSVSLPGEAYWGTQRPSF* 

>G31 (13..615)

CTTTTATAAGCAATGGCTCCAAGACAGGCGAACGGTAGAAGCATTGCCGTGAGTGAAGGC 

GGCGGAGGGAAGACGATGACGATGACGACGATGCGGAAGGAAGTGCACTTTAGAGGTGTG 

AGGAAGCGTCCATGGGGTAGATACGCGGCGGAGATCCGTGACCCGGGAAAGAAAACCCGG 

GTTTGGCTCGGGACATTCGACACGGCGGAGGAAGCTGCAAGAGCTTACGACACCGCCGCT 

AGAGAGTTTCGTGGCTCCAAAGCAAAGACTAATTTCCCTCTTCCCGGAGAGTCTACTACG 

GTTAACGACGGTGGCGAGAACGATTCTTACGTCAACCGTACGACGGTGACGACGGCGCGT 

GAGATGACGCGTCAGAGATTTCCGTTTGCATGTCACCGGGAGCGTAAAGTCGTCGGTGGT 

TATGCTTCTGCTGGTTTTTTCTTCGATCCGTCAAGAGCTGCTTCGTTAAGAGCAGAGCTT 

TCTCGGGTTTGTCCGGTTCGGTTTGATCCGGTTAATATCGAGTTGAGTATTGGTATTCGA 

GAAACCGTAAAAGTTGAACCGAGAAGAGAACTAAACCTGGATCTTAACCTAGCTCCACCG 

GTGGTGGACGTTTAGATTTTTTTCTTCTTTTCATAATTTGTATTTTACATTGCCGGAAAA 
TAATTAATGTTTTCTTTAG 

>G31 Amino Acid Sequence (domain in AA coordinates: TBD)

MAPRQANGRSIAVSEGGGGKTMTMTTMRKEVHFRGVRKRPWGRYAAEIRDPGKKTRVWLG 

TFDTAEEAARAYDTAAREFRGSKAKTNFPLPGESTTVNDGGENDSYVNRTTVTTAREMTR

QRFPFACHRERKVVGGYASAGFFFDPSRAASLRAELSRVCPVRFDPVNIELSIGIRETVK
VEPRRELNLDLNLAPPVVDV*

>G33 (20..757)

ATTCTCCCCCAACCAAAATATGACCACAGAAAAAGAGAATGTCACTACGGCCGTGGCCGT 
GAAAGACGGCGGAGAAAAGAGTAAGGAAGTGAGTGACAAGGGCGTAAAGAAGAGAAAGAA 
TGTAACTAAGGCCCTGGCCGTGAATGACGGCGGAGAAAAGAGTAAGGAAGTGCGTTACAG 
GGGTGTAAGGAGGAGACCATGGGGGAGATATGCTGCGGAGATCCGTGATCCGGTAAAGAA 
AAAACGGGTCTGGCTCGGGTCCTTCAACACGGGGGAGGAAGCCGCCAGAGCCTACGACTC 
CGCTGCCATAAGGTTTCGAGGATCGAAAGCTACTACTAACTTCCCTCTAATCGGATACTA 
TGGGATTTCTTCGGCGACGCCGGTGAACAACAACCTTTCCGAGACGGTGAGTGATGGAAA 
TGCCAACCTCCCTCTCGTTGGAGACGATGGGAATGCTTTGGCTTCTCCGGTGAACAACAC 
CCTTTCCGAAACGGCGCGTGATGGAACACTTCCATCGGATTGTCACGACATGTTATCTCC 
GGGGGTGGCTGAAGCGGTTGCTGGATTTTTCTTAGATCTGCCTGAAGTTATTGCGTTGAA 
AGAGGAGCTTGATCGAGTTTGTCCTGACCAGTTTGAGTCCATTGATATGGGGTTGACTAT 
TGGTCCTCAAACCGCCGTGGAAGAGCCTGAGACTTCCTCCGCCGTGGATTGTAAGCTGCG 
AATGGAACCGGATCTTGACCTCAACGCAAGTCCCTAAAGATTGATCTGATGTTGTTGTCC 
TTGAATAAGTTTGTTATCTTGTCGCTCTTCTGATTGTCTGTACTTCTATTGGTTGATTCG 
TGCTTTTGGAGGACAAAACAAACATTTTTTTATGTATTAAAAAAAGGTAATTGAACTA1T 
ATCGAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 

>G33 Amino Acid Sequence (domain in AA coordinates: 50-117) 

MTTEKENVTTAVAVKDGGEKSKEVSDKGVKKRKNVTKALAVNDGGEKSKEVRYRGVRRRP 

WGRYAAEIRDPVKKKRVWLGSFNTGEEAARAYDSAAIRFRGSKATTNFPLIGYYGISSAT 

PVNNNLSETVSDGNANLPLVGDDGNALASPVNNTLSETARDGTLPSDCHDMLSPGVAEAV

AGFFLDLPEVIALKEELDRVCPDQFESIDMGLTIGPQTAVEEPETSSAVDCKLRMEPDLD
LNASP* 

>G342 (1..723) 

ATGGACGTCTACGGCATGTCTTCACCGGACTTGCTTCGTATCGACGACCTTCTCGATTTC 
TCCAACGACGAAAT^TTCTCTTCCTCTTCCACCGTCACTTCCTCCGCCGCTTCCTCCGCC 
GCTTCTTCCGAAAACCCTTTC^GCTTTCCTTCTTCC^CCTACACrTCTCCTACTCTCCTC 
ACCGACTTCACTCACGATCTCTGCGTTCCCAGTGACGACGCAGCTCATCTCGAATGGTTA 
TCGCGATTCGTTGACGATTCATTCTCCGATTTCCCAGCAAATCCTTTAACCATGACCGTT 
AGACCCGAGATTTCATTCACCGGAAAACCTAGAAGTCGCCGATCAAGAGCACCAGCACCT 
TCCGTAGCTGGAACTTGGGCTCCGATGTCTGAATCAGAGCTTTGTCACTCCGTCGCTAAA 
CCTAAACCGAAGAAAGTCTACAACGCTGAATCGGTTACGGCGGATGGAGCGAGGCGGTGC 
ACGCACTGTGCCTCGGAGAAAACGCCACAGTGGAGAACTGGACCGCTTGGACCTAAAACA 
CTTTGTAACGCTTGTGGAGTTCGTTACAAATCAGGGAGGCTTGTACCGGAATACAGACCG 
GCGTCGAGTCCGACGTTTGTATTGACTCAGCATTCGAACTCTCATCGGAAAGTTATGGAG 






CTCCGGCGACAGAAGGAACAACAAGAATCTTGCGTTCGAATTCCGCCGTTTCAGCCGCAG 
TAA 

>G342 Amino Acid Sequence (domain in AA coordinates: 155-190) 

MDVYGMSSPDLLRIDDLLDFSNDEIFSSSSTVTSSAASSAASSENPFSFPSSTYTSPTLL

TDFTHDLCVPSDDAAHLEWLSRFVDDSFSDFPANPLTMTVRPEISFTGKPRSRRSRAPAP 

SVAGTWAPMSESELCHSVAKPKPKKVYNAESVTADGARRCTHCASEKTPQWRTGPLGPKT 

LCNACGVRYKSGRLVPEYRPASSPTFVLTQHSNSHRKVMELRRQKEQQESCVRIPPFQPQ 

* 

>G352 (80.. 817) 

AATACACCACACACTTCACTCTTTCTTCATCTTCTTCTTCTTAAATAGCTCGAAATCACA 
TCTCACAGAATTAAATCTTATGGCTCTCGAGACTCTCAATTCTCCAACAGCTACCACCAC 
CGCTCGGCCTCTTCTCCGGTATCGTGAAGAAATGGAGCCTGAGAATCTCGAGCAATGGGC 
TAAAAGAAAACGAACAAAACGTCAACGTTTTGATCACGGTCATCAGAATCAAGAAACGAA 
CAAGAACCTTCCTTCTGAAGAAGAGTATCTCGCTCTTTGTCTCCTCATGCTCGCTCGTGG 
CTCCGCCGTACAATCTCCTCCTCTTCCTCCTCTACCGTCACGTGCGTCACCGTCCGATCA 
CCGAGATTACAAGTGTACGGTCTGTGGGAAGTCCTTTTCGTCATACCAAGCCTTAGGTGG 
ACACAAGACGAGTCACCGGAAACCGACGAACACTAGTATCACTTCCGGTAACCAAGAACT 
GTCTAATAACAGTCACAGTAACAGCGGTTCCGTTGTTATTAACGTTACCGTGAACACTGG 
TAACGGTGTTAGTCAAAGCGGAAAGATTCACACTTGCTCAATCTGTTTCAAGTCGTTTGC 
GTCTGGTCAAGCCTTAGGTGGACACAAACGGTGTCACTATGACGGTGGCAACAACGGTAA 
CGGTAACGGAAGTAGCAGCAACAGCGTAGAACTCGTCGCTGGTAGTGACGTCAGCGATGT 
TGATAATGAGAGATGGTCCGAAGAAAGTGCGATCGGTGGCCACCGTGGATTTGACCTAAA 
CTTACCGGCTGATCAAGTCTCAGTGACGACTTCTTAA 

>G352 Amino Acid Sequence (domain in AA coordinates: 99-119,166-186) 

MALETLNSPTATTTARPLLRYREEMEPENLEQWAKRKRTKRQRFDHGHQNQETNKNLPSE 

EEYLALCLLMLARGSAVQSPPLPPLPSRASPSDHRDYKCTVCGKSFSSYQALGGHKTSHR

KPTNTSITSGNQELSNNSHSNSGSVVINVTVNTGNGVSQSGKIHTCSICFKSFASGQALG

GHKRCHYDGGNNGNGNGSSSNSVELVAGSDVSDVDNERWSEESAIGGHRGFDLNLPADQV

SVTTS* 

>G357 (1..615) 

ATGCAGAACAAACACAAATGCAAGCTCTGTTCCAAGAGTTTCTGTAATGGCAGAGCACTT 

GGTGGTCACATGAAGTCTCACTTGGTCTCATCTCAGTCTTCAGCTCGGAAGAAACTAGGT

GACTCGGTCTATTCTTCTTCTTCCTCTTCCTCCGATGGTAAAGCGCTCGCCTACGGGTTA 

CGAGAGAACCCGAGGAAGAGTTTCCGGGTCTTTAATCCGGATCCTGAGTCATCCACAATT 

TACAACAGTGAGACAGAGACCGAACCTGAATCCGGAGACCCGGTTAAGAAACGGGTCAGA 

GGAGATGTTTCAAAGAAGAAGAAGAAGAAGGCAAAGAGTAAGAGAGTGTTTGAGAACTCG 

AAGAAGCAAAAGACAATTCACGAGTCACCAGAACCAGCGAGTTCTGTCTCTGATGGTTCT 

CCTGAACAAGATTTAGCTATGTGCTTGATGATGCTGTCAAGAGATTCAAGGGAGCTCGAG 

ATTAAACTGAAAAT^ACCGGAGGAAGAGAGGAAGCCGGAAAAAAGACATTTCCCTGAGCTC 

CGTCGCTGTATGATAGATCTGAATCTTCCTCCGCCGCAAGAAGCTGAAGCTGTCACCGTC 

GTTTCAGCCATATAA

>G357 Amino Acid Sequence (domain in AA coordinates: 7-29) 
MQNKHKCKLCSKSFCNGRALGGHMKSHLVSSQSSARKKLGDSVYSSSSSSSDGKALAYGL
RENPRKSFRVFNPDPESSTIYNSETETEPESGDPVKKRVRGDVSKKKKKKAKSKRVFENS
KKQKTIHESPEPASSVSDGSPEQDLAMCLMMLSRDSRELEIKLKKPEEERKPEKRHFPEL
RRCMIDLNLPPPQEAEAVTVVSAI*
>G358 (1..855) 

ATGGGTCAAGATGAGGTTGGGAGTGATCAGACGCAAATCATAAAAGGGAAACGTACGAAG 
CGACAAAGATCGTCTTCGACGTTTGTGGTGACGGCGGCGACAACAGTGACTTCAACAAGT 
TCATCGGCCGGTGGAAGTGGAGGAGAAAGAGCTGTTTCAGATGAATACAACTCGGCGGTT 
TCGTCTCCGGTGACTACTGATTGTACGCAAGAAGAAGAAGACATGGCGATTTGTCTCATC 
ATGTTAGCTCGTGGGA(^GTTCTTCCATCGCCGGATCTCAAGAACTCGAGAAAAATTCAT 
CAGAAGATTTCGTCGGAGAATTCTAGTTTCTATGTGTACGAGTGTAAAACGTGTAACCGG 
ACGTTTTCGTCGTTCCAAGCACTrGGTGGACACAGAGCGAGCC^ 

TCGACTGAGGAAAAGACTAGACTACCCCTGACGCAACCCAAGTCTAGTGCATCAGAAGAA 
GGGCAAAACAGTCATTTCAAAGTTTCCGGCTCAGCCCTAGCTTCACAGGCAAGTAACATC 
ATCAACAAGGCAAACAAAGTAC^CGAGTGTTCC^TCTGCGGTTCTGAGTTCACTTCCGGG 
CAAGCTCTCGGTGGTCACATGAGGCGGCACAGGACAGCCGTAACCACGATTAGCCCCGTT
GCAGCCACCGCAGAAGTAAGCAGAAACAGTACAGAGGAAGAGATTGAGATCAATATAGGC 
CGTTCGATGGAACAGCAGAGGAAATATCTACCGTTGGATCTTAATCTACCAGCACCAGAA 
GATGATCTAAGAGAGTCAAAGTTTCAAGGGATAGTATTCTCAGCAACACCAGCGTTAATA 
GATTGTCATTACTAG 

>G358 Amino Acid Sequence (domain in AA coordinates: 124-135, 188-210) 

MGQDEVGSDQTQIIKGKRTKRQRSSSTFVVTAATTVTSTSSSAGGSGGERAVSDEYNSAV

SSPVTTDCTQEEEDMAICLIMLARGTVLPSPDLKNSRKIHQKISSENSSFYVYECKTCNR

TFSSFQALGGHRASHKKPRTSTEEKTRLPLTQPKSSASEEGQNSHFKVSGSALASQASNI 

INKANKVHECSICGSEFTSGQALGGHMRRHRTAVTTISPVAATAEVSRNSTEEEIEINIG 

RSMEQQRKYLPLDLNLPAPEDDLRESKFQGIVFSATPALIDCHY* 

>G360 (1..543) 

ATGTGGAACCCTAACAAAATTGAAGAATTGGAGGATGATGATGAATCTTGGGAAGTCAAA 
GCCTTTGAGCAAGACACTAAAGGCAACATCTCTGGTACCACTTGGCCTCCAAGATCTTAC 
ACTTGCAATTTCTGCCGCCGTGAGTTCCGTTCTGCTCAAGCCTTAGGCGGTCACATGAAT 
GTCCACCGCCGTGACCGCGCCTCATCTAGGGCTCATCAAGGTTCCACCGTTGCGGCTGCG 
GCTAGAAGCGGCCACGGGGGGATGTTACTCAATTCTTGTGCTCCGCCGTTGCCTACAACG 
ACACTTATAATACAATCCACGGCGAGTAACATTGAAGGTTTGTCCCATTTCTACCAACTG 
CAAAACCCTAGTGGCATTTTTGGTAATTCTGGTGACATGGTGAATCTTTATGTAGAAGTT 
CCTCCTCGGCTTATTGAATATTCGACAGGAGATGATGAGAGCATTGGCTCGATGAAAGAA 
GCGACAGGAACATCAGTGGATGAGCTTGATCTTGAACTTCGGCTAGGGCACCATCCACCG 
TGA 

>G360 Amino Acid Sequence (domain in aa coordinates: 42-62) 

MWNPNKIEELEDDDESWEVKAFEQDTKGNISGTTWPPRSYTCNFCRREFRSAQALGGHMN 

VHRRDRASSRAHQGSTVAAAARSGHGGMLLNSCAPPLPTTTLIIQSTASNIEGLSHFYQL

QNPSGIFGNSGDMVNLYVEVPPRLIEYSTGDDESIGSMKEATGTSVDELDLELRLGHHPP 
* 

>G362 (195.. 830) 

ATAAAAAACCCTTCATACAATATAAAATTTCTTTAGACATACAATATATTATACTATTAC 
AGATGCAATGCATCATTAGTTACAAACTATTAAACTAAATATCCCCCGTCTCTCTCTTGC 
TATATAAAGAAGATCATTTACACATCTCCTTAAGCAAATTAAACCCATCGATAAACACAT 
ACGTTCACACATATATGTCTATAAATCCGACAATGTCTCGTACTGGCGAAAGTTCTTCAG 
GTTCGTCCTCCGACAAGACGATAAAGCTATTCGGCTTCGAACTCATCAGCGGCAGTCGTA 
CGCCGGAAATCACGACGGCGGAAAGCGTGAGCTCGTCCACAAACACGACGTCGTTAACAG
TGATGAAAAGACACGAGTGCCAATACTGCGGTAAAGAGTTTGCAAATTCTCAAGCCTTAG 
GAGGTCACCAAAACGCTCACAAGAAGGAGAGGTTGAAGAAGAAGAGGCTTCAGCTTCAAG 
CTCGGCGAGCCAGCATCGGCTATTATCTCACCAACCACCAACAACCCATAACGACGTCAT 
TTCAGAGACAATACAAAACGCCGTCGTATTGTGCATTCTCCTCCATGCACGTGAATAATG 
ATCAGATGGGTGTGTACAACGAAGATTGGTCGTCGAGGTCGTCGCAGATTAACTTCGGTA 
ATAATGACACGTGCCAAGATCTTAATGAACAAAGCGGTGAGATGGGTAAGCTGTACGGTG 
TTCGACCGAACATGATTCAGTTCCAGAGAGATCTGAGTTCTCGTTCTGATCAGATGAGAA 
GTATTAACTCGCTGGATCTTCATCTAGGTTTTGCCGGAGATGCGGCATAACAAATTAAAG
AGAGATATATGATTAAGATTATATGTACTATAGTGGCGTATTTCATTGGGATCATGAAGG 

GGAAAAAACGAGACATATAGTATTCTTGATGCAATTTGAGTTTTGTAATTTATTTAGGTT 
TATGTATGTTTTCGAAG 

>G362 Amino Acid Sequence (domain in AA coordinates: 62-82) 
MSINPTMSRTGESSSGSSSDKTIKLFGFELISGSRTPEITTAESVSSSTNTTSLTVMKRH
ECQYCGKEFANSQALGGHQNAHKKERLKKKRLQLQARRASIGYYLTNHQQPITTSFQRQY
KTPSYCAFSSMHVNNDQMGVYNEDWSSRSSQINFGNNDTCQDLNEQSGEMGKLYGVRPNM
IQFQRDLSSRSDQMRSINSLDLHLGFAGDAA*
>G364 (64.. 516) 

AAGCTTGATATCGCCTCTCTCTAATCTCTCTTTCTCTCTCTATCTCTAAGAATATATA^ 
GGTATGGACTACCAGCCAAACACATCCCTACGTCTAAGCCTACCAAGTTACAAGAACCAC 
CAACTAAACCTAGAACTTGTTCTCGAGCCTTCTTCCATGTCTTCTTCT^ 
ACGAACTCATCATCATGTTTGGAGC^^ 

AAGTTTTACAGCTCTCAAGCTCTTGGTGGTCATCAAAACGCT 

TTAGCCAAGAAGAGTCGAGAACTCTTTAGATCCTCAAACACTGTTGATTCTGATCAGCCT 
TACCCGTTCTCCGGTCGCTTTGAGCTTTACGGCCGTGGCTACCAAGGATTTCTCGAAAGT

GGCGGCTCGAGGGACTTCTCCGCCCGCCGTGTGCCGGAGAGTGGTCTTGATCAGGATCAG 

GAGAAGAGTCACCTTGACTTATCCTTAAGGCTCTAAAAGAATCTTATATTTTGTTAGTCT 

ATATATTATCATATCAATTGTTAATCTTAAAATTGATTGTTTTACTTATTAGTCATTTCC 

TATTATCTGAAAGTTTTCTTTGTAAGTTGTAACTATGGTCCTAAATTCAAATCCAAATTT 

GATTTTGGAAGATGGTACCTAATGCAGTAGTTAAATAAGTTAAAAAAATGAAGGATCTAT 
AATTCTCT 

>G364 Amino Acid Sequence (domain in AA coordinates: 54-76)

MDYQPNTSLRLSLPSYKNHQLNLELVLEPSSMSSSSSSSTNSSSCLEQPRVFSCNYCQRK

FYSSQALGGHQNAHKLERTLAKKSRELFRSSNTVDSDQPYPFSGRFELYGRGYQGFLESG 

GSRDFSARRVPESGLDQDQEKSHLDLSLRL* 

>G365 (69.. 755) 

CAATTCTTTTACTTTCATTCTCTTTATATATTCTCTCTACGCTATAATATATATTACACA 

GAATATACATGGAACCGTCCATCAAAGGAGATCAAGAAATGTTAAAAATCAAGAAACAAG 

GTCATCAAGATCTTGAGTTGGGGTTGACCCTTTTGTCACGTGGAACCGCGACCTCATCAG 

AGCTCAATCTCATCGATTCTTTCAAAACCAGCTCATCATCGACTTCTCATCATCAGCACC 

AGCAAGAACAATTGGCAGATCCGAGAGTGTTCTCGTGTAATTATTGTCAAAGAAAGTTCT 

ATAGTTCACAAGCGCTAGGCGGTCACCAAAACGCTCATAAACGTGAGCGCACCTTAGCCA 

AACGTGGACAGTATTACAAGATGACTCTCTCCTCCTTGCCTTCTTCAGCGTTTGCGTTTG 

GCCACGGTTCAGTCAGCAGATTCGCAAGCATGGCATCGTTACCATTACATGGCTCGGTGA 

ATAACAGGTCAACGTTAGGGATTCAAGCTCATTCAACGATCCATAAGCCCAGCTTCTTAG 

GAAGACAAACGACGAGTTTAAGTCATGTTTTCAAACAGAGCATTCACCAGAAACCGACCA 

TAGGAAAGATGTTGCCGGAGAAATTTCACCTTGAAGTCGCCGGAAATAATAACAGTAACA 

TGGTTGCTGCTAAGTTGGAGAGAATTGGACATTTCAAGAGCAACCAAGAAGATCATAATC 

AGTTTAAGAAAATTGACTTGACTCTTAAGCTATGAGCTCTGCCATCTTCTTTTTAGTCTT 

CATTATAACTTTTTTTATTCTCATCTTTGTTTGATATAATGATTGACGGCAGGGTGTGTT 

AGAGTTTCACTAATGATCAAGTTGTACTTTTTATATATTTCATTGATACCTTGTTGATGT 
AATTCAATATTTTAGGTCTGTTTTT 

>G365 Amino Acid Sequence (domain in aa coordinates: 70-90) 
MEPSIKGDQEMLKIKKQGHQDLELGLTLLSRGTATSSELNLIDSFKTSSSSTSHHQHQQE 
QLADPRVFSCNYCQRKFYSSQALGGHQNAHKRERTLAKRGQYYKMTLSSLPSSAFAFGHG

SVSRFASMASLPLHGSVNNRSTLGIQAHSTIHKPSFLGRQTTSLSHVFKQSIHQKPTIGK

MLPEKFHLEVAGNNNSNMVAAKLERIGHFKSNQEDHNQFKKIDLTLKL*

>G367 (1..708)

ATGGACGCTTCAATAGTTTCCTCATCCACTGCTTTTCCATATCAAGATTCTCTAAACCAG 

AGCATCGAAGACGAAGAAAGAGACGTTCATAATTCTAGTCACGAACTCAATCTCATCGAC 

TGCATAGACGACACAACGAGTATCGTTAACGAATCTACAACATCCACAGAACAAAAGCTT 

TTCTCATGCAACTATTGTCAAAGAACTTTCTATAGCTCACAAGCACTTGGTGGTCACCAA 

AACGCACACAAGAGAGAGAGAACGTTGGCGAAGAGAGGACAACGTATGGCAGCGTCAGCC 

TCAGCTTTTGGACATCCTTACGGTTTCTCTCCACTTCCTTTCCACGGACAGTACAACAAC 

CATAGGTCTTTAGGGATCCAAGCGCATTCGATAAGCCACAAGCTAAGTTCTTATAACGGG 

TTTGGTGGTCACTATGGTCAGATCAACTGGTCAAGACTTCCATTTGATCAACAACCAGCC 

ATAGGTAAATTTCCCTCAATGGATAATTTTCATCATCATCATCATCAGATGATGATGATG 

GCTCCTTCAGTAAATTCACGGTCCAATAACATCGATAGCCCAAGCAACACAGGACGGGTT 

CTAGAAGGGTCACCGACTCTTGAACAATGGCACGGAGACAAAGGATTGTTGTTAAGTACA 

AGTCATCATGAAGAGCAGCAGAAACTTGACTTGTCCCTCAAGCTTTGA 

>G367 Amino Acid Sequence (domain in AA coordinates: 63-84)

MDASIVSSSTAFPYQDSLNQSIEDEERDVHNSSHELNLIDCIDDTTSIVNESTTSTEQKL

FSCNYCQRTFYSSQALGGHQNAHKRERTLAKRGQRMAASASAFGHPYGFSPLPFHGQYNN

HRSLGIQAHSISHKLSSYNGFGGHYGQINWSRLPFDQQPAIGKFPSMDNFHHHHHQMMMM

APSVNSRSNNIDSPSNTGRVLEGSPTLEQWHGDKGLLLSTSHHEEQQKLDLSLKL*
>G373 (1..1854) 

ATGGCGATTGAAACTCAGCTTCCTTGCGACGGTGACGGTGTGTGTATGCGGTGTCAGGTG 

AATCCTCCGTCAGAAGAGACTCTCACTTGTGGCACGTGCGTCACTCCATGGCACGTGCCG 

TGTCTCCTCCCCGAATCACTCGCTTCTTCCACTGGAGAGTGGGAGTGTCCCGATTGCT 

GGCGTTGTCGTTCCCTCCGCCGCTCCGGGTACCGGAAACGCTCGACCTGAATCTTCCGGT 

TCAGTTCTCGTTGCTGCGATCCGTGCGATTCAGGCTGATGAGACTTTAACCGAAGCTGAG 
AAAGCCAAAAAAAGGCAGAAACTGATGAGTGGGGGTGGTGACGATGGTGTCGATGAAGAA

GAGAAGAAGAAGTTAGAAATCTTTTGTTCTATTTGCATTCAATTGCCAGAAAGACCTATC 

ACGACACCGTGTGGGCACAATTTCTGTTTGAAATGTTTCGAGAAATGGGCAGTAGGTCAA 

GGGAAGCTAACTTGTATGATATGCCGAAGCAAAATTCCGAGACATGTGGCAAAAAATCCT 

CGCATCAACTTAGCTCTAGTTTCTGCTATTCGTTTAGCAAATGTTACCAAATGTTCTGTT 

GAGGCAACTGCAGCCAAGGTTCATCATATTATCCGCAACCAAGACCGTCCTGAGAAAGCA 

TTTACTACCGAGCGGGCAGTAAAAACTGGGAAAGCTAATGCTGCTAGCGGTAAGTTTTTT 

GTGACAATACCTCGTGATCATTTTGGTCCCATACCAGCTGAGAATGATGTCACTAGAAAG 

CAAGGTGTTTTGGTTGGAGAATCTTGGGAGGACAGGCAAGAGTGTAGGCAGTGGGGAGCT 

CATTTCCCGCATATTGCTGGCATTGCCGGGCAATCAGCGGTTGGAGCTCAGTCTGTGGCC 

CTCTCTGGAGGTTATGACGATGATGAGGATCATGGTGAATGGTTTCTCTACACAGGAAGT 

GGTGGAAGGGATCTCAGTGGAAACAAAAGAATTAACAAGAAACAGTCGTCTGACCAGGCG 

TTTAAAAACATGAATGAATCTCTAAGACTTAGTTGCAAAATGGGCTATCCTGTCCGAGTT 

GTCAGGTCTTGGAAGGAGAAGCGTTCTGCATATGCCCCTGCTGAAGGTGTGAGATATGAT 

GGGGTCTATCGAATTGAGAAGTGCTGGAGTAATGTTGGAGTACAGGGTTCTTTTAAGGTP 

TGTCGTTACCTGTTTGTTAGATGTGACAATGAGCCAGCTCCATGGACCAGTGATGAGCAT 

GGCGATCGTCCAAGACCGTTGCCTAATGTTCCGGAGCTTGAGACTGCTGCTGACCTGTTT 

GTGAGAAAGGAGAGTCCATCATGGGATTTCGATGAAGCTGAGGGTCGTTGGAAATGGATG 

AAGTCTCCTCCTGTTAGCAGAATGGCTTTGGATCCTGAGGAGAGGAAGAAGAATAAGAGA 

GCAAAAAATACTATGAAGGCCAGACTTCTGAAAGAATTTAGTTGCCAAATCTGTCGGGAA 

GTGCTGAGTCTTCCAGTGACGACGCCTTGTGCACACAACTTCTGCAAAGCATGCTTAGAA 

GCGAAGTTTGCTGGGATAACTCAACTGAGAGAGAGAAGCAATGGCGGACGTAAACTACGT 

GCAAAGAAGAACATCATGACCTGCCCTTGCTGCACGACGGATCTCTCCGAGTTTCTCCAA 

AACCCGCAGGTGAACAGAGAGATGATGGAGATAATAGAGAATTTTAAGAAGAGTGAGGAA 

GAGGCTGATGCATCCATTTCTGAAGAAGAAGAAGAAGAATCCGAACCTCCAACTAAGAAG 

ATTAAGATGGATAACAACTCTGTTGGTGGTAGTGGTACAAGTCTCTCAGCTTAA 

>G373 Amino Acid Sequence (domain in AA coordinates: 129-1S8)

MAIETQLPCDGDGVCMRCQVNPPSEETLTCGTCVTPWHVPCLLPESLASSTGEWECPDCS

GVVVPSAAPGTGNARPESSGSVLVAAIRAIQADETLTEAEKAKKRQKLMSGGGDDGVDEE

EKKKLEIFCSICIQLPERPITTPCGHNFCLKCFEKWAVGQGKLTCMICRSKIPRHVAKNP

RINLALVSAIRLANVTKCSVEATAAKVHHIIRNQDRPEKAFTTERAVKTGKANAASGKFF

VTIPRDHFGPIPAENDVTRKQGVLVGESWEDRQECRQWGAHFPHIAGIAGQSAVGAQSVA

LSGGYDDDEDHGEWFLYTGSGGRDLSGNKRINKKQSSDQAFKNMNESLRLSCKMGYPVRV

VRSWKEKRSAYAPAEGVRYDGVYRIEKCWSNVGVQGSFKVCRYLFVRCDNEPAPWTSDEH

GDRPRPLPNVPELETAADLFVRKESPSWDFDEAEGRWKWMKSPPVSRMALDPEERKKNKR

AKNTMKARLLKEFSCQICREVLSLPVTTPCAHNFCKACLEAKFAGITQLRERSNGGRKLR

AKKNIMTCPCCTTDLSEFLQNPQVNREMMEIIENFKKSEEEADASISEEEEEESEPPTKK

IKMDNNSVGGSGTSLSA*

>G396 (1..957) 

ATGGGGGAAAGAGATGATGGGTTGGGTTTGAGTCTAAGCTTGGGAAATAGTCAACAAAAA 
GAACCATCTCTGAGGTTGAATCTTATGCCGTTGACAACTTCTTCTTCOTCTTCTTCGTTT 
CAACACATGCACAATCAGAATAACAATAGCCATCCCCAGAAGATTCATAACATCTCTTGG

^GTCATTOCTAAGAGGTTTCAACGTGAACAGAGCTCAGTOTCGGTGGCG^ 
TTGGAAGAAGAAGCCGCCGTCGTCTCGTCTCCAAACAGCGCCGTTTCGAGTCTGAGTGGA 

rSS™ ^ G ^ CTACGGraATCG ^ GG ATCAAGCTC TT GTTCTCGAGGAGACTTT T AAA 
^^^^^^^^ ^^^^^^^^^^^^^^^-^^^^^^^^M^Cl^^^ 

C ^ C ^™ GAAGTGTCGGAGCTGAGGGCGTOG ^ G ^TCTCCACATCTCTACATGCAC 
^n^ CCTACTACTCTCACCATGTGCCOTC ^ GGG ^^ 

C^CAGCGATTAACTCOTGGACTGCTATTTCTCTCCAGCAAAAATCAGG^ 
>G396 Amino Acid Sequence (domain in AA coordinates: 159-220)
MGERDDGLGLSLSLGNSQQKEPSLRLNLMPLTTSSSSSSFQHMHNQNNNSHPQKIHNISW
THLFQSSGIKRTTAERNSDAGSFLRGFNVNRAQSSVAVVDLEEEAAVVSSPNSAVSSLSG

NKRDLAVARGGDENEAERASCSRGGGSGGSDDEDGGNGDGSRKKLRLSKDQALVLEETFK 

EHSTLNPKQKLALAKQLNLRARQVEVWFQNRRARTKLKQTEVDCEYLKRCCDNLTEENRR 

LQKEVSELRALKLSPHLYMHMTPPTTLTMCPSCERVSSSAATVTAAPSTTTTPTWGRPS 
PQRLTPWTAISLQQKSGR* 

>G431 (1..1149)

ATGGAGAGTGGTTCCAACAGCACTTCTTGTCCAATGGCTTTTGCCGGGGATAATAGTGAT 

GGTCCGATGTGTCCTATGATGATGATGATGCCGCCCATCATGACATCACATCAACATCAT 

GGTCATGATCATCAACATCAACAACAAGAACATGATGGTTATGCATATCAGTCACACCAC 

CAACAAAGTAGTTCCCTTTTTCTTCAATCACTAGCTCCTCCCCAAGGAACTAAGAACAAA 

GTTGCTTCTTCTTCTTCTCCTTCCTCTTGTGCTCCTGCCTATTCTCTAATGGAGATCCAT 

CATAACGAAATCGTTGCAGGAGGAATCAACCCTTGCTCCTCTTTCTCTTCTTCAGCCTCT 

GTCAAGGCCAAGATCATGGCTCATCCTCACTACCACCGCCTCTTGGCCGCTTATGTCAAT 

TGTCAGAAGGTTGGAGCACCACCGGAGGTTGTGGCGAGGCTGGAGGAGGCATGCTCGTCT 

GCCGCAGCCGCAGCCGCATCTATGGGGCCAACAGGGTGTCTTGGTGAAGATCCAGGGCTT 

GATCAATTCATGGAAGCTTACTGTGAAATGCTCGTTAAGTATGAGCAAGAGCTCTCCAAA 

CCTTTCAAGGAAGCTATGGTCTTCCTTCAACGTGTCGAGTGTCAATTCAAATCCCTCTCT 

CTATCCTCACCTTCCTCTTTCTCCGGTTATGGAGAGACAGCAATTGATAGGAACAATAAT 

GGGTCATCCGAGGAAGAAGTCGATATGAACAATGAATTTGTAGATCCACAAGCTGAGGAT 

AGAGAGCTTAAAGGACAGCTCTTGCGCAAGTACAGTGGTTACTTAGGGAGCCTCAAGCAA 

GAGTTCATGAAGAAGAGGAAGAAAGGAAAGCTCCCTAAAGAAGCTCGTCAACAACTGCTT 

GATTGGTGGAGCCGTCACTACAAATGGCCTTACCCTTCGGAGCAACAAAAGCTCGCCCTT 

GCGGAATCAACGGGGCTGGACCAGAAACAGATAAACAATTGGTTCATAAACCAGAGGAAA 

CGGCATTGGAAGCCGTCGGAGGACATGCAGTTTGTAGTAATGGACGCAACACATCCTCAC 

CATTACTTCATGGATAATGTCTTGGACAATCCTTTCCCAATGGATCACATCTCCTCCACC 
ATGCTTTGA 

>G431 Amino Acid Sequence (domain in AA coordinates: 286-335) 

MESGSNSTSCPMAFAGDNSDGPMCPMMMMMPPIMTSHQHHGHDHQHQQQEHDGYAYQSHH 

QQSSSLFLQSLAPPQGTKNKVASSSSPSSCAPAYSLMEIHHNEIVAGGINPCSSFSSSAS 

VKAKIMAHPHYHRLLAAYVNCQKVGAPPEVVARLEEACSSAAAAAASMGPTGCLGEDPGL 

DQFMEAYCEMLVKYEQELSKPFKEAMVFLQRVECQFKSLSLSSPSSFSGYGETAIDRNNN 

GSSEEEVDMNNEFVDPQAEDRELKGQLLRKYSGYLGSLKQEFMKKRKKGKLPKEARQQLL

DWWSRHYKWPYPSEQQKLALAESTGLDQKQINNWFINQRKRHWKPSEDMQFVVMDATHPH 
HYFMDNVLDNPFPMDHISSTML* 

>G479 (1..1128) 

ATGGAGATGGGTTCCAACTCGGGTCCGGGTCATGGTCCGGGTCAGGCAGAGTCGGGTGGT 
TCCTCCACTGAGTCATCCTCTTTCAGTGGAGGGCTCATGTTTGGCCAGAAGATCTACTTC 
GAGGACGGTGGTGGTGGATCCGGGTCTTCTTCCTCAGGTGGTCGTTCAAACAGACGTGTC 
CGTGGAGGCGGGTCGGGTCAGTCGGGTCAGATACCAAGGTGCCAAGTGGAAGGTTGTGGG 
ATGGATCTAACCAATGCAAAAGGTTATTACTCGAGACACCGAGTTTGTGGAGTGCACTCT 
AAAACACCTAAAGTCACTGTGGCTGGTATCGAACAGAGGTTTTGTCAACAGTGCAGCAGG 
TTTCATCAGCTTCCGGAATTTGACCTAGAGAAAAGGAGTTGCCGCAGGAGACTCGCTGGT 
CATAATGAGCGACGAAGGAAGCCACAGCCTGCGTCTCTCTCTGTGTTAGCTTCTCGTTAC 
GGGAGGATCGCACCTTCGCTTTACGAAAATGGTGATGCTGGAATGAATGGAAGCTTTCTT 
GGGAACCAAGAGATAGGATGGCO^GTTCAAGAACATTGGATACAAGAGTGATGAGGCGG 
CCAGTGTCGTCACCGTCATGGCAGATCAATCCAATGAATGTATTTAGTCAAGGTTCAGTT 
GGTGGAGGAGGGACAAGCTTCTCATCTCCAGAGATTATGGACACTAAACTAGAGAGCTAC 
AAGGGAATTGGCGAGTCAAACTGTGCTCTCTCTCTTCTGTCAAATCCACATCAACCACAT 
GACAACAACAACAACAACAACAACAACAGCAACAACAACAACAATACATGGCGAGCTTCT 
TCAGGTTTTGGCCCGATGACGGTTACAATGGCTCAACCACCACCTGCACCTAGCCAGCAT 
CAGTATCTGAACCCGCCTTGGGTATTCAAGGACAATGATAATGATATGTCTCCTGTTTTG 
AATTTAGGTCGATACACCGAGCCAGATAATTGTCAGATAAGTAGTGGCACGGCAATGGGT 

ACAAGGGCTTATGACTCTTCTTCTCACCATACCAACTGGTCTCTCTGA 

>G479 Amino Acid Sequence (conserved domain in AA coordinates- 70 mq\ 
MEMGSNSGPGHGPGQAESGGSSTESSSFSGGLMFGQKIYFEDGGGGSGSSSSGGRSNRRV

RGGGSGQSGQIPRCQVEGCGMDLTNAKGYYSRHRVCGVHSKTPKVTVAGIEQRFCQQCSR
FHQLPEFDLEKRSCRRRLAGHNERRRKPQPASLSVLASRYGRIAPSLYENGDAGMNGSFL
GNQEIGWPSSRTLDTRVMRRPVSSPSWQINPMNVFSQGSVGGGGTSFSSPEIMDTKLESY
KGIGDSNCALSLLSNPHQPHDNNNNNNNNSNNNNNTWRASSGFGPMTVTMAQPPPAPSQH

QYLNPPWVFKDNDNDMSPVLNLGRYTEPDNCQISSGTAMGEFELSDHHHQSRRQYMEDEN 

TRAYDSSSHHTNWSL* 

>G546 (1..588) 

atgactcgaccgtcaagattacttgagacggcggcgccaccaccacaaccgtcggaggag 
atgatcgcagcggaatccgacatggtggtgatcttgtcggctcttctttgcgctcttatc 
tgcgttgctggtctcgccgccgtcgtacgatgcgcttggctccggcggtttacagccgga 
ggagattcgccgtcaccgaacaaaggcttgaaaaagaaagctcttcagtctcttccaaga 
tccactttcaccgccgcggaatcaacctccggcgccgccgctgaagagggagactcgacg 
gaatgtgctatttgcctcactgacttcgccgacggtgaagaaataagagtgcttcctctt 
tgtggtcattctttccacgtggagtgtattgacaaatggctagtttctaggtcttcttgt 
ccttcttgtcgcaggattcttacgccggtgagatgtgaccggtgtggtcatgcttctacg 
gcggagatgaaagatcaagctcatcgtcatcaacatcaccaacactcttctactaccatt 
cctacgtttcttccttaa 

>G546 Amino Acid Sequence (domain in AA coordinates: 114-155)
MTRPSRLLETAAPPPQPSEEMIAAESDMVVILSALLCALICVAGLAAVVRCAWLRRFTAG
GDSPSPNKGLKKKALQSLPRSTFTAAESTSGAAAEEGDSTECAICLTDFADGEEIRVLPL 
CGHSFHVECIDKWLVSRSSCPSCRRILTPVRCDRCGHASTAEMKDQAHRHQHHQHSSTTI 
PTFLP* 

>G551 (1..708) 

ATGGAGTGGTCAACAACGAGCAACGTAGAAAACGTGAGAGTAGCTTTCATGCCACCGCCA 
TGGCCGGAGTCTAGTTCCTTTAACTCGCTCCACAGCTTGAACTTTGATCCTTACGCAGGA 
AATTCATATACGCCTGGCGATACACAAACCGGACCGGTTATCTCTGTACCGGAATCAGAA 
AAGATCATGAATGCGTACCGATTTCCGAACAACAACAATGAGATGATAAAAAAGAAGAGA 
CTAACGAGTGGACAATTAGCTTCACTTGAGCGAAGTTTTCAAGAAGAGATCAAATTAGAT 
TCAGACAGGAAGGTGAAGCTGTCGAGAGAGCTCGGTCTGCAGCCACGTCAGATAGCAGTT 
TGGTTCCAAAACCGCCGTGCACGGTGGAAGGCGAAGCAGCTTGAGCAGTTGTACGACTCG 
CTTAGACAAGAGTACGACGTCGTTTCTAGGGAGAAACAAATGTTACACGATGAGGTGAAG 
AAGCTGAGAGCTTTACTAAGAGACCAGGGTTTGATCAAGAAGCAAATCTCTGCCGGGACC 
ATCAAAGTTTCCGGTGAGGAAGACACGGTGGAGATTTCATCGGTGGTGGTAGCTCATCCA 
AGAACGGAGAATATGAACGCAAATCAAATCACCGGAGGGAATCAAGTTTACGGTCAATAC 
AACAATCCGATGCTGGTTGCTTCCTCTGGCTGGCCGTCATACCCCTGA 

>G551 Amino Acid Sequence (conserved domain in AA coordinates: 73-133)

MEWSTTSNVENVRVAFMPPPWPESSSFNSLHSFNFDPYAGNSYTPGDTQTGPVISVPESE

KIMNAYRFPNNNNEMIKKKRLTSGQLASLERSFQEEIKLDSDRKVKLSRELGLQPRQIAV

WFQNRRARWKAKQLEQLYDSLRQEYDVVSREKQMLHDEVKKLRALLRDQGLIKKQISAGT

IKVSGEEDTVEISSVVVAHPRTENMNANQITGGNQVYGQYNNPMLVASSGWPSYP*

>G578 (1..978) 

ATGCATAGTTTGAATGAAACAGTAATTCCTGATGTTGATTACATGCAGTCTGATAGAGGG 

CATATGCATGCTGCTGCCTCTGATTCCAGTGATCGATCAAAGGATAAGTTGGATCAAAAG 

ACCCTTCGTAGGCTTGCTCAAAATCGTGAGGCAGCAAGAAAAAGCAGATTGAGGAAGAAG 

GCGTATGTTCAGCAGCTGGAAGATAGTCGATTAAAGCTGACTCAAGTTGAGCAGGAGCTG 

CAAAGAGCAAGACAGCAGGGAGTTTTCATCTCAAGTTCAGGAGACC^GCT(^TTCTACT 

GGTGGCAATGGTGGGGCTTTGGCATTTGATGCAGAACACTCACGATGGCTTGAAGAAAAG 

AACAGGCAAATGAACGAGCTGAGATCTGCCCTGAATGCTCATGCAGGTGATACTGAGCTC 

CGGATAATTGTGGATGGAGTGATGGCTCACTATGAGGAGCTTTTCAGGATTAAGAGCAAT 

GCATCTAAGAATGATGTCTTCCACTTGTTATCTGGAATGTGGAAAACACCAGCTGAGCGA 

TGTTTCTTGTGGCTTGGCGGGTTCCCGTCATCCGAACTTCTCAAGCTTCTTGCGAATCAG 

CTAGAGCCCATGACAGAACGACAGGTAATGGGCATCAATAGCTTGCAGCAGACGTCGCAG 

CAGGCZAGAAGATGCTTTATCTCAAGGGATGGAGAGTTTACAGCAATCCCTAGCTGATACT 

TTATCCAGTGGAACTCTTGGTTCCAGTTCATCGGATAATGTCGCGAGCTACATGGGTCAG 

ATGGCCATGGCAATGGGCAAGTTAGGCACCCTCGAAGGATTCATACGCCAGGCTGATAAC 

TTGAGGCTGCAAACACTACAACAGATGCTTCGAGTATTAACAA^ 

GCTCTTCTTGCTATACACGATTATTCATCT 

GCCCGGCCAAGAGAGTGA 
>G578 Amino Acid Sequence (domain in AA coordinates: 36-96)

MHSLNETVIPDVDYMQSDRGHMHAAASDSSDRSKDKLDQKTLRRLAQNREAARKSRLRKK 

AYVQQLEDSRLKLTQVEQELQRARQQGVFISSSGDQAHSTGGNGGALAFDAEHSRWLEEK 

NRQMNELRSALNAHAGDTELRIIVDGVMAHYEELFRIKSNASKNDVFHLLSGMWKTPAER 

CFLWLGGFPSSELLKLLANQLEPMTERQVMGINSLQQTSQQAEDALSQGMESLQQSLADT

LSSGTLGSSSSDNVASYMGQMAMAMGKLGTLEGFIRQADNLRLQTLQQMLRVLTTRQSAR 

ALLAIHDYSSRLRALSSLWLARPRE* 

>G596 (168.. 1121) 

TAATTTCTCTACTTCAGATTTTTTTCTCCTTAGATTAATTTAATTGAGTTATTGTACATC 
CCTCAAGCTAAGATTCTGGTTTTGTGAGTTGAGTGGATGAGAAGAGGAGAGATTAACTAA 
ATTAGGGTTTCAATTGTTTACTTTTTGTTTGCTTTTTATATCAAGTAATGGATCAGGTCT 
CTCGCTCTCTTCCTCCACCTTTTCTCTCAAGAGATCTCCATCTTCACCCACACCATCAAT 
TCCAGCATCAGCAGCAGCAGCAGCAACAGAATCACGGCCACGATATAGACCAGCACCGAA 
TCGGTGGGCTAAAACGTGACCGAGATGCTGATATCGATCCCAACGAGCACTCTTCAGCCG 
GAAAAGATCAAAGTACTCCTGGCTCCGGTGGAGAAAGCGGCGGCGGAGGAGGAGGAGATA 
ATCACATCACGAGAAGGCCACGTGGCAGACCAGCGGGATCTAAGAACAAACCAAAACCGC 
CAATCATCATCACTCGAGACAGCGCAAACGCTCTCAAATCTCATGTCATGGAAGTAGCAA 
ACGGATGTGACGTCATGGAAAGTGTCACCGTCTTCGCTCGCCGTCGCCAACGTGGCATCT 
GCGTTTTGAGCGGAAACGGCGCCGTTACCAACGTTACCATAAGACAACCAGCTTCAGTAC 
CTGGTGGTGGCTCATCTGTCGTTAACTTACACGGACGTTTCGAGATTCTTTCTCTCTCGG 
GATCATTCCTTCCTCCTCCGGCTCCACCAGCTGCGTCAGGTCTAACGATTTACTTAGCCG 
GTGGTCAGGGACAGGTTGTTGGAGGAAGCGTGGTTGGTCCACTCATGGCTTCAGGACCTG 
TAGTGATTATGGCAGCTTCGTTTGGAAACGCTGCGTATGAGAGACTGCCGTTGGAGGAAG 
ACGATCAAGAAGAGCAAACAGCTGGAGCGGTTGCTAATAATATCGATGGAAACGCAACAA 
TGGGTGGTGGAACGCAAACGCAAACTCAGACGCAGCAGCAACAGCAACAACAGTTGATGC 
AAGATCCGACGTCGTTTATACAAGGGTTGCCTCCGAATCTTATGAATTCTGTTCAATTGC 
CAGCTGAAGCTTATTGGGGAACTCCGAGACCATCTTTCTAAATCGCGAAGAAAAAACAAG 

TTCTTCTTCTTTGTTTTCTAAAGATAATTGTAGTCTTTGACGAAGATTCGTGGTACGTAT 
GAATCGAAGAGAATCGTTTTGGTCATGGGATTGCTCGATCTATTAGGTTTGAGAGGGGGT 
TTGTGTTTTGCGTTGACTAGCAGATTATAAAATTGTTGATTTTCGAGTTTTTATTTTCAT 
GTGTTGGTGATAAA 

>G596 Amino Acid Sequence (domain in AA coordinates: 89-96) 
MDQVSRSLPPPFLSRDLHLHPHHQFQHQQQQQQQNHGHDIDQHRIGGLKRDRDADIDPNE
HSSAGKDQSTPGSGGESGGGGGGDNHITRRPRGRPAGSKNKPKPPIIITRDSANALKSHV
MEVANGCDVMESVTVFARRRQRGICVLSGNGAVTNVTIRQPASVPGGGSSVVNLHGRFEI
LSLSGSFLPPPAPPAASGLTIYLAGGQGQVVGGSVVGPLMASGPVVIMAASFGNAAYERL
PLEEDDQEEQTAGAVANNIDGNATMGGGTQTQTQTQQQQQQQLMQDPTSFIQGLPPNLMN
SVQLPAEAYWGTPRPSF*
>G617 (59.. 1141) 

GAGATCAGGAGAATGTGATGAAGAGGAGATTCAAGCAAAGCAAGAAAGAGATCAAAATCA 

AAATCATCAAGTAAACTTAAACCACATGTTGCAACAACAACAGCCGAGTTCGGTATCATC 

TTCAAGGCAATGGACTTCAGCTTTTAGGAATCCAAGAATCGTTCGAGTCTCAAGAACATT 

CGGTGGCAAAGACAGACACAGCAAAGTATGTACAGTCCGTGGTCTTCGAGACCGGAGGAT 

AAGGTTGTCCGTACCTACAGCTATTCAACTCTACGACCTTCAAGATCGATTAGGGCTGAG 

TCAGCCAAGCAAAGTCATTGATTGGTTACTCGAAGCAGCAAAAGATGACGTAGACAAGCT 

ACCTCCTCTACAATTCCCACATGGATTTAACCAGATGTATCCAAATCTCATCTTCGGAAA 

CTCCGGGTTTGGAGAATCTCCATCTTCAACTACATCAACAACGTTTCCAGGAACCAATCT 

CGGGTTCTTGGAAAATTGGGATCTTGGTGGTTCTTCAAGAACAAGAGCAAGATTAACCGA 

TACAACTACGACCCAAAGAGAAAGTTTTGATCTTGATAAAGGAAAATGGATCZAAAAACGA 

CGAGAATAGTAATCAAGATCATCAAGGGTTTAACACCAATCATCAACAACAATl^ 

GACGAATCCGTACAACAACACTTCAGCTTATTAC 

AGACCAATCTGGTAATAACGTTACTGTCGCAATATCTAATGTTGCTGCTAATAATAACAA 
TAATCTC^TTTGCATCCTCCTTCCTCGTCTC 

TCCTACTCCTCCGGCT^TGAGCTCrCTATTCCCGACATACCCTTCGTTTCTTGGAGCTTC 
TCATCATCATCATGTCGTCGATGGAGCCGGTCATCTTCAGCTCTTTAGCTCGAATTCAAA 
TACCGCATCGCAGCAACACATGATGCCGGGTAATACGAGTTTGATTAGACCATTTCATCA
TTTGATGAGCTCGAATCATGATACGGATCATCATAGTAGCGATAATGAATCAGATTCTTG 
AATGATTTTATATATCTACACTATACATTGAAAATGTTATATGTATACGTATTCTTCTAT 
ATTTTGATATATATGCGTATTGTTGGATTGGTTTATGTATCT 

>G617 Amino Acid Sequence (domain in AA coordinates: 64-118) 
MRSGECDEEEIQAKQERDQNQNHQVNLNHMLQQQQPSSVSSSRQWTSAFRNPRIVRVSRT 
FGGKDRHSKVCTVRGLRDRRIRLSVPTAIQLYDLQDRLGLSQPSKVIDWLLEAAKDDVDK 
LPPLQFPHGFNQMYPNLIFGNSGFGESPSSTTSTTFPGTNLGFLENWDLGGSSRTRARLT
DTTTTQRESFDLDKGKWIKNDENSNQDHQGFNTNHQQQFPLTNPYNNTSAYYNLGHLQQS
LDQSGNNVTVAISNVAANNNNNLNLHPPSSSAGDGSQLFFGPTPPAMSSLFPTYPSFLGA
SHHHHVVDGAGHLQLFSSNSNTASQQHMMPGNTSLIRPFHHLMSSNHDTDHHSSDNESDS*

>G620 (40.. 666) 

GAATTGAACTTGGACCAGCACAGCAACAACCCAACCCCAATGACCAGCTCAGTCATAGTA 

GCCGGCGCCGGTGACAAGAACAATGGTATCGTGGTCCAGCAGCAACCACCATGTGTGGCT 

CGTGAGCAAGACCAATACATGCCAATCGCAAACGTCATAAGAATCATGCGTAAAACCTTA 

CCGTCTCACGCCAAAATCTCTGACGACGCCAAAGAAACGATTCAAGAATGTGTCTCCGAG 

TACATCAGCTTCGTGACCGGTGAAGCCAACGAGCGTTGCCAACGTGAGCAACGTAAGACC 

ATAACTGCTGAAGATATCCTTTGGGCTATGAGCAAGCTTGGGTTCGATAACTACGTGGAC 

CCCCTCACCGTGTTCATTAACCGGTACCGTGAGATAGAGACCGATCGTGGTTCTGCACTT 

AGAGGTGAGCCACCGTCGTTGAGACAAACCTATGGAGGAAATGGTATTGGGTTTCACGGC 

CCATCTCATGGCCTACCTCCTCCGGGTCCTTATGGTTATGGTATGTTGGACCAATCCATG 

GTTATGGGAGGTGGTCGGTACTACCAAAACGGGTCGTCGGGTCAAGATGAATCCAGTGTT 

GGTGGTGGCTCTTCGTCTTCCATTAACGGAATGCCGGCTTTTGACCATTATGGTCAGTAT 

AAGTGAAGAAGGAGTTATTCTTCATTTTTATATCTATTCAAAACATGTGTTTCGATAGAT 

ATTTTATTTTTATGTCTTATCAATAAC^TTTCTATATAATGTTGCTTCTTTAAGGAAAAG 

TGTTGTATGTCAATACTTTATGAGAAACTGATTTATATATGCAAAT 

>G620 Amino Acid Sequence (domain in AA coordinates: 20-118)

MTSSVIVAGAGDKNNGIVVQQQPPCVAREQDQYMPIANVIRIMRKTLPSHAKISDDAKET

IQECVSEYISFVTGEANERCQREQRKTITAEDILWAMSKLGFDNYVDPLTVFINRYREIE

TDRGSALRGEPPSLRQTYGGNGIGFHGPSHGLPPPGPYGYGMLDQSMVMGGGRYYQNGSS
GQDESSVGGGSSSSINGMPAFDHYGQYK*

>G625 (151.. 1137) 

AATCGACCATTCACAACGATGACATTCAAACACTCTTCAGTTTCCCTTCCTTCTTGATTC 

GTCCTCTCCACTATTTTTCTCAATTTCTTTAATCTCTCTCTTTCTCTCTCTACTTCCTCT 

TCCTCTTCTTCTTCTTCTTCTTCTTCATCTATGGACCCTTTAGCTTCCCAACATCAACAC 

AACCATCTGGAAGATAATAACCAAACCCTAACCCATAATAATCCTCAATCCGATTCCACC 

ACCGACTCATCAACTTCCTCCGCTCAACGCAAACGCAAAGGCAAAGGTGGTCCGGACAAC 

TCCAAGTTCCGTTACCGTGGCGTTCGACAAAGAAGCTGGGGCAAATGGGTCGCCGAGATC 

CGAGAGCCaCGTAAGCGCACTCGCAAGTGGCTTGGTACTTTCGCAACCGCCGAAGACGCC 

GCACGTGCCTACGACCGGGCTGCCGTTTACCTATACGGGTCACGTGCTCAGCTCAACTTA 

ACCCCTTCGTCTCCTTCCTCCGTCTCTTCCTCTTCCTCCTCCGTCTCCGCCGCTTCTTCT 

CCTTCCACCTCCTCTTCCTCCACTCAAACCCTAAGACCTCTCCTCCCTCGCCCCGCCGCC 

GCCACCGTAGGAGGAGGAGCCAACTTTGGTCCGTACGGTATCCCTTTTAACAACAACATC 

TTCCTTAATGGTGGGACCTCTATGTTATGCCCTAGTTATGGTTTTTTCCCTCAACAACAA 

CAACAACAAAATCAGATGGTCCAGATGGGACAATTCCAACACCAACAGTATCAGAATCTT 

CATTCTAATACTAACAATAACAAGATTTCTGACATCGAGCTCACTGATGTTCCGGTAACT 

AATTCGACTTCGTTTCATCATGAGGTGGCGTTAGGGCAGGAACAAGGAGGAAGTGGGTGT 

AATAATAATAGTTCGATGGAGGATTTGAACTCTCTAGCTGGTTCGGTGGGTTCGAGTCTA 

TCAATAACTCATCCACCGCCGTTGGTTGATCCGGTATGTTCTATGGGTCTGGATCCGGGT 

TATATGGTTGGAGATGGATCTTCGACCATTTGGCCTTTTGGAGGAGAAGAAGAATATAGT 

CATAATTGGGGGAGTATTTGGGATTTTATTGATCCCATCTTGGGGGAATTCTATTAATTT 

GTTTTTGTGGAAGATCATATTATATACGATGAGCATCCCTAAGGTCGGTCAAGAGCATTG 

GAGATTCATTGTTGAGAGGAATCAAAGAGATTGCATTCTATGAGGAGCTCTGCATGCAAA 

ATTTTGGAGGATTTTTTTACTACCTATAGAGATAAATAAGAGGGTAITTTTATTATTTTT 

TTGAAGATTTTTATTTTCAAGGAATTCGTAAAAGAGATTACGGTTCCAATAAAGTATGTA 

TATGTGGAAGAGAATCGGAGGAGATGGTGGAAAGTTGTATGGGAATTTTATTGGTTCAAC 
ACTTCCTTCACAGTGTGCCTACCTTAATATATAATTATTGATAGGATATGATAATTTCTG

>G625 Amino Acid Sequence (conserved domain in AA coordinates: 52-119)

MDPLASQHQHNHLEDNNQTLTHNNPQSDSTTDSSTSSAQRKRKGKGGPDNSKFRYRGVRQ

RSWGKWVAEIREPRKRTRKWLGTFATAEDAARAYDRAAVTLYGSRAQLNLTPSSPSSVSS

SSSSVSAASSPSTSSSSTQTLRPLLPRPAAATVGGGANFGPYGIPFNNNIFLNGGTSMLC

PSYGFFPQQQQQQNQMVQMGQFQHQQYQNLHSNTNNNKISDIELTDVPVTNSTSFHHEVA

LGQEQGGSGCNNNSSMEDLNSLAGSVGSSLSITHPPPLVDPVCSMGLDPGYMVGDGSSTI

WPFGGEEEYSHNWGSIWDFIDPILGEFY* 

>G658 (17.. 757) 

CCACGCGTCCGCTCACATGAACAAAGGAGCTTGGACTAAAGAAGAAGATCAGCTTCTTGT 

TGATTACATCCGTAAACACGGTGAAGGTTGCTGGCGATCTCTCCCTCGCGCCGCTGGATT 

ACAAAGATGTGGTAAGAGTTGTAGATTGAGATGGATGAATTATCTAAGACCAGATCTCAA 

AAGAGGCAATTTTACTGAAGAAGAAGATGAACTCATCATCAAGCTCCATAGCTTGCTCGG 

TAACAAATGGTCTTTAATAGCTGGGAGATTACCAGGAAGAACAGATAACGAGATCAAGAA 

CTATTGGAACACTCATATCAAGAGGAAGCTTCTCAGCCGTGGGATTGATCCAAACTCTCA 

CCGTCTGATCAACGAATCCGTCGTGTCTCCGTCGTCTCTTCAAAACGATGTCGTTGAGAC 

TATACATCTTGATTTCTCTGGACCGGTTAAACCGGAACCGGTGCGTGAAGAGATTGGTAT 

GGTTAATAATTGTGAGAGTAGTGGAACGACGTCGGAGAAGGATTATGGGAACGAGGAAGA 

TTGGGTGTTGAATTTGGAACTCTCTGTTGGACCGAGTTATCGGTACGAGTCGACTCGGAA 

AGTGAGTGTTGTTGACTCGGCTGAGTCGACTCGACGGTGGGGTTCCGAGTTGTTTGGAGC 

TCATGAGAGTGATGCGGTGTGTTTGTGTTGTCGGATTGGGTTGTTTCGTAATGAGTCGTG 

TCGGAATTGTCGGGTTTCTGATGTTAGAACTCATTAGAGAGTCAATCGAGAATTCTTTAG 

GAATCTTTTTATATATTTAGATCGTCAATTGTGTTTTTTTTTTGTTCAGATTTGTTATGT 

AACATCAAGTAAGAAACTAGCATAATTATTTGATGGCAAAGCCAAAAGATTGTGCTC 

>G658 Amino Acid Sequence (domain in AA coordinates: 2-105) 

MNKGAWTKEEDQLLVDYIRKHGEGCWRSLPRAAGLQRCGKSCRLRWMNYLRPDLKRGNFT
EEEDELIIKLHSLLGNKWSLIAGRLPGRTDNEIKNYWNTHIKRKLLSRGIDPNSHRLINE
SVVSPSSLQNDVVETIHLDFSGPVKPEPVREEIGMVNNCESSGTTSEKDYGNEEDWVLNL

ELSVGPSYRYESTRKVSVVDSAESTRRWGSELFGAHESDAVCLCCRIGLFRNESCRNCRV
SDVRTH*

>G716 (271..2079)

AAAAAAAAAGGGGAGAGATTTAGTTTTATCCNNCAGNGCCTGAANTACGTTCTGCAATCA 

ANACGGACATAACCGNCCGTTGTGTCCTGTTTATAAAGTTTTGCTTTTTTTATTTTCTCC 

ANTGATGGGTCTTTTCTTTCTTCTCTCTCTNGTGTTTCTTTCATGGGGTTAAGACTAGTG 

TTTACCGCGTGAAGGTTTTTTTTTCTTTCTATTTTCTTTCATTTCCTCTCCTTCTACTTC 

TTCTTCTCCAGTTCTCATCTGGGTTCTTCAATGGCGAGTGTTGAAGGTGATGATGATTTC 

GGAAGTTCTTCGTCAAGGTCTTATCAAGATCAACTATACACAGAGCTATGGAAAGTTTGT 

GCAGGTCCATTAGTGGAAGTTCCTCGTGCTCAAGAGAGAGTTTTCTACTTCCCTCAGGGT 

CACATGGAACAACTTGTGGCGTCAACTAATCAAGGAATCAATTCAGAAGAAATACCTGTT 

TTTGATCTTCCTCCAAAGATACTTTGTCGAGTTCTTGATGTCACTTTAAAGGCGGAGCAT 

GAAACAGATGAGGTTTACGCTCAGATCACATTACAACCAGAGGAAGATCAAAGTGAACCA 

ACAAGTCTTGATCCACCTATTGTTGGACCAACTAAGCAAGAGTTTCATTCGTTTGTTAAG 

ATTTTAACGGCTTCAGATACAAGCACTCATGGTGGATTCTCTGTTCTTCGTAAACACGCC 

ACTGAATGCTTGCCTTCTTTGGATATGACACAAGCTACTCCTACTCAAGAACTTGTGACT 

AGAGATCTTCATGGCTTTGAATGGAGGTTTAAGCATATATTCAGAGGACAACCACGGAGG 

CATTTGCTTACTACGGGTTGGAGTACATTTGTATCCTCGAAAAGACTTGTAGCTGGAGAT 

GCTTTTGTGTTCTTGAGGGGTGAGAATGGGGATTTACGGGTTGGAGTGAGACGATTAGCT 

CGGCATCAAAGCACAATGCCTACTTCGGTTATTTCAAGTCAGAGCATGCATTTGGGAGTT 

CTTGCTACAGCTTCTCATGCTGTGCGTACAACAACAATCTTTGTTGTCTTTTACAAGCCT 

AGGATAAGCCAATTCATAGTTGGGGTGAACAAGTATATGGAAGCTATAAAGCATGGATTT 

TCTCTCGGTACCCGATTCAGAATGAGGTTTGAAGGAGAAGAGTCTCCTGAGAGAATATTT 

ACTGGTACGATTGTGGGAAGTGGAGATCTATCTTCACAATGGCCAGCTTCTAAATGGAGG 

TCATTGCAGGTACAATGGGATGAGCCAACAACAGTTCAGAGACCAGATAAAGTCTCACCA 

TGGGAGATAGAGCCTTTCTTGGCAACTTCCCCAATTTCAACTCCTGCTCAACAACCACAA 

TCGAAATGCAAGCGGTCAAGACCCATCGAGCCATCAGTTAAAACACCAGCCCCACCTAGT 

TTCTTGTACAGCCTCCCTCAGAGCCAAGATTCCATTAATGCATCCCTTAAACTGTTTCAA 

GATCC^T(^CTTGAGAGAATTTCAGGTGGATACTCCTCAAAGAACAGCTTCAAACCCGAG 
ACTCCTCCTCCTCCAACGAATTGTAGCTATAGGTTGTTTGGATTTGATCTCACAAGCAAT

TCTCCTGCTCCAATCCCTCAAGACAAGCAACCGATGGATACTTGTGGAGCTGCCAAGTGT 

CAAGAACCCATCACTCCAACCTCAATGAGTGAGCAGAAGAAGCAACAAACATCAAGAAGT 

CGAACTAAAGTGCAAATGCAAGGCATTGCGGTTGGTCGTGCGGTTGATTTAACACTGTTG 

AAATCTTACGATGAACTGATTGATGAGCTTGAGGAGATGTTTGAGATTCAAGGACAGCTT 

CTTGCCCGAGACAAATGGATCGTTGTCTTCACTGATGATGAAGGAGATATGATGCTTGCT 

GGTGATGATCCGTGGAATGAGTTTTGCAAGATGGCAAAGAAGATATTTATATATTCGAGC 

GATGAGGTTAAGAAAATGACAACGAAACTGAAGATTTCTTCGTCGTTAGAGAATGAGGAA 

TATGGTAATGAATCATTCGAAAATCGTAGTAGGGGGTGAGAGTTTTAGCTGTTAATTAAG 

GTTAATTCGGCGACGTCGTTTTAGTGCGTAAGTGTCTAAAGACTTTTTTTTTAGTCTGTG 

TATATAAAGTCTTGTCCTCTTTTTCATGTCAATTITTC^AAGTTGGCGATTTAATATTTCG 

GTTTTGGGACAGTGGTTGATGGGGCGGTTTTACATTTTTTATGTGTATGTACTTGTTCCA 
AAACCATTCAATTTTCAAA 11 A 

>G716 Amino Acid Sequence (domain in AA coordinates: 24-355)
MASVEGDDDFGSSSSRSYQDQLYTELWKVCAGPLVEVPRAQERVFYFPQGHMEQLVASTN 

qginseeipvfdlppkilcrvldvtlkaehetdevyaqitlqpeedqsepSS 

5222 PRraLLTTGWSTFVSS ^ VAGD ^VFLRGENGDLRVGVRRI^QSTMPTSV 

Irlll^ RIFTGTIVGSGDLSS Q W ^SKWRSLQVQWDEPriVQRPDKVSP W EIEPFLATS 

vScSf QSKC ^ SRPIEPS ^ 
S™f CTSTPPPP ™ CSY ^ FGFDLTSNSP ^^^ 

eq^qqtsrsrtkvqmqgiavgravdltllksydelideleemfeiqgqlL^kwiwf 

TOJEGDMMIAGDDPV^FCKMAKKIFIYSSDEVKKMTT^ 
>G725 (45.. 1122) 

CCTCTTTCAGAGAGAGAAAGAGAGTCAGAGAGAGAGAGAGAGAGAATGTTCCATGCTAAG 

AAACCTTCAAGTATGAATGGTTCATATGAGAACAGAGCTATGTGCGTTCAAGGCGAT^ 

GGCCTTGTCCTCACCACCGACCCTAAACCGCGTTTGCGTTGGACCGTCGAACTCCACGAG 

CGTTTTGTGGACGCCGTCGCTCAGCTCGGCGGCCCCGACAAAGCGA^ 

ATGAGAGTTATGGGTGTGAAGGGTCTTACTCnTTACCACCTAAAGAS 

^ CAGGC .™^ 

AGAGCTTCTGCCATGGATATTCAGCGCAACGTAGCrrTCTTCTTCTGGCATGATGAGTCGC 
^™^f ATGCAAATGGAAGTGCAGAG ^ 

AGACATCTGCAACTGAGGATTGAAGCACAAGGAAAGTACATGCAATCTATCTTGGAGAGA 
GCTTGCCAAACCCTAGCCGGTGAGAAGATGGCAGCCGCCACCGCAGCAGCCGCCGTC^ 

ggaggatac^gggtaatctgggaagttcgagtctttcagcagcgS 

CATCCTCTTAGTTTCCCGCCGTTTCAAGACCTAAACATCTATGGAAAC^SS 
GTCCTCGACCATCA^CTTC^TCATCAAAACATAGAGAACCATTTCA^ 

gctgciagacaccaacatttacttgggc^gaagcgacctaatcctaattttgI^ 

STAAGGAAAGGACTATTGATGTGGTCTGATCAAGA^ 

atcgatgatgagcatagaattcagatacagatggctacacatg T ctccacg^™gat 
tcjttgtc^gagatctacgaaaggaaatcaggittat^ggtgaS^ 

aatggtggattaatac^ggaagaaactcgccatctgggtgatacaatSa^ 

>G725 Amino Acid Sequence (domain in AA coordinates: 39-87)
MFHAKKPSSMNGSYENRAMCVQGDSGLVLTTDPKPRLRWTVELHERFVDAVAQLGGPD 

tpmvmgvkgltlyhlkshlqkfr^^ 

gmmsrnmnemqmwqrrlheqlevqrhlqlrieaqgkymqsileracq^geWa?! 

aaavgggykgnlgssslsaavgppphplsfppfqdi^iygnttiSSfISS 
ftcnnaadtoiylgkkrpnpnfgndvrkg^ 

cttcttctccttctctgatcgttcgttttctggacgagagagatggtaaatccgggtcac 
ggaagaggacccgattcgggtactgctgctggtgggtcaaactccgac^gtttcctgcg 
AATCTTCGAGTTCTTGTCGTTGATGATGATCCAACTTGTCTCATGATCTTAGAGAGGATG
CTTATGACTTGTCTCTACAGAGAGCAGAGAGCGCATTGTCTCTGCTTCGGAAGAACAAAG 
AATGGTTTTGATATTGTCATTAGTGATGTTCATATGCCTGACATGGATGGTTTCAAGCTC 
CTTGAACACGTTGGTTTAGAGATGGATTTACCTGTTATCAATCTGAATGTTTTGAAACCT 
TTGGTTATAGTGATGTCTGCGGATGATTCGAAGAGCGTTGTGTTGAAAGGAGTGACTCAC 
GGTGCAGTTGATTACCTCATCAAACCGGTACGTATTGAGGCTTTGAAGAATATATGGCAA 
CATGTGGTGCGGAAGAAGCGTAACGAGTGGAATGTTTCTGAACATTCTGGAGGAAGTATT 
GAAGATACTGGCGGTGACAGGGACAGGCAGCAGCAGCATAGGGAGGATGCTGATAACAAC 
TCGTCTTCAGTTAATGAAGGGAACGGGAGGAGCTCGAGGAAGCGGAAGGAAGAGGAAGTA 
GATGATCAAGGGGATGATAAGGAAGACTCATCGAGTTTAAAGAAACCACGCGTGGTTTGG 
TCTGTTGAATTGCATCAGCAGTTTGTTGCTGCTGTGAATCAGCTAGGCGTTGACAGTGAG 
TTAAAAACTTGCTTGCTTATGCATTTGTGTGTGTCGATTGGTAACATTGTGGAATTCCAG 
AAGTATCGGATATATCTGAGAGGGCTTGGAGGAGTATCGCAACACCAAGGAAATATGAAC 
CATTCGTTTATGACTGGTCAAGATCAGAGTTTTGGACCTCTTTCTTCGTTGAATGGATTT 
GATCTTCAATCTTTAGCTGTTACTGGTCAGCTCCCTCCTCAGAGCCTTGCACAGCTTCAA 
GCAGCTGGTCTTGGCCGGCCTACACTCGCTAAACCAGGGATGTCGGTTTCTCCCCTTGTA 
GATCAGAGAAGCATCTTCAACTTTGAAAACCCAAAAATAAGATTTGGAGACGGACATGGT 
CAGACGATGAACAATGGAAATTTGCTTCATGGTGTCCCAACGGGTAGTCACATGCGTCTG 
CGTCCTGGACAGAATGTTCAGAGCAGCGGAATGATGTTGCCAGTAGCAGACCAGCTACCT 
CGAGGAGGACCATCGATGCTACCATCCCTCGGGCAACAGCCGATATTGTCAAGCAGCGTT 
TCAAGAAGAAGCGATCTCACTGGTGCGCTGGCGGTTAGAAACAGTATCCCCGAGACCAAC 
AGCAGAGTGTTACCAACTACTCACTCGGTCTTCAATAACTTCCCCGCGGATCTACCTCGC 
AGCAGCTTCCCGTTGGCAAGTGCCCCAGGGATTTCAGTTCCAGTATCAGTTTCTTACCAA 
GAAGAGGTCAACAGCTCGGATGCAAAAGGAGGTTCATCAGCTGCTACTGCTGGATTTGGT 
AACCCAAGCTACGACATATTTAACGATTTTCCGCAGCACCAACAGCACAACAAGAACATC 
AGCAATAAACTAAACGATTGGGATCTGCGGAATATGGGATTGGTCTTCAGTTCCAATCAG 
GACGCAGCAACTGCAACCGCAACCGCAGCATTTTCCACTTCGGAAGCATACTCTTCGTCT 
TCTACGCAGAGAAAAAGACGGGAAACGGACGCAACAGTTGTGGGTGAGCATGGGCAGAAC 
CTGCAGTCACCGAGCCGGAATCTGTATCATCTGAACCACGTTTTTATGGACGGTGGTTCA 
GTCAGAGTGAAGTCAGAAAGAGTGGCGGAGACAGTGACTTGTCCTCCAGCAAATACATTG 
TTTCACGAGC^GTATAATCAAGAAGATCTGATGAGCGCATTTCTCAAACAGGTTTGATTA 
TTACTCGAATACAGTGCACTCTAAAAC 

>G727 Amino Acid Sequence (domain in AA coordinates: 226-269) 
MVNPGHGRGPDSGTAAGGSNSDPFPANLRVLWDDDPTCLMILERMLMTCLYREQRAHCL 
CFGRTKNGFDIVISDVHMPDWGFKLLEHVGLEMDLPVINLNVLKPLVIVMSADDSKS^ 
LKGVTHGAVDYLIKPVRIEALKNIWQHVVRKKRNEWNVSEHSGGSIEDTGGDRDRQQQHR
EDADNNSSSVNEGNGRSSRKRKEEEVDDQGDDKEDSSSLKKPRWWSVELHQQFVAAVNQ
LGVDSELKTCLLMHLCVSIGNIVEFQKYRIYLRRLGGVSQHQGNMNHSFMTGQDQSFGPL
SSLNGFDLQSLAVTGQLPPQSLAQLQAAGLGRPTLAKPGMSVSPLVDQRSIFNFENPKIR
FGDGHGQTMNNGNLLHGVPTGSHMRLRPGQNVQSSGMMLPVADQLPRGGPSMLPSLGQQP
ILSSSVSRRSDLTGALAVRNSIPETNSRVLPTTHSVFNNFPADLPRSSFPLASAPGISVP
VSVSYQEEVNSSDAKGGSSAATAGFGNPSYDIFNDFPQHQQHNKNISNKLNDWDLRNMGL
VTSSNQDAATATATAAFSTSEAYSSSSTQRKRRETDATW^ 
FMDGGSVRVKSERVAETVTCPPANTLFHEQYNQEDLMSAFLKQV* 
>G740 (25.. 924) 

CTTCTTCAACTTTTTTTXTTAACGATGGCTTCAGAGGATCAATCGGCGGCGAGATCTACC 
GGGAAGGTGAACTGGTTCAACGCTTCTAAAGGCTATGGTTTCATTACTCCTGACGATGGC 
AGCGTAGAGCTTTTCGTTCATCAATCTTCAATTGTCTCCGAAGGTTACCGGAGTTTAACC 
GTCGGCGATGCGGTTGAGTTCGCTATTACTCAGGGAAGCGACGGTAAGACTAAAGCCGTC 
AATGTTACTGCTCCTGGTGGTGGTTCTCTCAAGAAGGAGAATAACTCTCGTGGTAACGGT 
GCTAGGCGCGGCGGCGGTGGAAGCGGTTGCTAC^ATTGCGGTGAGTTAGGTCATATCTCT 
AAAGATTGTGGTATTGGTGGCGGCGGCGGAGGTGGTGAACGTAGATCTAGAGGAGGAGAA 
GGTTGTTACAATTGTGGTGATACTGGTCACTTCGCTAGGGATTGTACTTCAGCTGGAAAC 
GGTGACCAACGTGGAGCCACCAAAGGTGGAAACGATGGTTGCTACACTTGCGGTGATGTT 
GGTCACGTGGCTAGGGATTGTACTCAGAAATCAGTTGGAAACGGAGACCAACGTGGAGCG 
GTCAAAGGTGGAAACGATGGTTGCTACACTTGTGGTGATGTTGGTCACTTTGCTAGGGAT 
TGTACTCAGAAGGTTGCTGCCGGAAACGTCAGAAGCGGTGGTGGTGGTAGTGGAACTTGT 
TATTCATGCGGTGGAGTTGGTCACATTGCAAGAGATTGTGCGACTAAGAGACAGCCTTCT
CGTGGGTGTTACCAGTGTGGTGGTTCTGGTCACTTGGCTCGTGATTGTGACCAGAGAGGA 
AGCGGTGGAGGAGGTAATGATAATGCGTGCTACAAGTGTGGTAAGGAAGGTCACTTTGCA 
AGGGAATGTTCTTCTGTAGCTTAATCGATTTCCTAATCAACAAAACAAAAAAACAAGAAT 
GAAATTGAATCGAGTTATATAGTTTGGTATATATTACTCTTCGTTTTCATTTATCTTTTT 
TTTTGTTGTTGATGGGAATGAAATTGCCTGGTCCTTTTGGTGTGTTTTTGAGCTTTTATT 
ATTATACAGAGTGATCCCTTTTTTGTTATAACTATTACAAGTTTTTAGCTTTATTTGATA 
TGGATGCTCTCTCCTTTTCTTCTATCTGTTTCTGGAAATTTTGACCTCATCATATTACTT 
ATGTCATCCAAA 

>G740 Amino Acid Sequence (domain in AA coordinates: 24-42, 232-268) 

MASEDQSAARSTGKVNWFNASKGYGFITPDDGSVELFVHQSSIVSKGYRSLTVGDAVEFA 

ITQGSDGKTKAVNVTAPGGGSLKKENNSRGNGARRGGGGSGCYNCGELGHISKDCGIGGG 

GGGGERRSRGGEGCYNCGDTGHFARDCTSAGNGDQRGATKGGNDGCYTCGDVGHVARDCT 

QKSVGNGDQRGAVKGGNDGCYTCGDVGHFARDCTQKVAAGNVRSGGGGSGTCYSCGGVGH 

IARDCATKRQPSRGCYQCGGSGHLARDCDQRGSGGGGNDNACYKCGKEGHFARECSSVA* 

>G770 (119.. 1069) 

CCTTCCTCTATATAAGGAAGTTCATTTCATTTGGAGAGGACACGCTGACAAGCTGACTCT 
AGCAGATCTGGTACCGTCGACGGTTCTTGGATTTGGAGTAAACTAAAGATCATATAAAAT 
GGAACAAGGAGATCATCAGCAGCATAAGAAAGAAGAAGAAGCTTTGCCACCGGGTTTCAG 
ATTTCATCCGACGGATGAGGAGCTAATCTCATATTACTTGGTTAATAAGATTGCCGATCA 
AAACTTCACCGGGAAAGCAATCGCTGACGTTGATCTTAACAAGTCCGAGCCATGGGAGCT 
TCCTGAGAAGGCGAAAATGGGAGGAAAAGAATGGTACTTTTTTAGCCTCCGGGACCGGAA 
GTACCCGACGGGAGTGAGGACGAATAGGGCGACGAATACAGGATATTGGAAAACCACAGG 
AAAAGACAAAGAGATATTCAATAGCACAACCTCGGAGTTGGTCGGGATGAAGAAGACTTT 
GGTCTTTTACAGAGGACGAGCTCCTCGTGGGGAGAAGACTTGTTGGGTCATGCATGAGTA 
TCGACTTCACTCCAAGTCCTCATATAGAACCTCCAAGCAAGACGAGTGGGTAGTGTGTAG 
AGTGTTCAAGAAAACAGAAGCAACCAAGAAATACATAAGCACCAGTAGCAGCAGCACAAG 
TCATCACCACAACAACCACACAAGAGCCTCAATACTATCAACCAACAACAATAATCCTAA 
TTACTCATCAGACCTCCTTCAACTCCCACCGCATCTACAACCACACCCGAGCCTCAATAT
TAACCAATCCCTCATGGCAAACGCCGTTCACCTAGCTGAGCTCTCAAGAGTCTTCCGTGC 
CTCTACAAGCACCACCATGGACTCTTCTCATCAGCAGCTAATGAACTACACCCACATGCC 
TGTCTCAGGGCTCAACCTCAACCTTGGCGGTGCACTGGTCCAGCCGCCTCCTGTTGTGTC 
TCTTGAGGATGTTGCCGCGGTTAGTGCTTCGTACAATGGCGAAAACGGGTTTGGAAATGT 
GGAGATGAGCCAGTGCATGGACTTGGATGGATACTGGCCATCTTATTGATTGGTAATTGT 
CAGTTTAAGTTATGGTTTTTATATTGTTTCCATTTACTTGTTGGTAAAACGATTTTGGTT 
GTTCTTGCGAACGCTCTAGACAGGCCTCGTACCGGATCCTCTAGCTAGAGCTTTCGTTCG 
TATCATCGGTTTC 

>G770 Amino Acid Sequence (domain in AA coordinates: 19-162) 
MEQGDHQQHKKEEEALPPGFRFHPTDEELISYYLVNKIADQNFTGKAIADVDLNKSEPWE
LPEKAKMGGKEWYFFSLRDRKYPTGWTNRA 

LVFYRGRAPRGEKTCWVMHEYRLHSKSSYRTSKQDEWVVCRVFKKTEATKKYISTSSSST

SHHHNNHTRASILSTNNNNPNYSSDL^ 

ASTSTTITOSSHQQLMNYTHMPVSGL^ 

VEMSQCMDLDGYWPSY* 

>G858 (99.. 869) 

CATAATCTCnTCTCTCTATATCTCTTCTCTTCTTCTTTTACCCTGTTTTTTTTTTCATTC 

CACAGAGCCCAGGTTGATTGATTTTGTTATTCAGAGATATGGGGAGAGGAAGGATTGAGA 

TTAAGAAGATTGAGAATATCAACAGTCGTCAAGTCACTTTCTCTAAGAGACGAAACGGTT 

TGATCAAGAAGGCTAAAGAGCTTTCGATTCTCTGTGACGCCGAGGTTGCTCTTATCATCT 

TCTCCAGCACCGGCAAGATTTACGATTTCTCCAGCGTCTGTATGGAGCAAATTCTTTCTA 

GATATGGATACACTACTGCGTC(^CTGAGCATAAACAACAAAGAGAACACCAACTTOT 

TTTGTGCTTCACATGGAAATGAAGCTGTGTTGCGAAATGATGATTCTATGAAGGGGGAAC 

TTGAAAGATTACAGCTTGCAATTGAGAGACTTAAGGGTAAGGAGCTTGAAGGTATGAGTT 

TCCCGGATCTTATTTCTCTTGAAAACCAGTTGAACGAGAGC 

AAAAGACACAAATCCTGCTCAACCAGATTGAGAGATCCAGGATACAGGAGAAAAAAGCAT 
TGGAAGT^AAACCAAATCTTGCGCAAACAGGTTGAGATGTTGGGGAGAGGTTCAGGACCAA 
AAGTGTTGAATGAAAGGCCTCAAGATTCTAGCCCAGAAGCCGATCCCGAGAGCTCTTCAT 
CAGAAGAGGATGAGAATGACAACGAGGAGCACCATTCCGACACTTCCTTGCAGTTGGGGT

TGTCGTCGACGGGGTATTGCACAAAGAGAAAGAAGCCGAAGATCGAACTGGTCTGCGATA 

ACTCTGGGAGTCAAGTGGCTTCTGATTGATGGAATCGATTATTTTTCTAATTCTGGTTGT 

TTAGGGGTCTCTATGTGTCTTCTTGTTTCTGGCTGTTCTTTTGCTTTATTTCATCTCAAG 

TAGAGTTTTCTTAATGTTTAGGTGGAACATTTTTCCATAATCAAGAAGGGATTTGATCAA 

TCAATAACATTAGATTTTCTTAGTTAAAGACTTAAAGTTGCCCACACACCACACCATATG 

TGATTATGATGAATTTACATTTTATAAAAAAAAAAAAAAAAAAAAAAAAA 

>G858 Amino Acid Sequence (domain in AA coordinates: 2-57) 

MGRGRIEIKKIENINSRQVTFSKRRNGLIKKAKELSILCDAEVALIIFSSTGKIYDFSSV

CMEQILSRYGYTTASTEHKQQREHQLLICASHGNEAVLRNDDSMKGELERLQLAIERLKG 

KELEGMSFPDLISLENQLNESLHSVKDQKTQILLNQIERSRIQEKKALEENQILRKQVEM 

LGRGSGPKVLNERPQDSSPEADPESSSSEEDE1TONEEHHSDTSLQLGLSSTGYCTICRKKP 

KIELVCDNSGSQVASD* 

>G865 (282.. 920) 

ATCCCCACTTGTTGTTCATCACCAAGCCAAGCTCCATGTCCTAGTCACTCCACAGATTCC 

CTATCATCATCAATTCGTTTCAAACTTAGTTCCTTTCAAAGTCTTGTACATATATACACA 

CACACCTATTATTCTCTTGGTGTGTTTGTGTGTTACATATACGTGTGAGTACATACTTTG 

TTGTAAAAGTGGATCGGAGGTATGGAAAGGGACCGGTTCCACCGGAAACATCGGCGGCGG 

CGGATGATAATTCGTCTTGGAACGAGACTGATGTCACCGCCATGGTCTCCGCTCTCAGCC 

GTGTCATAGAGAATCCGACAGACCCGCCGGTCAAACAAGAGCTTGATAAATCGGATCAAC 

ATCAACCAGACCAAGATCAACCAAGAAGAAGACACTATAGAGGCGTAAGGCAGAGACCAT 

GGGGTAAATGGGCGGCAGAAATCCGCGATCCAAAGAAAGCAGCCCGTGTCTGGCTCGGGA 

CTTTCGAGACGGCAGAGGAAGCTGCTTTAGCCTATGACCGAGCTGCCCTCAAATTCAAAG 

GCACCAAGGCTAAACTGAACTTCCCTGAACGGGTCCAAGGCCCTACTACCACCACAACCA 

TTTCTCATGCACCAAGAGGAGTTAGTGAATCCATGAACTCACCTCCTCCTCGACCTGGTC 

CACCTTCAACTACTACTACTTCGTGGCCAATGACTTATAACCAGGACATACTTCAATACG 

CTCAGTTGCTTACGAGTAACAATGAGGTTGATTTATCATACTACACGTCGACTCTCTTCA 

GTCAACCTTTTTCAACGCCTTCTTCATCTTCTTCTTCCTCCCAACAGACGCAGCAACAGC 

AGCTACAACAAC^CAACAGCAGCGTGAAGAAGAAGAGAAGAATTATGGTTACAATTATT 

ATAACTACCCAAGAGAATAATCTAATTATTATTGTTGGTCGAATCAGTTTTATAAATAGC 

TATCATAGTTTCATTTTTGGTTTCCGTAACCTTTGTTGCATGGAAAATATGAATGAACGA 

GGGACATGTGTAACAATTTGTTTGTGTTTCGTAAATGTTAGTTGTATTTGGATTTGCTGA 

AGTTTGATTTTCTGAGCATAAATCATTTGACGGTCAAAAAAAAAAA 

>G865 Amino Acid Sequence (domain in AA coordinates: 36-103) 

MVSALSRVIENPTDPPVKQELDKSDQHQPDQDQPRRRHYRGVRQRPWGKWAAEIRDPKKA 

ARVWLGTFETAEEAALAYDRAALKFKGTKAKLNFPERVQGPTTTTTISHAPRGVSESMNS

PPPRPGPPSTTTTSWPMTYNQDILQYAQLLTSNNEVDLSYYTSTLFSQPFSTPSSSSSSS 

QQTQQQQLQQQQQQREEEEKNYGYNYYNYPRE* 

>G872 (59.. 646) 

CCGGAAACAGAATCC^TTC^CC^AACCGAATCGAACCGAACCGGAGTTTTTATCCAAT 
GGTGAAGCAAGCGATGAAGGAAGAGGAGAAGAAGAGAAACACGGCGATGCAGTCAAAGTA 
CAAAGGAGTGAGGAAGAGGAAATGGGGAAAATGGGTATCGGAGATCAGACTTCCACACAG 
CAGAGAACGAATTTGGTTAGGCTCTTACGACACTCCCGAGAAGGCGGCGCGTGCTTTCGA 
CGCCGCTC^TTTTGTCTCCGCGGCGGCGATGCTAATT^ 

GTCGATCTCCGTAGAAAAGTCGTTGACGCCTCCGGAGATTCAGGAAGCTGCTGCTAGATT 

CGCTAACACATTCCAAGACATTGTCAAGGGAGAAGAAGAATCGGGTTTAGTACCCGGATC 

CGAGATCCGACCAGAGTCTCCTTCTACATCTGCATCTGTTGCTACATCGACGGTGGATTA 

TGATTTTTCGTTTTTGGATTTGCTTCCGATGAATTTCGGGTTTGATTCCTTCTCCGACGA 

CTTCTCTGGCTTCTCCGGTGGTGATCGATTTACAGAGATTTTACCCATCGAAGATTACGG 

AGGAGAGAGTTTATTAGATGAATCTTTGATTCTTTGGGATTTTTGAATTCCCAAACATAA 

TATTTTTTTAGAGCGAACTGTGAGATTTTCCTTGGAGTCATGGAGAAATCTGGAGATTTT 

TTGTAACACGGAGCTCCAATGACCCGGGAATTTCTTTCGTTTCGGATCCGAATTTGATGT 

GGATCATATTCACACCTATATTTTTTCATTTTTTTGTTGTAAAG 

TCTAGTAATAAATGTTAAAAGTC(^TTTCATTAAAAAAAAAAAAAAAAAAA 

>G872 Amino Acid Sequence (domain in AA coordinates: 18-85) 

MVKQAMKEEEKKRNTAMQSKYKGVRKRKWGKWVS 

DAAQFCLRGGDANFNFPNNPPSISVEKSLTPPEIQEAAARFANTFQDIVKGEEESGLVPG 
Z



>G910 (1..1071) 

====== 




CAGAGGAGTTCGTCAGAGGAATTCTGGTAAATGGGTTTGTGAAGTTAGAGAGCCTAATAA 
GAAATCTAGGATTTGGTTAGGTACTTTTCCGACGGTTGAAATGGCTGCTCGTGCTCATGA 
TGTTGCTGCTTTAGCTCTTCGTGGTCGCTCTGCTTGTCTCAATTTCGCTGATTCTGCTTG 
GCGGCTTCGTATTCCTGAGACTACTTGTCCTAAGGAGATTCAGAAAGCTGCGTCTGAAGC 
TGCAATGGCGTTTCAGAATGAGACTACGACGGAGGGATCTAAAACTGCGGCGGAGGCAGA 
GGAGGCGGCAGGGGAGGGGGTGAGGGAGGGGGAGAGGAGGGCGGAGGAGCAGAATGGTGG 
TGTGTTTTATATGGATGATGAGGCGCTTTTGGGGATGCCCAACTTTTTTGAGAATATGGC
GGAGGGGATGCTTTTGCCGCCGCCGGAAGTTGGCTGGAATCATAACGACTTTGACGGAGT 
GGGTGACGTGTCACTCTGGAGTTTTGACGAGTAATTTTTTGGCTCTTTTTCTGGATAATA 
AGTT 

>G912 Amino Acid Sequence (domain in AA coordinates: 51-118)

MNPFYSTFPDSFLSISDHRSPVSDSSECSPKLASSCPKKRAGRKKFRETRHPIYRGVRQR 

NSGKWCETOEPNKKSRIWLGTFPTVEMAARAHDVAALALRGRSACLNFADSAWRLRIP^ 

TTCPKEIQKAASEAAMAFQNETTTEGSKTTU^AEEAAGEGVREGERRAEEQNGGVFYMDD 

EALLGMPNFFENMAEGMLLPPPEVGWNHNDFDGVGDVSLWSFDE* 

>G920 (114.. 1154) 

AAAAAATCTATTTTCTTCTCTTTCCACTATATTACAACATTTCTTCATTCTCAAATCATC 
ATACTAAAAACCTAAAAAAAGTTACATATTCATTGTATCTTTGTGAGAAAAAAATGGATT 
CGAATAGTAACAACACGAAATCCATAAAGAGAAAAGTTGTCGACCAACTTGTCGAAGGCT
ATGAATTCGCTACTCAGCTTCAGCTTCTCCTTTCTCATCAACACTCTAACCAGTACCACA 
TCGATGAGACCCGTCTTGTTTCCGGGTCGGGTTCAGTTTCCGGTGGTCCAGATCCCGTTG 
ATGAGCTCATGTCTAAGATCTTGGGATCTTTCCATAAAACTATATCGGTTCTTGATTCTT 
TTGATCCCGTCGCCGTCTCTGTCCCCATCGCCGTCGAGGGTTCATGGAATGCTTCATGTG 
GGGATGATTCGGCGACTCCGGTGAGTTGCAACGGTGGAGATTCCGGTGAGAGTAAGAAGA 
AGAGATTAGGGGTTGGTAAGGGTAAAAGAGGATGCTACACTAGAAAGACGAGATCACATA 
CAAGGATCGTGGAAGCTAAAAGTTCTGAAGACAGATATGCTTGGAGGAAATATGGACAAA 
AGGAGATTCTTAATAC(^CATTCCCAAGAAGTTACTTTAGATGCACACACAAGCCAACGC 
AAGGATGCAAAGCAACAAAGCAAGTTCAGAAACAGGATCAAGATTCTGAGATGTTCCAAA 
TCACATACATTGGCTACCACACATGCACTGCCAATGACCAAACGCACGCGAAGACCGAGC 
CTTTTGATCAAGAAATCATTATGGATTCGGAAAAGACATTGGCTGCTAGCACTGCTCAGA 
ACCATGTCAATGCTATGGTGCAAGAGCAAGAGAACAACACCAGCAGTGTGACAGCAATAG 
ACGCAGGCATGGTTAAGGAGGAACAAAATAACAATGGTGATCAGAGTAAAGATTATTATG 
AGGGCTCTTCGACAGGTGAGGACTTGTCATTGGTTTGGCAAGAGACGATGATGTTTGATG 
ATCATCAAAATCACTACTATTGTGGTGAAACCAGTACTACTTCTCATCAATTTGGTTTCA 
TCGACAACGATGATCAGTTTTCCTCCTTCTTCGACTCATATTGTGCTGATTATGAAAGAA 
CAAGTGCTATGTGAACATCCAAATCTGGAATGATGAATCAGCACTAGGTCTTCTCTTTGA 
GTATGTCTAGTTTAATGTAATATTTTTGTTGTATGTTTGATATiAAACACCATATATACTT 
CTCTTTTTACACCAAAAAAAAAAAAAAAAAAAAAAA 

>G920 Amino Acid Sequence (domain in AA coordinates: 152-211) 
MDSNSNNTKSIKRKVVDQLVEGYEFATQLQLLLSHQHSNQYHIDETRLVSGSGSVSGGPD 
PVDELMSKILGSFHKTISVLDSFDPVAVSVPIAVEGSWNASCGDDSATPVSCNGGDSGES 
KKKRLGVGKGKRGCYTRKTRSHTRIVEAKSSEDRYAWRKYGQKEILNTTFPRSYFRCTHK
PTQGCKATKQVQKQDQDSEMFQITYIGYHTCTANDQTHAKTEPFDQEIIMDSEKTLAAST
AQNHWAMVQEQENNTSSVTAIDAGMVKEEQNNNGDQSKDYYEGSSTGEDLSLW 
FDDHQNHYYCGETSTTSHQFGFIDNDDQFSSFFDSYCADYERTSAM* 
>G939 (9.. 1565) 

CAGATTCTATGGATATGTATAACAACAATATAGGGATGTTCCGGAGTTTAGTTTGTAGCT 
CGGCGCCTCCATTTACAGAGGGACATATGTGTTCTGATTCGCATACGGCTTTGTGCGATG 
ATCTGAGTAGTGATGAGGAAATGGAAATAGAGGAGCTTGAGAAGAAGATCTGGAGAGACA 
AGCAGCGTTTAAAGCGGCTCAAGGAAATGGCGAAGAACGGTCTAGGAACAAGATTGTTGT 
TGAAGCAGCAACATGATGATTTTCCAGAGCACTCTAGTAAGAGAACCATGTACAAGGCAC 
AAGATGGGATCTTGAAGTACATGTCGAAGACAATGGAGCGATATAAAGCTCAAGGTTTTG 
TTTATGGGATTGTGTTAGAGAATGGGAAAACGGTAGCGGGATCTTCTGATAATCTCCGTG 
AATGGTGGAAAGACAAAGTGAGGTTTGATAGGAACGGCCCAGCTGCTATAATCAAGCACC 
AAAGGGATATCAATCTTTCTGATGGAAGTGATTC^GGGTCTGAGGTTGGGGATTCTACCG 
CAC^GAAGTTGCTTGAGCTTCAAGATACTACTCTTGGAGCTCTGTTATCGGCTCTGTTTC 
CTCACTGCAACCCTCCTCAGAGGCGGTTTCCGTTGGAGAAAGGCGTGACACCGCCATGGT 
GGCCAACGGGGAAAGAAGATTGGTGGGATCAACTGTCTTTACCCGTTGATTTTCGAGGTG
TTCCGCCACCTTACAAGAAGCCTCATGATCTCAAGAAGCTGTGGAAAATTGGTGTTTTGA 
TTGGTGTAATCAGACATATGGCTTCTGACATTAGCAACATACCCAATCTCGTGAGACGGT 
CTAGAAGTTTGCAGGAGAAAATGACGTCAAGAGAAGGCGCTTTATGGCTCGCTGCTCTTT 
ACCGAGAAAAGGCTATTGTTGATCAAATAGCCATGTCTAGAGAAAACAACAACACTTCTA 
ACTTTCTTGTTCCTGCAACCGGTGGAGACCCAGATGTTTTGTTTCCTGAATCTACAGACT 
ATGATGTTGAACTGATTGGTGGCACTCATCGGACCAATCAGCAGTATCCTGAATTTGAAA 
ACAACTACAACTGTGTTTACAAGAGAAAGTTTGAAGAAGATTTTGGGATGCCAATGCATC 
CAACACTCCTAACATGTGAGAACAGTCTCTGTCCTTATAGCCAACCACATATGGGATTTC 
TTGACAGGAACTTAAGAGAGAATCACCAAATGACTTGTCCTTATAAAGTCACTTCCTTCT 
ACCAACCAACTAAACCCTATGGTATGACGGGTTTAATGGTTCCTTGTCCGGATTATAACG 
GGATGCAGCAGCAGGTTCAGAGCTTTCAAGACCAGTTTAATCATCCCAACGATCTCTACA 
GACCAAAAGCTCCACAAAGAGGCAACGATGACTTGGTTGAGGATTTGAATCCTTCTCCTT 
CGACGCTGAATCAGAATCTTGGTTTAGTCTTACCTACTGACTTCAATGGAGGTGAGGAAA 
CAGTAGGAACAGAGAACAATCTGCATAATCAAGGGCAAGAGTTGCCCACATCTTGGATTC 
AGTAAAGAAAGCTTCAGAGTTTTCTTTTTATGTTTTCTAGTCTTTATAGCTTTGTCTCTT 
GCTTATTCTCTCATTAAACACAGTTTTTGATCTCTCCATTTCATAGCCCATGTAGCAATG 
GAGAAGATTAGGTTTCATAATAAGTTAATAACGAAATTCAAA 

>G939 Amino Acid Sequence (domain in AA coordinates: 97-106) 

MDMY1INNIGMPRSLVCSSAPPFTEGHMCSDSHTALCDDLSSDEEMEIEELEKKIWRDKQR 
LKRLKEMAKNGLGTRLLLKQQHDDFP 

IVLENGKTVAGSSDNLREWWKDKVRFDRNGPAAIIKHQRDINLSDGSDSGSEVGDSTAQK
LLELQDTTLGALLSALFPHCNPPQRRFPLEKGVTPPWWPTGKEDWWDQLSLPVDFRGVPP 
PYKKPHDLKKLWKIGVLIGVIRHMASDI^^ 

KAIVDQIAMSRENl^TSNFLVPATGGDPDVLFPESTO^ 

NCWKRKFEEDFGMPMHPTLLTCENSLCPYSQPHMGFLDRNLRENHQMTCPYKVTSFYQP 
TKPYGMTGLMVPCPDYNGMQQQVQSFQDQFNHPNDLYRPKAPQRGNDDLVEDLNPSPSTL 
NQNLGLVLPTDFNGGEETVGTENNLHNQGQELPTSWIQ* 
>G963 (1..897) 

ATGAGTTTGCCTCCAGGATTCAGGTTTCATCCCACTGATGAAGAACTGGTGGCTTACTAT 
CTTGATAGGAAGGTCAACGGCCAAGCCATTGAGCTCGAGATCATCCCAGAAGTTGATCTT 
TATAAATGCGAGCCATGGGACTTGCCTGAAAAGTCATTTTTGCCGGGAAACGACATGGAA 
TGGTACTTTTACAGCACAAGGGATAAGAAGTATCCAAATGGCTCTAGGACGAACCGTGCG 
ACCCGAGCGGGTTACTGGAAGGCCACGGGGAAAGATCGTACAGTAGAATCAAAGAAGATG 
AAGATGGGAATGAAGAAGACACTGGTTTATTATAGAGGAAGGGCTCCTCATGGCCTTCGT 
ACTAATTGGGTCATGCATGAATATCGTCTCACGCACGCTCCTTCCTCCTCCTTGAAGGAG 
TCGTATGCATTGTGCCGAGTGTTTAAGAAGAACATACAAATTCCAAAGAGAAAAGGGGAA 
GAAGAAGAAGCAGAAGAAGAGAGCACTAGTGTAGGAAAAGAAGAGGAAGAAGAAAAGGAG
AAGAAGTGGAGAAAATGTGATGGTAATTATATTGAAGACGAGAGCTTGAAAAGAGCATCC 
GCGGAGACATCTTCATCAGAGCTAACTCAAGGGGTCCTTTTAGACGAAGCAAACAGCTCA 
TCCATATTTGCTCTTCATTTCTCATCTTCTCTTCTGGACGATCATGATCATCTTTTCTCA 
AACTATTCTCATCAGCTTCCATATCATCCTCCTCTT^ 

TCTATGAACGAAGCAGAGATTATGTCAATCCAACAAGACTTTCAATGCAGAGACTCTATG 
AACGGGAC^CTTGACGAAATCTTCTCTTCTTCCGCC^CTTTCCCCGCTTCCCTTTGA 
>G963 Amino Acid Sequence (domain in AA coordinates: TBD) 

MSLPPGFRFHPTDEELVAYYLDRKVNGQAIELEIIPEVDLYKCEPWDLPEKSFLPGNDME 

wyfystrdkkypngsrtnratragywkatgkdrtvesk™ 

TNV^MHEYRLTHAPSSSLKESYALCRVFKKNIQIPKRKGEEEEAEEESTSVGKEEEEE^ 
KKWRKCTGNYIEDESLKRASAETSSSELTQGVL^ 

NYSHQLPYHPPLQLQDFPQLSMNEAEIMSIQQDFQCRDSMNGTLDEIFSSSATFPASL* 
>G979 (60.. 1352) 

CCTCTGAGGAATCAAATCACTCACACTCCAAAAAA 

TGAAGAAGCGCTTAACCACTTCCACTTGTTCTTCTTCTCCATCTTCCTCTGTTTCT 

CTACTACTACTTCCTCTCCTATTCAGTCGGAGGCTCCAAGGCCTAAACGAGCCAAAAGGG 

CTAAGAAATCTTCTCCTTCTGGTGATAAATCTCATAACCCGACAAGCCCTGCTTCTACCC 

GACGCAGCTCTATCTACAGAGGAGT(^CTAGACATAGATGGACTGGGAGATTCGAGGCTC 

ATCTTTGGGACAAAAGCTCTTGGAATTCGATTCAG^ 
TGGGAGCATATGACAGTGAAGAAGCAGCAGCACATACGTACGATCTGGCTGCTCTCAAGT
ACTGGGGACCCGACACCATCTTGAATTTTCCGGCAGAGACGTACACAAAGGAATTGGAAG 
AAATGCAGAGAGTGACAAAGGAAGAATATTTGGCTTCTCTCCGCCGCCAGAGCAGTGGTT 
TCTCCAGAGGCGTCTCTAAATATCGCGGCGTCGCTAGGCATCACCACAACGGAAGATGGG 
AGGCTCGGATCGGAAGAGTGTTTGGGAACAAGTACTTGTACCTCGGCACCTATAATACGC 
AGGAGGAAGCTGCTGCAGCATATGACATGGCTGCGATTGAGTATCGAGGCGCAAACGCGG 
TTACTAATTTCGACATTAGTAATTACATTGACCGGTTAAAGAAGAAAGGTGTTTTCCCGT 
TCCCTGTGAACCAAGCTAACCATCAAGAGGGTATTCTTGTTGAAGCCAAACAAGAAGTTG 
AAACGAGAGAAGCGAAGGAAGAGCCTAGAGAAGAAGTGAAACAACAGTACGTGGAAGAAC 
CACCGCAAGAAGAAGAAGAGAAGGAAGAAGAGAAAGCAGAGCAACAAGAAGCAGAGATTG 
TAGGATATTCAGAAGAAGCAGCAGTGGTCAATTGCTGCATAGACTCTTCAACCATAATGG 
AAATGGATCGTTGTGGGGACAACAATGAGCTGGCTTGGAACTTCTGTATGATGGATACAG 
GGTTTTCTCCGTTTTTGACTGATCAGAATCTCGCGAATGAGAATCCCATAGAGTATCCGG 
AGCTATTCAATGAGTTAGCATTTGAGGACAACATCGACTTCATGTTCGATGATGGGAAGC 
ACGAGTGCTTGAACTTGGAAAATCTGGATTGTTGCGTGGTGGGAAGAGAGAGCCCACCCT 
CTTCTTCTTCACCATTGTCTTGCTTATCTACTGACTCTGCTTCATCAACAACAACAACAA 
CAACCTCGGTTTCTTGTAACTATTTGGTCTGAGAGAGAGAGCTTTGCCTTCTAGTTTGAA 
TTTCTATTTCTTCCGCTTCTTCTTCTTTTTTTTCTTTTGTTGGGTTCTGCTTAGGGTTTG 
TATTTCAGTTTCAGGGCTTGTTCGTTGGTTCTGAATAATCAATGTCTTTGCCCCTTTTNN 
AANGNTNCAAGNTNAAANAAAAAAAAAAAA 

>G979 Amino Acid Sequence (domain in AA coordinates: 63-139,165-233) 

MKKRLTTSTCSSSPSSSVSSSTTTSSPIQSEAPRPKRAKRAKKSSPSGDKSHNPTSPAST 

l^SSIYRGVTRHRWTGRFEAHLWDKSSWHSIQNKKGKQVYLGAYDSEEAAAHTYDLAALK 

YWGPDTILNFPAETYTKELEEMQRWKEEYIiASLRRQSSGFSRGVSKYRGVARHHHNGRW 

EARIGRVFGNKYLYLGTYNTQEEAAAAYDMAAIEYRGANAVTNFDISNYIDRLKKKGVFP

FPTOQANHQEGILVEAKQEVETREAKEEPREEVKQQYVEEPPQEEEEKEEEKAEQQEAEI 

VGYSEEAAVWCCIDSSTIMEMDRfiGDNNELAWNFCMMDTGFSPFLTDQNL^ 

ELFNELAFEDNIDFMFDDGKHECLiniENLDCCVVGRESPPSSSSPLSCIiSTDSASSTTTT 

TTSVSCNYLV* 

>G987 (1..4011) 

ATGGGTTCTTACTCAGCTGGCTTCCCTGGATCCTTGGACTGGTTTGATTTTCCCGGTTTA 
GGAAACGGATCCTATCTAAATGATCAACCTTTGTTAGATATTGGATCTGTTCCTCCTCCT 
CTAGACCCATATCCTCAAC^GAATCTTGCTTCTGCGGATGCTGATTTCTCTGATTCTGTT 
TTGAAGTACATAAGCCAAGTTCTTATGGAAGAGGACATGGAAGATAAGCCTTGTATGTTT 
CATGATGCTTTATCTCTTCAAGCAGCTGAGAAGTCTCTCTATGAAGCTCTCGGCGAGAAG 
TACCCGGTTGATGATTCTGATCAGCCTCTGACTACTACTACTAGCCTTGCTCAATTGGTT 
AGTAGTCCTGGTGGTTCTTCTTATGCTTCAAGCACCACAACCACTTCCTCTGATTCACAA 
TGGAGTTTTGATTGTTTGGAGAATAATAGGCCTTCTTCTTGGTTGCAGACACCGATCCCG 
AGTAACTTCATTTTTCAGTCTACATCTACTAGAGCCAGTAGCGGTAACGCGGTTTTCGGG 
TCAAGTTTTAGCGGTGATTTGGTTTCTAATATGTTTAATGATACTGACTTGGCGTTACAA 
TTCAAGAAAGGGATGGAGGAAGCTAGTAAATTCCTTCCTAAGAGCTCTCAGTTGGTTATA 
GATAACTCTGTTCCTAACAGATTAACCGGAAAGAAGAGCCATTGGCGCGAAGAAGAACAT 
TTGACTGAAGAAAGAAGTAAGAAACAATCTGCTATTTATGTTGATGAAACTGATGAGCTT 
ACTGATATGTTTGACAATATTCTGATATTTGGCGAGGCTAAGGAACAACCTGTATGCATT 
CTTAACGAGAGTTTCCCTAAGGAACCTGCGAAAGCTTCAACGTTTAGTAAGAGTCCTAAA 
GGCGAAAAACCGGAAGCTAGTGGTAACAGTTATACAAAAGAGACACCTGATTTGAGGACA 
ATGCTGGTTTCTTGTGCTCAAGCTGTTTCGATTAACGATCGTAGAACTGCTGACGAGCTG 
TTAAGTCGGATAAGGCAACATTCTTCATCTTACGGCGATGGAACAGAGAGATTGGCTCAT
TATTTTGCTAACAGTCTTGAAGCACGTTTGGCTGGGATAGGTACACAGGTTTATACTGCC 
TTGTCTTCCAAGAAAACATCTACTTCTGACATGTTGAAAGCTTATCAGACATATATAT(^ 
GTCTGTCCGTTCAAGAAAATCGCAATCATATTCGCCAACCATAGTATTATGCGGTTGGCT 
TCAAGTGCTAATGCCAAAACCATCCACATCM 

TGGCCTTCTCTGATTCATCGACTTGCTTGGAGACGTGGTT(^TCTTGTAAGCTTCGGATA 
ACCGGTATAGAGTTGCCTCAACGTGGTTTTAGACCAGCCGAGGGAGTTATTGAGACTGGT 
CGTCGCTTGGCTAAGTATTGTC^GAAGTTCAATATTCCGTTTGAGTACAATGCGATTGCG 
CAGAAATGGGAATCAATCAAGTTGGAGGACTTGAAGCTAAAAGAAGGCGAGTTTGTTGCG 
GTAAACTCTTTATTTCGGTTTAGGAATCTTCTAGATGAGACGGTGGCAGTGCATAGCCCG 
AGAGATACGGTTTTGAAGCTGATAAGGAAGATAAAGCCAGACGTGTTCATCCCCGGGATC

CTCAGCGGATCCTACAACGCGCCTTTCTTTGTCACGAGGTTTAGAGAAGTTCTGTTTCAT 

TACTCATCTCTGTTTGACATGTGTGACACGAATCTAACACGGGAAGATCCAATGAGGGTT 

ATGTTTGAGAAAGAGTTCTATGGGCGGGAGATCATGAACGTGGTGGCGTGTGAGGGGACG 

GAGAGAGTGGAGAGGCCAGAGAGTTATAAGCAGTGGCAGGCGAGGGCGATGAGAGCCGGG 

TTTAGACAGATTCCGCTGGAGAAGGAACTAGTTCAGAAACTGAAGTTGATGGTGGAAAGT 

GGATACAAACCCAAAGAGTTTGATGTTGATCAAGATTGTCACTGGTTGCTTCAGGGCTGG 

AAAGGTAGAATTGTATACGGTTCATCTATTTGGGTTCCTTTCTTTTTCTATGTGGGCAGA 

GCAACTAGGGTTTTGATCATGGATCCAAACTTCTCTGAATCTCTAAACGGCTTTGAGTAT 

TTTGATGGTAACCCTAATTTGCTTACTGATCCAATGGAAGATCAGTATCCACCACCATCT 

GATACTCTGTTGAAATACGTGAGTGAGATTCTTATGGAAGAGAGTAATGGAGATTATAAG 

CAATCTATGTTCTATGATTCATTGGCTTTACGAAAAACTGAAGAAATGTTGCAGCAAGTC

ATTACTGATTCTCAAAATCAGTCCTTTAGTCCTGCTGATTCATTGATTACTAATTCTTGG 

GATGCAAGCGGAAGCATCGATGAATCGGCTTATTCGGCTGATCCGCAACCTGTGAATGAA 

ATTATGGTTAAGAGTATGTTTAGTGATGCAGAATCAGCTTTACAGTTTAAGAAAGGGGTT 

GAAGAAGCTAGTAAATTCCTTCCCAATAGTGATCAATGGGTTATCAATCTGGATATCGAG 

AGATCCGAAAGGCGCGATTCGGTTAAAGAAGAGATGGGATTGGATCAGTTGAGAGTTAAG 

AAGAATCATGAAAGGGATTTTGAGGAAGTTAGGAGTAGTAAGCAATTTGCTAGTAATGTA 

GAAGATAGTAAGGTTACAGATATGTTTGATAAGGTTTTGCTTCTTGACGGTGAATGCGAT 

CCGCAAACATTGTTAGACAGCGAGATTCAAGCGATTCGGAGTAGTAAGAACATAGGAGAG 

AAAGGGAAGAAGAAGAAGAAGAAGAAGAGTCAAGTGGTTGATTTTCGTACACTTCTCACT 

CATTGTGCACAAGCCATTTCCACAGGAGATAAAACCACGGCTCTTGAGTTTCTGTTACAG 

ATAAGGCAACAGTCTTCGCCTCTCGGTGACGCGGGGCAAAGACTAGCTCATTGTTTCGCT 

AACGCGCTTGAAGCTCGTCTACAGGGAAGTACCGGTCCTATGATCCAGACTTATTACAAT 

GCTTTAACCTCGTCGTTGAAGGATACTGCTGCGGATACAATTAGAGCGTATCGAGTTTAT 

CTTTCTTCGTCTCCGTTTGTTACCTTGATGTATTTCTTCTCCATCTGGATGATTCTTGAT 

GTGGCTAAAGATGCTCCTGTTCTTCATATAGTTGATTTTGGGATTCTATACGGGTTTCAA 

TGGCCGATGTTTATTCAGTCTATATCAGATCGAAAAGATGTACCGCGGAAGCTGCGGATT 

ACTGGTATCGAGCTTCCTCAGTGCGGGTTTCGGCCCGCGGAGCGAATAGAGGAGACAGGA 

CGGAGATTGGCTGAGTATTGTAAACGGTTTAATGTTCCGTTTGAGTACAAAGCCATTGCG 

TCTCAGAACTGGGAAACAATCCGGATAGAAGATCTCGATATACGACCAAACGAAGTCTTA 

GCGGTTAATGCTGGACTTAGACTCAAGAACCTTCAAGATGAAACAGGAAGCGAAGAGAAT 

TGCCCGAGAGATGCTGTCTTGAAGCTAATAAGAAACATGAACCCGGACGTTTTCATCCAC 

GCGATTGTCAACGGTTCATTCAACGCACCCTTCTTTATCTCGCGGTTTAAAGAAGCGGTT 

TACCATTACTCCGCTCTCTTCGACATGTTTGATTCGACGTTGCCTCGGGATAACAAAGAG 

AGGATTAGGTTCGAGAGGGAGTTTTACGGGAGAGAGGCTATGAACGTGATAGCGTGCGAG 

GAAGCTGATCGAGTGGAGAGGCCTGAGACTTACAGGCAATGGCAGGTTAGAATGGTTAGA 

GCCGGGTTTAAGCAGAAAACGATTAAGCCTGAGCTGGTAGAGTTGTTTAGAGGAAAGCTG 

AAGAAATGGCGTTACCATAAAGACTTTGTGGTTGATGAAAATAGTAAATGGTTGTTACAA 

GGCTGGAAAGGTCGAACTCTCTATGCTTCTTCTTGTTGGGTTCCTGCCTAG 

>G987 Amino Acid Sequence (domain in AA coordinates: 428-432,704-708) 

MGSYSAGFPGSLDWFDFPGLGNGSYLNDQPLLDIGSVPPPLDPYPQQNLASADADFSDSV 

LKYISQVLMEEDMEDKPCMFHDALSLQAAEKSLYEALGEKYPVDDSDQPLTTTTSLA 

SSPGGSSYASSTTTTSSDSQWSFDCLENNRPSSWLQTPIPSNFIFQSTSTRASSGNAVFG

SSFSGDLVSNMFNDTOLALQFKKGMEEASKFLPKSSQLVIDNSVPNRLTGKKSHWREEEH 

LTEERSKKQSAIYVDETDELTDMFDNILIFGEAKEQPVCILNESFPKEPAKASTFSKSPK 

GEKPEASGNSYTKETPDLRTMLVSCAQAVSINDRRTADELLSRIRQHSSSYGDGTERLAH

YFANSIiEARLAGIGTQVYTALSSKKTSTSDMLKAYQTYISVCPFKJCIAIIFANHSIMRLA 

SSANAKTIHIIDFGISDGFQWPSLIHRLAWRRGSSCKLRITGIELPQRGFRPAEGVIETG 

RRLAKYCQKFNIPFEYNAIAQKWESIKLEDLKLKEGEFVAWSLFRFRNLIJDETVAW 

RDTVLKLIRKIKPDVFIPGILSGSYNAPFFVTRFREVLFHYSSLFDMCDTNLT 

MFEKEFYGREIMNVVACEGTERVERPESYKQWQARAMRAGFRQIPLEKELVQKLKLMVES

GYKPKSFDVDQDCHWLLQGWKGRIVYGSSIVT7PFFFWGRATRVLIMDPNFSESLNGFEY 

FDGNPmiLTDPMEDQYPPPSDTLLKWSEILMEESNGDYKQSMFYDSLALRKTEEMLQQV 

ITDSQNQSFSPADSLITNSWDASGSIDESAYSADPQPVNEIMVKSMFSDAESALQFKKGV 

EEASKFLPNSDQWVINLDIERSERRDSVKEEMGLDQLRVKKNHERDFEEVRSSKQFASNV

EDSKVTDMFDKVLLLDGECDPQTLLDSEIQAIRSSKNIGEKGKKKKKKKSQVVDFRTLLT
HCAQAISTGDKTTALEFLLQIRQQSSPLGDAGQRLAHCFANALEARLQGSTGPMIQTYYN

ALTSSLKDTAADTIRAYRVYLSSSPFVTLMYFFSIWMILDVAKDAPVLHIVDFGILYGFQ 

WPMFIQSISDRKDVPRKLRITGIELPQCGFRPAERIEETGRRLAEYCKRFNVPFEYKAIA 

SQNWETIRIEDLDIRPNEVLAVNAGLRLKNLQDETGSEENCPRDAVLKLIRNMNPDVFIH 

AIVNGSFNAPFFISRFKEAVYHYSALFDMFDSTLPRDNKERIRFEREFYGREAMNVIACE 

EADRVERPETYRQWQVRMVRAGFKQKTIKPELVELFRGKLKKWRYHKDFVVDENSKW^ 

GWKGRTLYASSCWVPA* 

>G993 (6..1091)

CAAATATGGAATACAGCTGTGTAGACGACAGTAGTACAACGTCAGAATCTCTCTCCATCT 
CTACTACTCCAAAGCCGACAACGACGACGGAGAAGAAACTCTCTTCTGCGCCGGCGACGT 
CGATGCGTCTCTACAGAATGGGAAGCGGCGGAAGCAGCGTCGTTTTGGATTCAGAGAACG
GCGTCGAGACCGAGTCACGTAAGCTTCCTTCGTCGAAATATAAAGGCGTTGTGCCTCAGC 
CTAACGGAAGATGGGGAGCTCAGATTTACGAGAAGCATCAGCGAGTTTGGCTCGGTACTT 
TCAACGAGGAAGAAGAAGCTGCGTCTTCTTACGACATCGCCGTGAGGAGATTCCGCGGCC 
GCGACGCCGTCACTAACTTCAAATCTCAAGTTGATGGAAACGACGCCGAATCGGCTTTTC 
TTGACGCTCATTCTAAAGCTGAGATCGTGGATATGTTGAGGAAACACACTTACGCCGATG 
AGTTTGAGCAGAGTAGACGGAAGTTTGTTAACGGCGACGGAAAACGCTCTGGGTTGGAGA 
CGGCGACGTACGGAAACGACGCTGTTTTGAGAGCGCGTGAGGTTTTGTTCGAGAAGACTG 
TTACGCCGAGCGACGTCGGGAAGCTGAACCGTTTAGTGATACCGAAACAACACGCGGAGA 
AGCATTTTCCGTTACCGGCGATGACGACGGCGATGGGGATGAATCCGTCTCCGACGAAAG 
GCGTTTTGATTAACTTGGAAGATAGAACAGGGAAAGTGTGGCGGTTCCGTTACAGTTACT 
GGAACAGCAGTCAAAGTTACGTGTTGACCAAGGGCTGGAGCCGGTTCGTTAAAGAGAAGA 
ATCTTCGAGCCGGTGATGTGGTTTGTTTCGAGAGATCAACCGGACCAGACCGGCAATTGT 
ATATCCACTGGAAAGTCCGGTCTAGTCCGGTTCAGACTGTGGTTAGGCTATTCGGAGTCA 
ACATTTTCAATGTGAGTAACGAGAAACCAAACGACGTCGCAGTAGAGTGTGTTGGCAAGA 
AGAGATCTCGGGAAGATGATTTGTTTTCGTTAGGGTGTTCCAAGAAGCAGGCGATTATCA 
ACATCTTGTGACAAATTCTTTTTTTTTGGTTTTTTTCTTCAATTTGTTTCTCCTTTTTCA 
ATATTTTGTATTGAAATGACAAGTTGTAAATTAGGACAAGACAAGAAAAAATGACAACTA 
GACAAAATAGTTTTTGTTTAAAAAAAAAAAAAAAAAAAA 

>G993 Amino Acid Sequence (domain in AA coordinates: 69-134) 
MEYSCVDDSSTTSESLSISTTPKPTTTTEKKLSSPPATSMRLYRMGSGGSSVVLDSENGV 
ETESRKLPSSKYKGVVPQPNGRWGAQIYEKHQRVWLGTFNEEEEAASSYDIAVRRFRGRD
AVTNFKSQVDGNDAESAFLDAHSKAEIVDMLRKHTYADEFEQSRRKFVNGDGKRSGLETA 
TYGNDAVLRAREVLFEKTVTPSDVGKLNRLVIPKQHAEKH 

LINLEDRTGKVWRFRYSYWNSSQSYVLTKGWSRFVKEKNLRAGDVVCFERSTGPDRQLYI 

HWKVRSSPVQTVVRLFGWIFNVSNEKPND^^ 

L* 

>G681 (1..804) 

ATGGGGAGGACGACATGGTTCGACGTCGACGGGATGAAGAAAGGAGAGTGGACGGCAGAG 

GAAGACCAGAAGCTCGGCGCTTACATCAACGAGCATGGCGTTTGTGATTGGCGTTCCCTC 

CCCAAAAGAGCTGGTTTGCAGAGATGTGGAAAGAGCTGCAGATTAAGGTGGCTTAACTAT 

CTAAAGCCTGGGATTAGAAGAGGCAAATTCACTCCTCAAGAAGAAGAAGAAATCATCCAA 

CTTCATGCTGTTCTCGGAAACAGGTGGGCAGCCATGGCGAAGAAGATGCAGAATCGAACA 

GACAATGATATCAAGAACCATTGGAACTCTTGTCTCAAGAAAAGACTTTCGAGAAAGGGA 

ATCGACCCTATGACCCACGAGCCCATCATCAAACACCTCACCGTCAATACCACTAACGCA 

GATTGTGGTAACTCTTCCACCACGACGTCCCCGTCGACGACGGAAAGCTCTCCTTCCTCC 

GGCTCGTCTCGTCTTCTTAACAAACTCGCCGCAGGTATCTCATCTAGACAACATAGTCTC 

GATAGGATCAAGTfteATCTTGTCGAATTCAATAATCGAAAGCAGTGATCAAGCAAAAGAG 

GAAGAAGAAAAAGAAGAAGAAGAAGAAGAAAGAGATTCAATGATGGGTCAGAAGATTGAC 

GGTAGTGAAGGAGAAGATATTCAGATTTGGGGCGAGGAGGAAGTTAGGCGTTTAATGGAG 

ATTGATGCAATGGATATGTACGAGATGACTTCGTACGACGCTGTCATGTACGAGAGTAGT 

CACATACTTGATCATCTCTTTTGACTTAATATAGTGTGACTGTGTGAGTG(^TGCATGTT 

>G681 Amino Acid Sequence (domain in AA coordinates: 14-120)

MGRTTWFDVDGMKKGEWTAEEDQKLGAYINEHGVCDWRSLPKRAGLQRCGKSCRLRWLNY 

LKPGIRRGKFTPQEEEEIIQLHAVLGNRWAAMAKKMQNRTDNDIKNHWNSCLKKRLSRKG

IDPMTHEPIIKHLTVNTTNADC^^ 

DRIKYILSNSIIESSDQAKEEEEKEEEEEERDSMMGQKIDGSEGEDIQIWGEEEVRRLME 
IDAMDMYEMTSYDAVMYESSHILDHLF*
>G1482 (1..996) 

ATGAAGATCAGGTGCGACGTCTGCGATAAAGAAGAAGCGTCGGTGTTTTGCACGGCCGAC 
GAAGCATCTCTCTGCGGCGGCTGCGACCACCAAGTCCACCACGCTAACAAACTCGCCTCT 
AAACATCTCCGTTTCTCTCTCCTTTATCCTTCTTCTTCCAACACCTCCTCTCCTCTCTGC 
GACATCTGTCAGGATAAAAAAGCTCTGTTGTTCTGTCAACAAGATAGAGCTATTTTATGC 
AAAGATTGCGATTCATCGATCCACGCTGCGAACGAACACACAAAGAAACACGATAGGTTT 
CTTCTTACAGGGGTTAAGCTCTCTGCAACATCGTCTGTTTACAAACCTACTTCGAAATCT 
TCTTCTTCTTCTTCAAGCAACCAAGATTTCTCTGTCCCTGGATCATCAATCTCTAATCCT 
CCTCCTCTCAAGAAACCTCTCTCAGCTCCTCCTCAGAGCAACAAGATCCAACCCTTTTCG 
AAGATCAACGGCGGTGATGCGTCGGTGAATCAGTGGGGATCCACAAGCACGATTTCTGAG 
TATTTGATGGATACGTTACCTGGTTGGCACGTTGAGGATTTCCTCGATTCCTCTCTTCCT 
ACTTATGGTTTCTCTAAGAGTGGTGATGATGATGGAGTGTTACCATATATGGAACCAGAA 
GATGACAACAACACTAAGAGAAACAACAACAACAACAACAACAACAACAACAATACAGTG 
TCACTTCCATCTAAGAATTTAGGGATTTGGGTCCCTCAGATTCCACAAACTCTTCCTTCT 
TCATACCCAAATCAATACTTTTCTCAAGACAACAACATACAGTTTGGGATGTACAACAAA 
GAAACATCACCAGAAGTAGTGTCTTTTGCTCCAATACAAAACATGAAACAACAAGGACAG 
AACAACAAGAGATGGTATGATGATGGTGGCTTCACTGTCCCACAGATCACTCCTCCTCCT 
CTTTCTTCTAATAAAAAGTTTAGATCTTTCTGGTAA 

>G1482 Amino Acid Sequence (domain in AA coordinates: 5-63)
MKIRCDVCDKEEASVFCTADEASLCGGCDHQVHHANKLASKHLRFSLLYPSSSNTSSPLC
DICQDKKALLFCQQDRAILCKDCDSSIHAANEHTKKHDRFLLTGVKLSATSSVYKPTSKS
SSSSSSNQDFSVPGSSISNPPPLKKPLSAPPQSNKIQPFSKINGGDASVNQWGSTSTISE 
YLMDTLPGWHVEDFLDSSLPTYGFSKSGDDDGVLPYMEPEDDWT 

SLPSKNLGIOTPQIPQTLPSSYPNQYFSQDNNIQFGMYNKETSPEWSFAPIQNMKQQGQ 

NNKRWYDDGGFTVPQITPPPLSSNKKFRSFW* 

>G225 (157.. 441) 

CTCTCTCTCTCACTCTTTTCTTTTCCGAGAACCCAACAAAAAAAAAGCTACTATTAATCC 

TTCCCCTCGTGAGGAAATCATTTCTTCTTGTTTCrCGAGATTTATTCTCTTTCTCTCTCT 

CTTTCTCTGTGTGTTTCGTGTCTTCAGATTAGTTCGATGTTTCGTTCAGACAAGGCGGAA 

AAAATGGATAAACGACGACGGAGACAGAGCAAAGCCAAGGCTTCTTGTTCCGAAGAGGTG 

AGTAGTATCGAATGGGAAGCTGTGAAGATGTCAGAAGAAGAAGAAGATCTCATTTCTCGG 

ATGTATAAACTCGTTGGCGACAGGTGGGAGTTGATCGCCGGAAGGATCCCGGGACGGACG 

CCGGAGGAGATAGAGAGATATTGGCTTATGAAACACGGCGTCGTTTTTGCCAACAGACGA 

AGAGACTTTTTTAGGAAATGATTTTTTTTGTTTGGATTAAAAGAAAATTTTCCTCTCCTT 

T^TTCACAAGACAAGAATIAAAAGGAAATGTACCTGTCCTTGAATTACTATTTTGGAATGT 

ATAATTATCTATATATATAAGAAGAAAAAATTGCTTAGGAATTT 

>G225 Amino Acid Sequence (domain in AA coordinates: 39-76) 

MFRSDKAEKMDKRRRRQSKAKASCSEEVSSIEWEAVKMSEEEEDLISRMYKLVGDRWEL^ 

AGRIPGRTPEEIERYWLMKHGWFANRRRDFFRK* 

>G226 (10.. 348) 

CCAGTAGTTATGGATAATACCAACCGTCTTCGTCTTCGTCGCGGTCCCAGTCTTAGGCAA 

ACTAAGTTCACTCGATCCCGATATGACTCTGAAGAAGTGAGTAGCATCGAATGGGAGTTT 

ATCAGTATGACCGAACAAGAAGAAGATCTCATCTCTCGAATGTACAGACTTGTCGGTAAT 

AGGTGGGATTTAATAGCAGGAAGAGTCGTAGGAAGAAAGGCAAATGAGATTGAGAGATAC 

TGGATTATGAGAAACTCTGACTATTTTTCTCACAAACGACGACGTCTTAATAATTCTCCC 

TTTTTTTCTACTTCTCCTCTTAATCTCCAAGAAAATCTAAAATTGTAAAGAAATCAAAAT 

AAAAGCTTTCAATCATAAAAGTAGAACAAATCTTGAATGTCTTCTCA 

>G226 Amino Acid Sequence (domain in AA coordinates: 28-78) 

MDOTNRLRLRRGPSLROTKFTRSRYDSEEVSSIEW 

LIAGRWGRKANEIERYWI^1RNSDYFSHKRRRI^SPFFSTSPI^QENLKL^ 
>G9 (81.. 1139) 

GTGTTTCTTCTTTCTGCTAAAAGGTTATAATTTTTGTTTCTTGG 

AAGAAACTGAAACAAAGAAAATGGATTCTAGTTGCATAGACGAGATAAGTTCCT 

CAGAATCTTTCTCCGCCACCACCGCCAAGAAGCTCTCT^ 

GCCTCTACCGGATGGGAAGCGGCGGGAGCAGCGTCGTGTTGGATCCCGAGAACGGCCTAG 
AGACGGAGTCACGAAAGCTACCATCTTCAAAATAC^^GGTGTTGTTCCTCAGCCTAACG 
GAAGATGGGGAGCTCAGATCTACGAGAAGCACCAACGAGTATGGCTCGGGACTTTCAACG

AGCAAGAAGAAGCTGCTCGTTCCTACGACATCGCAGCTTGTAGATTCCGTGGCCGCGACG 

CCGTCGTCAACTTCAAGAACGTTCTGGAAGACGGCGATTTAGCTTTTCTTGAAGCTCACT 

CAAAGGCCGAGATCGTCGACATGTTGAGAAAACACACTTACGCCGACGAGCTTGAACAGA 

ACAATAAACGGCAGTTGTTTCTCTCCGTCGACGCTAACGGAAAACGTAACGGATCGAGTA 

CTACTCAAAACGACAAAGTTTTAAAGACGTGTGAAGTTCTTTTCGAGAAGGCTGTTACAC 

CTAGCGACGTTGGGAAGCTAAACCGTCTCGTGATACCTAAACAACACGCCGAGAAACACT 

TTCCGTTACCGTCACCGTCACCGGCAGTGACTAAAGGAGTTTTGATCAACTTCGAAGACG 

TTA^CGGTAAAGTGTGGAGGTTCCGTTACTCATACTGGAACAGTAGTCAAAGTTACGTGT 

TGACCAAGGGATGGAGTCGATTCGTCAAGGAGAAGAATCTTCGAGCCGGTGATGTTGTTA 

CTTTCGAGAGATCGACCGGACTAGAGCGGCAGTTATATATTGATTGGAAAGTTCGGTCTG 

GTCCGAGAGAAAACCCGGTTCAGGTGGTGGTTCGGCTTTTCGGAGTTGATATCTTTAATG 

TGACCACCGTGAAGCCAAACGACGTCGTGGCCGTTTGCGGTGGAAAGAGATCTCGAGATG 

TTGATGATATGTTTGCGTTACGGTGTTCCAAGAAGCAGGCGATAATCAATGCTTTGTGAC 

ATATTTCCTTTTCCGATTTTATGCTTTCGTTTTTTAATTTTTTTTTTTGTCAAGTTGTGT 

AGGTTGTGATTCATGCTAGGTTGTATTTAGGAAAAGAGATAAGACC 

>G9 Amino Acid Sequence (domain in AA coordinates: 62-127) 

MDSSCIDEISSSTSESFSATTAKKLSPPPAAALRLYRMGSGGSSVVLDPENGLETESRKL

PSSKYKGVVPQPNGRWGAQIYEKHQRVWLGTFNEQEEAARSYDIAACRFRGRDAVVNFKN

VLEDGDLAFLEAHSKAEIVDMLRKHTYADELEQNWKRQLFLSVDANGKRNGSSTTQND^ 

LKTCEVLFEKAVTPSDVGKLNRLVIPKQHAEKHFPLPSPSPAVTKGVLINFEDVNGKVWR

FRYSYWNSSQSYVLTKGWSRFVKEKNLRAGDVVTFERSTGLERQLYIDWKVRSGPRENPV 

QVVVRLFGVDIFNVTTVKPNDVVAVCGGKRSRDVDDMFALRCSKKQAIINAL*

>G1040 (51.. 863)

CTTTGATCTCCACTATTTAAGTAGACAAGAATCATAAAGAAAATAGTGAGATGATGATGT 
TAGAGTCAAGAAACAGTATGAGAGCTTCAAACTCAGTCCCAGATCTGTCTCTTCAGATCA 
GTCTTCCTAACTATCACGCCGGAAAACCTCTTCACGGCGGTGACCGGAGCTCCACAAGCA 
GTGATTCTGGAAGCAGCCTCAGTGACCTGAGCCATGAGAACAACTTCTTCAACAAACCTC 
TCTTGAGCTTAGGATTTGACCATCATCATCAAAGGCGCTCAAACATGTTCCAACCTCAAA 
TCTACGGTCGAGATTTCAAGAGAAGCTCATCATCAATGGTTGGTCTTAAACGAAGCATTC 
GTGCTCCAAGAATGAGATGGACTTCTACTCTTCATGCTCACTTCGTCCATGCTGTTCAAC 
TTCTTGGCGGCCATGAAAGAGCAACGCCTAAATCAGTGTTGGAGCTCATGAATGTGAAGG 
ATCTAACCCTAGCTCATGTCAAGAGTCACTTGCAGATGTATAGAACAGTGAAATGCACTG 
ATAAAGGATCACCAGGAGAAGGAAAGGTAGAGAAAGAGGCAGAGCAGAGGATAGAGGACA 
ATAATAATAATGAAGAAGCTGATGAAGGAACTGACACAAATTCGCCAAACTCATCATCTG 
TGCAAAAGACCCAAAGAGCTTCATGGTCATCGACAAAGGAAGTATCTAGGAGCATATCTA 
CACAAGCATATTCTCACTTGGGAACAACTCATCACACTAAGGCCAATGAAGAGAAAGAGG 
ATACCAAC^TTCATCTCAATTTGGATTTCACATTGGGCGGCCTAGTTGGGGGATGGAATA 
TGCGGAACCCTCCAGTGATTTAACCCTTCTCAAGTGCTAATTGCCTTAAGCTACAACAAA 
TAAGTCAGCTTAGGTTACCAGTTTTAACATAATTTTAACTTGTTTTGATCATATGAGCTT 
CGGAAGAATCATATTATCATCATATATGAACTTCTTTCCAAGAATGTTCTATGAGTTTTT 
TGATATGTATAATCAAGAGAATCGTTTGAAGTAAAAA 

>G1040 Amino Acid Sequence (domain in AA coordinates: 109-158) 
MMMLESRNSMRASNSVPDLSLQISLPNYHAGKPLHGGDRSSTSSDSGSSLSDLSHENNFF 
NKPLLSLGFDHHHQRRSNMFQPQIYGRDFKRSSSSIWGLKRSIRAPRl^WTSTLHAHFV^ 
AVQLLGGHERATPKSVLELMNVKDLTI^^ 

IEDNNl^EADEGTDTNSPNSSSVQKTQRASWSSTKEVSRSISTQAYSHLGTTHHTKA^ 
EKEDTNIHLNLDFTfcGGLVGGWNMRNPPVI * 
>G2114 (64.. 1311) 

ATAAAACGAAACCCTATACATATAAACTAAGAGCGAGAAAGACAGCTAGAGAGAGAGAGA 
GAGATGAAGAAATGGTTGGGATTTTCATTGACACCTCCTTTGAGAATCTGCAATAGTC 
GAAGAAGAACTTAGGCATGACGGTTCCGATGTTTGGAGATATC 
CATCATCATGATGAAGACGTTCCAAAGGTGGAAGATCTC 

GAGTATCCTATAAACCATAACCAAACCAATGTCAACTGCACCACTGTGGTTAACAGGTTA 
AACCCACCCGGTTACCTTCTCCACGACCAAACCGTAGTTACACCACATTACCCGAACCTA 
GATCCGAACCTTAGCAATGATTATGGAGGTTTTGAGAGGGTCGGTTCGGTCTCGGTTTTC 
AAATCTTGGTTAGAGCT^GGCACTCCAGCATTCCCACTCTCGAGTCATTACGTTACTGAA 
GAGGCTGGTACGAGCAATAATATTAGTCATTTTAGTAACGAAGAGACTGGTTATAACACC
AATGGCTCAATGCTATCATTGGCTTTGAGCCATGGGGCTTGTTCTGATTTGATCAACGAA 
TCGAATGTATCCGCACGGGTCGAAGAACCGGTTAAGGTAGATGAGAAGCGGAAGAGATTG 
GTTGTTAAACCTCAGGTAAAGGAATCCGTTCCTCGGAAGTCGGTTGATAGTTATGGACAA 
AGAACTTCTCAGTATCGTGGAGTTACAAGGCATAGATGGACAGGGAGATATGAAGCTCAC 
TTATGGGATAATAGCTGTAAGAAGGAGGGACAGACAAGGAGAGGAAGACAAGTGTATCTT 
GGAGGGTATGATGAGGAGGAGAAAGCAGCGAGGGCATATGATTTAGCGGCTCTGAAGTAT 
TGGGGTCCTACCACTCACTTAAATTTCCCTTTGAGTAATTACGAAAAGGAGATCGAGGAA 
CTCAATAACATGAATCGGCAAGAATTTGTTGCCATGTTGAGGAGGAATAGCAGCGGGTTT 
TCGAGGGGAGCTTCCGTGTATAGAGGAGTTACAAGGCATCATCAACATGGAAGGTGGCAA 
GCCAGAATTGGAAGAGTTGCTGGAAACAAGGACTTGTACCTTGGAACATTTAGCACGCAA 
GAAGAAGCAGCGGAGGCGTACGATATCGCGGCAATTAAATTCAGAGGCCTAAACGCTGTA 
ACCAATTTCGATATAAATAGATATGACGTGAAGAGGATATGTTCAAGCTCAACGATTGTT 
GATAGCGACCAGGCCAAACATTCTCCCACCAGCTCTGGCGCCGGCCACTAACCGACACCG 
TAAACTCCTCGCCGGAGAGACTATTCCCACGTACGGTTGGTTTGAGGAAATAAGTTCGTC 
CAGTCTGTTTAATCATTTATGGTTTAATAAACATATATTCCTAAGTAATTGAGGCCGGTC 
TACATATATAC^VACTTTTTTAGCAAATTAAGTTATCAGAATCCACTATATATTATTCTCT 

>G2114 Amino Acid Sequence (conserved domain in AA coordinates: 221-297, 323-393)

MKKWLGFSLTPPLRICNSEEEELRHDGSDVWRYDINFDHHHHDEDVPKVEDLLSNSHQTE 
YPINHNQTNTOCTTVVNRI^PP 

SWLEQGTPAFPLSSHYVTEEAGTSNNISHFSNEETGYNTNGSMLSLALSHGACSDLINES 
NVSARVEEPVKVDEKRKRLVVKPQVKESVPRKSVDSYGQRTSQYRGVTRHRWTGRYEAHL
WDNSCKKEGQTRRGRQVYLGGYDEEEKAARAYDLAALKYWGPTTHLNFPLSNYE 
NNM^QEFVAMLRRNSSGFSRGASVYRGWROTQHGRWQARIGRVAGNKDLYLGTFSTQ^ 
EAAEAYDIAAIKFRGLNAVTNFDINRYDVKRICSSSTIVDSDQAKHSPTSSGAGH*
>G450 (65.. 751) 

GAGTTATCGAGAGAGAGAGAAAACATATTCTGATTTAAGACATATATAGACAGCAAGAAG 
AGATATGAACCTTAAGGAGACGGAGCTTTGTCTTGGCCTCCCCGGAGGCACTGAAACCGT 
TGAAAGTCCGGCCAAGTCGGGTGTTGGGAACAAGAGAGGCTTCTCCGAGACCGTTGATCT 
CAAACTTAATCTTGAATCTAACAAACAAGGACATGTGGATCTCAACACTAATGGAGCTCC 
CAAGGAGAAGACCTTCCTTAAAGACCCTTCTAAGCCTCCTGCTAAAGCACAAGTGGTGGG 
TTGGCCACCGGTGAGGAACTACCGGAAAAATGTTATGGCTAATCAGAAGAGCGGCGAAGC
AGAGGAGGCAATGAGTAGTGGTGGAGGAACCGTCGCCTTTGTGAAGGTTTCCATGGATGG 
AGCTCCTTATCTTCGGAAGGTTGACCTCAAGATGTACACCAGCTACAAGGATCTCTCTGA 
TGCCTTGGCCAAAATGTTCAGCTCCTTTACCATGGGGAGTTATGGAGCACAAGGGATGAT 
AGATTTCATGAACGAGAGTAAAGTGATGGATCTGTTGAACAGTTCTGAGTATGTTCCAAG 

CGAGT<^TGCAAACGTTTGCGCATAATGAAAGGATCCGAAGCAATTGGACTTGCTCCAAG 

AGCAATGGAGAAGTTCAAGAACAGATCATGAACAAAAAAAAAAGAGGACAATATGCATTG 
ATTTTTTTTTTTTTGGTATTGTTATGAT 

TAGGAAAATATAATTGTTTACAAAAAAATAACTC^ 

AATTAGTCTGTGTTTTTGTTTTCATCTCTTAATTAGTAGAAATCATTTm 
TTGTGATAGTAAATCTATAGAGTTCGTA 

>G450 Amino Acid Sequence (domain in AA coordinates: TBD) 

MNLKETELCLGLPGGTET^SPAKSGVGNKRGFSETVDLKLNLQSNKQGH^ 

EKTFLKDPSKPPAKAQWGWPPVRNYRKNVMANQKSGEAEEAMSSGGGTV^ 

PYLRKVDLKMYTSYKDLSDALAKMFSSFTMGSYGAQGMI 

EDKDGDWMLVGDVPWPMFVESCKRLRIMKGSEAIGLAPRAMEKFKNRS* 
>G584 (40.. 1809) 

AAAAAGTCTTCTCTTTTATAACTACGTCAGAGAACTGTTATGTCTCCGACGAATGTTCAA 
GTAACCGATTACCATCTC^^CCAATCAAAAACGGATACAACAAATCTCTGGTCAACCGAC 
GACGATGCATCGGTAATGGAAGCTTTCATCGGCGGCGGCTCCGATCATTCTTCTCTTTTT 
CCTCCACTTCCTCCTCCTCCTCTTCCTCAAGTCAACGAAGATAATCTCCAGCAACGTCTC
CAAGCTTTAATCGAAGGAGCAAACGAGAACTGGACTTACGCCGTGTTCTGGCAATCATCT 
CACGGTTTCGCCGGAGAAGACAACAACAACAACAACACAGTGTTGTTAGGTTGGGGAGAT 
GGTTATTACAAAGGAGAAGAAGAGAAGTCTAGAAAGAAGAAATCAAATCCAGCTAGTGCA 
GCTGAACAAGAGCATCGTAAGAGAGTGATTAGAGAGCTCAACTCTTTAATCTCCGGTGGT 
GTAGGAGGAGGAGATGAAGCTGGAGATGAAGAAGTTACAGATACTGAATGGTTCTTCTTA
GTTTCAATGACACAGAGCTTTGTCAAGGGTACTGGTTTACCTGGTCAAGCTTTCTCAAAT 
TCAGACACGATTTGGTTATCTGGTTCTAATGCTTTAGCTGGATCAAGTTGTGAGAGAGCT 
CGTCAAGGTCAGATTTATGGGTTACAAACAATGGTGTGTGTAGCGACAGAGAATGGTGTC 
GTTGAGCTTGGTTCGTCGGAGATTATTCATCAAAGTTCAGATCTTGTTGATAAAGTTGAC 
ACCTTTTTCAATTTTAACAATGGTGGTGGTGAATTTGGTTCTTGGGCGTTTAATTTGAAT 
CCAGATCAAGGAGAGAATGATCCAGGTTTGTGGATTAGTGAACCTAATGGTGTTGACTCT 
GGTCTTGTAGCTGCTCCGGTGATGAATAATGGTGGAAATGACTCAACTTCTAATTCTGAT 
TCTCAACCAATTTCTAAGCTTTGTAATGGAAGCTCTGTTGAAAACCCTAACCCTAAAGTT 
CTGAAATCTTGTGAAATGGTGAATTTCAAGAATGGGATTGAGAATGGTCAAGAAGAAGAT 
AGTAGTAATAAGAAGAGATCACCGGTTTCGAATAATGAAGAAGGGATGCTTTCTTTTACC 
TCTGTTCTTCCATGTGACTCGAATCACTCTGATCTTGAAGCTTCAGTGGCTAAAGAAGCT 
GAGAGTAACAGAGTTGTGGTTGAACCGGAGAAGAAACCGAGGAAACGAGGGAGAAAACCG 
GCGAATGGAAGAGAAGAGCCTTTGAATCATGTAGAGGCAGAGAGACAGAGAAGAGAGAAG 
TTGAATCAGAGATTCTATTCTTTAAGAGCTGTGGTTCCTAATGTGTCTAAGATGGATAAA 
GCTTCTCTATTAGGAGATGCTATTTCGTATATCAGTGAGCTTAAGTCTAAGTTGCAAAAG 
GCTGAATCTGATAAAGAAGAGTTGCAGAAGCAGATTGATGTGATGAATAAAGAAGCGGGA 
AATGCGAAAAGTTCGGTAAAAGATCGAAAATGTTTGAATCAAGAATCGAGTGTGTTGATA 
GAGATGGAGGTTGATGTGAAGATTATTGGTTGGGATGCAATGATAAGGATTCAATGTAGT 
AAGAGGAATCATCCTGGTGCTAAGTTCATGGAAGCACTTAAGGAGTTGGATTTGGAAGTG 
AATCATGCGAGTTTATCGGTAGTGAATGATCTTATGATCCAACAAGCGACTGTGAAAATG 
GGGAATCAGTTTTTCACGCAAGATCAACTCAAGGTTGCTCTAACGGAGAAAGTTGGAGAA 
TGTCCATGAATTGAAGTCAGCATCTTTAGGGCTAATACACCGGAGAATACTGCGAAAAGT 
CGAAAACAACGATCATAGTATAAGCCGCGGTAAAAAGTGTTAAACCTTTCACACAAGTTT 
CTCTAGTGAATGTAGTTGTAAACTCTATTGTGTAAGGGTAATTTTGTAGTACCCACTTGT 
TGCTATTGAATGCTTGTTAGAGAGGATTCTTAGTGTAGTATATGATTAGGTTGGGGTTTG 
TTGTTTCATGAGATAAATAAATGTGTTTGATCAATGGTTAAGTCTTTGGTTTGTTGGTGT 
ATGTATGTAAATAAGGCTTTTGTTAGAAATAAGACAAATGGGACTGAAGTTGGAGTTTAA 
AA 

>G584 Amino Acid Sequence (domain in AA coordinates: 401-494) 

MSPTNVQWDYHLNQSKTDTT^ 

DNLQQRLQALIEGANEllTOTYAVFWQSSHGF 

KSNPASAAEQEHRKRVIRELNSLISGGVGGGDEAGDEEVTDTEWFFLVSMTQSFVKGTGL 
PGQAFSNSDTIWLSGSNALAGSSCERARQGQIYGLQTMVCVATENGVVELGSSEIIHQSS
DLVDKVDTFFNFNNGGGEFGSWAFNLNPDQGENDPGLWI^ 

DSTSNSDSQPISKLCNGSSVENPNPKVLKSCEMVNFKNGIENGQEEDSSNKKRSPVSNNE
EGRLSFTSVLPCDSiraSDLEASVAKEAESNRVVVE^^ 

ERQRREKLNQRFYSLRAVVPNVSKMDKASLLGDAISYISELKSKLQKAESDKEELQKQID
VMNKEAGNAKSSVKDRKCLNQESSVLIEMEVDVKIIGWDAMIRIQCSKRNHPGAKFMEAL
KELDLEVNHASLSVVNDLMIQQATVKMGNQFFTQDQLKVALTEKVGECP*
>G668 (1..1056) 

ATGGGAAGACCACCTTGCTGTGAAAAGATTGGAGTGAAGAAAGGGCCATGGACACCAGAG 

GAAGACATCATCTTGGTTTCTTACATCCAAGAACATGGTCCTGGAAACTGGAGATCTGTC 

CCAACACACACAGGTTTAAGATGTAGCAAGAGCTGCAGATTGAGATGGACTAATTATCTT 

CGACCCGGTATTAAGCGTGGAAATTTTACTGAGCATGAAGAGAAGACAATTGTTCATCTT 

CAAGCCCTTTTAGGCAACAGATGGGCAGCCATAGCATCATACCTTCCAGAAAGGACAGAC 

AATGATATAAAGAACTATTGGAACACTCACTTGAAGAAGAAGCTCAAAAAGATTAATGAA 

TCTGGTGAAGAAGATAATGATGGTGTCTCTTCATCAAACACTAGTTCACAAAAGAACCAT 

CAAAGCACTAACAAAGGTCAATGGGAAAGAAGACTTCAGACAGACATTAACATGGCAAAA 

CAAGCTCTTTGTGAGGCCTTGTCTTTAGACAAACCATCATCCACTCTTTCATCATCTTCA 

TCATTACCGACACCAGTAATCACACAACAAAACATCCGTAACTTCTCATC^ 

GACCGTTGTTATGATCCATCCTCTTCTTCTTC^^ 

ACTACTAATCCATACCCATCAGGGGTATATGCGTCAAGTGCTGAGAACATCGCCCGGTTG 
CTTCAAGATTTCATGAAAGACAC^CCCAAGGCTTTAACTTTATCATCTTCATCTCC 
TCAGAGACTGGACCACTCACTGCTGCAGTCTCGGAAGAAGGTGGAGAAGGGTTTGAACAA 
TCTTTCTTCAGCTTCAATT^^ 

TTCCATGATCAAGTGATCAAACCGGAAATAACAATGGACCAAGATCATGGTCTAATATCA 
CAAGGGTCTCTGTCTTTGTTTGAGAAATGGTTATTTGATGAGCAAAGCCACGAGATGGTT
GGTATGGCACTAGCAGGACAAGAAGGGATGTTCTAG 

>G668 Amino Acid Sequence (domain in AA coordinates: 13-113) 
MGRPPCCEKIGVKKGPWTPEEDIILVSYIQEHGPGNWRSVPTHTGLRCSKSCRLRWTNYL 
RPGIKRGNFTEHEEKTIVHLQALLGNRWAAIASY 

SGEEDNDGVSSSNTSSQKNHQSTNKGQWERRLQTDINMAKQALCEALSLDKPSSTLSSSS 
SLPTPVITQQNIRNFSSALLDRCYDPSSSSSSTTTTTTSNTTNPYPSGVYASSAENIARL 
LQDFMKDTPKALTLSSSSPVSETGPLTAAVSEEGGEGFEQSFFSFNSMDETQNLTQETSF 
FHDQVIKPEITMDQDHGLISQGSLSLFEKWLFDEQSHEMVGMALAGQEGMF* 
>G1050 (23.. 1582) 

TTCCCCATTTCAGAAAATCAAAATGGGTGGTGGTGGTGATACAACAGATACCAATATGAT 

GCAGAGAGTTAATTCTTCTTCTGGTACATCGTCTTCTTCGATCCCTAAACACAATCTTCA 

CTTGAATCCTGCTCTTATCCGCTCTCACCATCACTTCCGTCACCCTTTCACCGGAGCTCC 

TCCACCGCCGATTCCACCCATTTCTCCTTACTCTCAGATCCCGGCGACTTTACAACCTAG 

ACATTCTCGCTCTATGTCGCAACCGTCTTCTTTCTTCTCCTTTGATTCATTGCCGCCGTT 

AAATCCTTCTGCTCCGTCGGTTTCGGTGTCGGTGGAGGAGAAAACCGGTGCCGGATTTAG 

TCCTTCGTTGCCTCCGTCACCGTTTACGATGTGTCATTCTTCTAGCTCTAGGAACGCCGG 

AGATGGAGAGAATCTACCTCCGAGAAAGTCGCATAGGCGTTCGAATAGTGATGTTACTTT 

TGGGTTTAGTTCAATGATGTCTCAGAATCAAAAGTCTCCTCCTTTGAGTTCTTTGGAGAG 

ATCGATCTCTGGTGAAGATACATCAGATTGGTCTAATTTGGTGAAGAAAGAACCGAGAGA 

AGGCTTCTACAAGGGAAGAAAACCAGAGGTTGAAGCAGCTATGGACGATGTTTTCACGGC 

TTATATGAATCTTGATAACATTGATGTCTTGAATTCTTTTGGAGGTGAAGATGGCAAGAA 

TGGGAATGAGAATGTGGAGGAGATGGAGAGTAGTAGAGGTAGTGGTACAAAGAAGACGAA 

TGGTGGAAGTAGTAGTGATTCTGAAGGAGATAGCAGTGCGAGTGGGAATGTGAAGGTTGC 

GTTGAGTTCTTCTTCTTCAGGCGTGAAGAGAAGAGCAGGTGGAGATATTGCTCCTACTGG 

TAGACATTACAGGAGTGTTTCTATGGACAGTTGTTTCATGGGGAAGTTGAATTTCGGCGA 

CGAATCATCGCTAAAGCTTCCGCCTTCTTCATCAGCTAAAGTTTCCCCAACCAATTCAGG 

TGAAGGGAATTCAAGTGCTTATAGTGTTGAATTTGGAAACAGTGAGTTTACTGCAGCTGA 

AATGAAGAAGATTGCAGCTGATGAGAAACTCGCTGAGATTGTAATGGCTGACCCTAAGCG 

TGTTAAAAGAATCTTGGCGAACCGCGTAT^

ATACATGGCAGAGTTGGAACACAAGGTGCAGACACTTCAGACTGAAGCTACTACATTATC 
GGCTCAGCTCACACATTTGCAGAGAGATTCTATGGGGTTGACAAACCAGAACAGTGAGCT 
GAAGTTTCGTCTTCAAGCTATGGAGCAGCAAGCACAACTCCGCGATGCTCTGTCAGAGAA 
ACTGAATGAAGAAGTCCAGCGGTTGAAACTGGTGATAGGGGAGCCGAACCGCAGGCAAAG 
TGGGAGCAGCAGCAGCGAATCAAAGATGTCACTAAACCCGGAGATGTTTCAGCAGCTTAG 
CATAAGTCAGTTACAACACCAACAGATGCAGCATTCCAATCAGTGTAGCACAATGAAAGC 
AAAGCACACTTCAAACGACTAGGGTAAGTAAAACTGCGATCCGCAGTTGTCTAGTTACAT 
ATATGATAAGAATCTTTTGTGCAGAGTTCTGTTTTTGGAAGTTTTAAAGAAACATATATA 
AAGATTATGTCCGGGAAATTTGATCATATTTCCTGAAACATACACACATATATATAGTGG 
TAATGGAGGACTTTCTTTCTGGACCA 

>G1050 Amino Acid Sequence (domain in AA coordinates: 372-425) 
MGGGGDTTDTNMMQRVNSSSGTSSSSIPKHNLHLNPALIRSHHHFRHPFTGAPPPPIPPI
SPYSQIPATLQPRHSRSMSQPSSFFSFDSLPPLNPSAPSVSVSVEEKTGAGFSPSLPPSP 
FTMCHSSSSRNAGDGENLPPRKSHRRSNSDVTFGFSSMMSQNQKSPPLSSLERSISGEDT 
SDWSNLVKKEPREGFYKGRKPEVEAAMDDVFTAYMNLDNIDVLN^ 

MESSRGSGTKKTNGGSSSDSEGDSSASGNVKVALSSSSSGVKRRAGGDIAPTGRHYRSVS 

MDSCFMGKLNFGDESSLKLPPSSSAKVSPTNSGEGNSSAYSVEFGNSEFTAAEMKKIAAD 

EKLAEIVMADPKRVi^ILANRVSAARSKERKTRYMAELEHKVQTLQT 

RDSMGLTNQNSELKFRLQAMEQQAQLRDALSEKLNEEVQRLKLVIGEPNRRQSGSSSSES 

KMSLNPEMFQQLSISQLQHQQMQHSNQCSTMKAKHTSND*

>G1463 (199.. 1209) 

TATCCTTCGCAAGACCCTTCCTCTATATAAGGAAGTTCATTTCATTTGGAGAGGACACGC 

TGACAAGCTGACTCTAGCAGATCTGGTACCGTCGACAGTTTGAGATTTGCTT 

TTTTTTATTTTCTGCAAAATATGTCACTCTCTCCCATTTTGTTCATATAT 

AAGTTTGATCAACTTAGTATGCGTTTCTTT

TTTAGTTTCGTTATGGCGGACACACTGCT 

TATCTGAAGCCTATGATCGTTAACAGAGTATC^TGGCCTGATCTCTTCATCGAAGACGCA 
GACGTGTTCAACAAGGATCCATATGTGAAGTTCCATGCTGAGATCCCTAGCTTCGTGATC

GTTAAACCACGAACAAAGGCTTGTGGTAAAACCGATGGATGTGATTCGGGTTGCTGGAGG 

ATCATTGGTCGTGATAAGCTGATAAAGTCGGAGGAGACTGGTAAGATTCTAGGGTTCAAG 

AAGATACTCAAGTTCTGCCTAAAGTGGAAACCTAGAGAATACAAGAGAAGTTTGGTAATG 

GAAGAGTATAGGCTTACCAATAACTTCAACTGGAAGCAAGATCATGTGATTTGCAAGATT 

CGGCTTTTGTTTGAAGCAGAAATTAGTTTCTTGCTAGCCAAGCATTTCTACACTACATCA 

GACTCACTTCCTCGAAATGTGCTGTTGCCAGCTTATGGATTCTGTTCACCAGATAAACAA 

GAGGAGGACGAATTTTATCCGGTGACGATAATGATTTCAGAAGGAAAAGATTGGCCTAGC 

TACGTTACCAACAACGTGTATTGTCTGCATCCATCGGAGCTTGTGAATGTTCACGATGGG 

AAGTTTCATGATAACGGAATCTGCATCTTCGCTAACAGGACTTGTGGTGTAACCGATAAA 

TGCAATGAAGGTTACTGGAAGATTAAGCACCGTGAGAAGCTGATCATGTCACGGTACGGG

CAGACCATTGGTTGGAAGAAAGTTTTTCAGTTTTATGAAACGGAGAAAGAAAGACATTTT 

GGTAATGGAGAAGAAGTGAAGGTAACTTGGACTCTAAAAGAGTATAGGCTTACCAGAAAA 

ATGAACAAGAATAAAGTGGTGTGCGTTATCAAGTATAAGGTAAAGTGTTTACCGAGGATA 

ACTAGCTAGGGACTTCTACTCTTGGTTTCATGATCGATGCGACCGCTCTAGACAGGCCTC 

GTACCGGATCCTCTAGCTAGAGCTTTCGTTCGTATCATCGGTTTCGACAACGTTCGTCA 

>G1463 Amino Acid Sequence (conserved domain in AA coordinates: 9-156)

MRFFFSLVPLFLGRFSFVMADTLLNAEDEVIISRYLKPMIVimVSWPDLFIEDADVFNKD 

PYVKFHAEIPSFVIVKPRTKACGKTDGCDSGCWRIIGRDKLIKSEETGKILGFKKILKFC

LKWKPREYKRSLVT^EEYRLTNNFNWKQDHVICKIR 

VLLPAYGFCSPDKQEEDEFYPVTIMISEGKDWPSYVTNNVYCM 

ICIFANRTCGVTDKCNEGYWKIKHREKLIMSRYGQTIGWKKVFQFYETEKERHFGNGEEV

CTTWTLKEYRLTRKMNKNKVVCVIKYKVKCLPRITS* 

>G1944 (236.. 1306) 

TCGACCTTCCTAATTTCCAACCTCTGTTCTTAGCAATATATTTTTTCTCCAAAAATAATT 

CTGAGTTTGATTTTCTTCTTCTAGCTCTTAAGTATATTTCTTTGTTGTTATTTATCTTTT 

AATCCTTTAATCTCATCTTTGTTTATCTTTAATCAAAACCCAAAATTTACATGGGTTCTT 

GAAAATCTAGAAGAAATAAAGGAAACATAACAAAAATAGAAAGAAAAAGAAGCTAATGGT 

CTTAAATATGGAGTCTACCGGAGAAGCTGTTAGATCAACCACCGGTAACGACGGTGGTAT 

TACGGTGGTTAGATCCGACGCGCCGTCAGATTTCCACGTAGCTCAAAGATCAGAAAGCTC 

A^CCAATCTCCCACCTCTGTCACTCCTCCTCCACCACAGCCATCGTCTCATCACACAGC 

TCCTCCGCCGCTGCAAATTTCGACGGTGACGACTACGACTACGACGGCCGCGATGGAAGG 

TATCTCCGGTGGACTGATGAAGAAGAAGCGTGGACGGCCAAGGAAGTATGGACCGGACGG 

GACTGTTGTAGCGTTATCTCCTAAACCGATTTCATCAGCGCCGGCGCCGTCGCATCTTCC 

GCCGCCGAGTTCACACGTCATCGATTTCTCCGCTTCTGAGAAACGTAGCAAAGTGAAACC 

AACGi^ACTCGTTTAACAGAACAAAGTATCATCACCAAGTTGAGAATTTGGGTGAATGGGC 

TCCTTGCTCCGTCGGTGGTAATTTCACACCTCATATAATCACAGTCAACACCGGCGAGGA 

TGTAACAATGAAGATAATCTCGTTTTCGCAACAAGGACCTCGCTCTATTTGTGTTCTGTC 

AGCAAACGGTGTTATTTCAAGCGTTACACTTCGTCAGCCAGATTCCTCTGGCGGCACATT 

GACATACGAAGGTCGGTTTGAGATATTATCATTATCCGGGTCATTCATGCCTAATGATTC 

AGGCGGAACACGAAGTAGAACGGGAGGAATGAGTGTATCGTTAGCAAGTCCCGATGGACG 

TGTAGTAGGCGGTGGCCTCGCCGGTTTACTAGTAGCCGCGAGTCCGGTTCAGGTGGTTGT 

AGGAAGTTTTTTAGCGGGCACTGACCATCAAGATCAGAAACCGAAAAAGAACAAACATGA 

TTTCATGTTGTCGAGTCCTACCGCTGCAATTCCTATCTCTAGTGCAGCTGATCACCGGAC 

AATCCATTCGGTCTCGTCTCTTCCGGTCAATAATAATAGRTGGCAGACTTCTTTAGCTTC 

CGATCCAAGAAACAAGCATACCGATATTAATGTCAATGTAACTTGAAATCCAATCTTTCT 

CTGTATTTTCTGTTAACAAGTTTGATTTGGTTGTTTATCTACATTAGGATTTTACTAAAA 

TGGTAGTATTATTTATAGGGTTTTAGGG 

GGATA 

>G1944 Amino Acid Sequence (domain in AA coordinates: 87-100)
MVLNMESTGEAVTtSTTGNDGGI 

TAPPPLQISTVTTTTTTAAMEGISGGLMKKKRGRPRKYGPDGTVVALSPKPISSAPAPSH

LPPPSSHVIDFSASEKRSKV1CPTNSFNRTKYHHQVENLGEWAPCSV 

EDVTMKIISFSQQGPRSICVLSANGVISSVTLRQPDSSGGTLTYEGRFEILSLSGSFMPN

DSGGTRSRTGGMSVSLASPDGRVVGGGLAGLLVAASPVQVVVGSFLAGTDHQDQKPKKNK

HDFMLSSPTAAJPISSAADHRTIHSVSSLPVIJNOTWQT^ 

>G2383 (37.. 990) 
GACCTCTTTGATCCCTTCATTCCCCATCAAACAACCATGTTTCCTTCTTTCATTACTCAC
ATTCAAAGCCCTAATTCTCACCATCACTACTCTTCGCCTTCTTTTCCTTTCTCTTCCGAT 
TTTCTTGAGAGTTTTGATGAATCCTTCTTGATAAACCAATTCTTGTTACAGCAGCAAGAT 
GTAGCAGCAAATGTTGTTGAATCTCCTTGGAAATTTTGCAAGAAGCTTGAGCTTAAGAAG 
AAGAATGAGAAGTGTGTTGATGGAAGCACCTCACAAGAGGTTCAATGGAGAAGGACGGTC 
AAAAAAAGGGACAGGCATAGTAAGATCTGCACGGCTCAAGGTCCTAGAGACCGGAGGATG 
AGGCTGTCTCTTCAGATTGCTCGCAAGTTTTTCGATCTTCAAGACATGTTGGGTTTCGAC 
AAGGCGAGCAAGACGATTGAATGGCTTTTCTCCAAATCAAAGACTTCCATCAAACAACTT 
AAAGAAAGAGTGGCTGCATCGGAAGGAGGAGGAAAGGATGAACATCTCCAGGTTGATGAA 
AAGGAAAAGGATGAGACACTGAAGTTGAGAGTCTCAJ^GAGAAGAACAAAGACTATGGAG 
AGCTCTTTTAAGACTAAAGAGTCGAGAGAGAGAGCTAGAAAGCGAGCAAGAGAGAGAACA 
ATGGCAAAGATGAAGATGAGATTATTTGAGACCTCGGAAACAATTTCAGATCCTCATCAA 
GAAACTAGAGAGATCAAGATAACCAATGGTGTACAATTACTAGAAAAGGAAAATAAAGAA 
CAAGAATGGAGTAATACTAATGATGTTCACATGGTAGAGTATCAAATGGATTCTGTGAGC 
ATCATAGAGAAGTTTCTTGGACTAACCAGTGACTCTAGCTCCTCTTCCATTTTTGGTGAC 
TCCGAGGAATGTTACACAAGTCTTAGTTCAGTAAGAGGTACAATTTCAGCAGCAGGTAAC 
AGCAATGTGTTAACTAAAAACCCTAATTGAGTAATGCAGTTTTGATTAATATTAGCTTTT 
TGGTAATTCCAGGAATGTCGACACCAAGGG 

>G2383 Amino Acid Sequence (conserved domain in AA coordinates: 89-149)

MFPSFITHIQSPNSHHHYSSPSFPFSSDFLESFDESFLINQFLLQQQDVAANVVESPWKF 

CKKLELKXKNEKCVDGSTSQEVQWRRTVKKRDRHSKICTAQGPRDRRMRLS 

LQDMLGFDKASKTIEWLFSKSKTSIKQLKERVAASEGGGKDEHLQVDEKEKDETLKLRVS

KRRTKTMESSFKTKESRERARKRARERTMAKMKKRLFETSETISDPHQETREIKITO 

LLEKENKEQEWSNTNDVHMVEYQMDSVSIIEKFLGLTSDSSSSSIFGDSEECYTSLSSVR

GTISAAGNSNVLTKNPN* 

>G571 (326.. 1708) 

TAGCCGACCTCTCTTCTCTCTTCTGAAAAAAACACCAAAGGAGCTTTAAATGCTCCGTTA 

CATAATCTCTATCTCTTTCCAAGAATATAGAGAAAGGAAAATAATATACAAGAATTAAAA 

GAAGGTATATCATCATCTCTCTAGCTAGTGATCAAAGCACCGTCATCATCATCATATATC 

ATCAGCTTGCCTCAGAGGAGAAGACCAACATAAGAGAGATCGAAGATCAAAATCTATCTC 

TCTTCATCATCTTCTGCTGTTACTATCATATCACACGCTCTCTCAAACATCATCCTATAT 

ATAGACTTCTCTTCATCATCATCAAATGCAAGGTC^TCACCAGAATCATCATCAACACTT 

ATCATCATCCTCCGCCACGTCTTCCCATGGAAACTTCATGAACAAAGATGGGTATGATAT 

TGGAGAGATAGACCCATCACTCTTCCTCTATCTTGATGGACAAGGACATCATGATCCTCC 

ATCAACTGCTCCTTCTCCTTTACATCATCATCACACAACTCAGAATTTGGCGATGAGACC 

TCCAACATCGACGCTCAACATCTTTCCATCTCAGCCTATGCACATAGAGCCACCTCCTTC 

TTCTACACACAATACCGATAATACAAGATTAGTTCCGGCTGCTCAACCTAGTGGTTCCAC 

TCGACCAGCTTCTGACCCGTCCATGGACTTGACCAATCATTCTCAGTTTCATCAACCTCC 

TCAAGGTTCTAAATCCATCAAGAAGGAAGGGAACCGCAAGGGTCTTGCCTCATCGGACCA 

TGACATACCTAAATCGTC^VGACCCTAAAACATTGAGAAGACTAGCACAAAACAGAGAAGC 

AGCAAGAAAAAGCAGATTACGTAAAAAGGCTTATGTTCAGCAACTCGAGTCATGTAGGAT 

CAAACTGACCCAACTAGAACAAGAGATTCAACGGGCCAGATCCCAAGGCGTATTCTTTGG 

AGGGTCTCTTATAGGAGGAGATCAACAGCAAGGTGGACTACCCATTGGCCCTGGCAACAT 

CAGCTCTGAAGCAGCGGTGTTCGATATGGAATATGCGAGGTGGCTGGAGGAGCAGCAGAG 

GCTATTAAACGT^CTAAGGGTGGCAACACAAGAACACTTGTCCGAGAACGAGCTTAGGAT 

GTTTGTGGACACATGTTTAGCTCATTATGACCATTTGATTAACCTCAAGGCTATGGTCGC 

TAAGACCGATGTCTTCCACCTCATTTCTGGAGCATGGAAAACTCCAGCTGAACGTTGCTT 

CTTGTGGATGGGTGGTTTCCGTCCATCGGAGATCATTAAGGTGATTGTGAACCAGATAGA

ACCATTGACGGAGCAACAGATAGTTGGGATATGTGGGCTGC^CAGTCCACACAAGAGGC 

CGAGGAGGCTCTCTCGCAAGGCCTCGAGGCGTTGAATCAATCACTTTCCGATAGCATTGT 

CTCTGACTCCCTCCCGCCTGCCTCCG(^CCACTTCCTCCTC^TCTATCCAATTTCATGTC 

ACACATGTCCTTAGCTCTCAACAAGCTCTCTGCTCTCGAGGGCTTCGTTCTCCAGGCGGA 

TAATTTGAGGCACCAAACGATCC^TAGGCTGAACCAATTGTTGACGACCCGTCAAGAAGC 

ACGGTGTCTTCTAGCCGTTGCGGAGTACTTCCACCGTCTTCAAGCTCTAAGTTC 

GCTAGCCCGTCCTCGGCAAGATGGATAATACTA7U\ACAACTGATGAAGGAAACCAAAAAC 

AAAAACAAGAGAATAGGTTGATTAGTTAGCCGCCAGCTTGACCTCTTTATCATATATATC 

GTCTCTCTACTCAAATACAGTGCAATTAGGGAAAATTGTTTGGCTTCTTTTTGGTATATG 
ATTCTTACTATTATGTTTTTAATCAAGA

>G571 Amino Acid Sequence (domain in AA coordinates: 160-220)

MQGHHQNHHQHLSSSSATSSHGNFMNKDGYDIGEIDPSLFLYLDGQGHHDPPSTAPSPLH

HHHTTQNLAMRPPTSTLNIFPSQPMHIEPPPSSTHNTDNTRLVPAAQPSGSTRPASDPSM 

DLTNHSQFHQPPQGSKSIKKEGNRKGLASSDHDIPKSSDPKTLRRLAQNREAARKSRLRK

KAYVQQLESCRIKLTQLEQEIQRARSQGVFFGGSLIGGDQQQGGLPIGPGNISSEAAVFD 

MEYARWLEEQQRLLNELRVATQEHLSENELRMFVDTCLAHYDHLINLKAMVAKTDVFHLI 

SGAWKTPAERCFLWMGGFRPSEIIKVIVNQIEPLTEQQIVGICGLQQSTQEAEEALSQGL 

EALNQSLSDSIVSDSLPPASAPLPPHLSNFMSHMSLALNKLSALEGFVLQADNLRHQTIH 

RLNQLLTTRQEARCLLAVAEYFHRLQALSSLWLARPRQDG*

>G636 (6.. 1814) 

CGATGATGCAACTGGGTGGTGGTACTCCGACCACTACAGCGGCGGCTACAACCGTCACAA 

CTGCTACAGCACCACCGCCACAATCAAACAACAACGATTCAGCGGCAACAGAAGCAGCGG 

CAGCAGCGGTTGGGGCGTTTGAGGTGTCGGAAGAGATGCACGACCGTGGGTTTGGAGGAA 

ATCGTTGGCCGCGGCAGGAAACGCTAGCGTTGTTGAAAATACGATCTGACATGGGAATAG 

CGTTTCGAGACGCTAGCGTTAAAGGTCCCTTATGGGAAGAGGTTTCTAGGAAAATGGCGG

AGCATGGTTACATAAGAAACGCAAAGAAATGCAAAGAGAAATTCGAGAACGTTTACAAAT 

ACCACAAACGAACCAAAGAAGGTCGTACCGGAAAATCCGAAGGCAAAACTTATCGCTTCT 

TTGATCAATTAGAAGCTCTCGAGTCTCAATCTACAACCTCACTCCACCATCATCAACAAC 

AT^ACGCCTCTTCGACCACAGCAAAACAACAACAACAACAACAACAACAACAACAACAGCT 

CCATATTTTCAACTCCTCCTCCGGTAACGACAGTTATGCCGACGCTTCCTTCTTCATCAA 

TTCCTCCGTATACTCAGCAGATTAATGTACCTTCGTTTCCAAACATCTCCGGTGATTTTC 

TATCGGATAATTCTACATCGTCTTCGTCTTCTTATTCGACTTCTTCTGACATGGAGATGG 

GTGGTGGAACTGCGACTACAAGGAAGAAAAGGAAGAGGAAATGGAAGGTGTTTTTCGAGC 

GGTTGATGAAACAAGTAGTTGATAAACAGGAAGAGCTTCAACGCACATTCTTGGAAGCTG

TTGAAAAGCGAGAACACAAGAGATTGGTTAGAGAAGAGTCTTGGAGAGTTCAAGAGATTG 

CCAGAATCAACCGCGAGCACGAGATCTTAGCTCAAGAACGCTCTATGTCCGCTGCAAAAG 

ACGCTGCTGTTATGGCCTTTCTTCAAAAACTGTCAGAGAAACAACCGAATCAGCCACAAC

CGCAGCCTCAGCCGCAACAAGTTCGACCATCAATGCAGCTTAATAACAACAATCAGCAGC 

AACCGCCTCAACGGTCTCCTCCACCGCAACCTCCTGCTCCGCTTCCGCAGCCAATTCAAG 

CGGTTGTGTCGACGTTAGACACAACGAAAACGCACAATCGTGGTGATCAGAATATGACTC 

CTGCAGCTTCAGCGAGCTCGTCGCGGTGGCCGAAAGTGGAGATAGAAGCATTGATAAAGC 

TGAGGACGAATCTTGATTCGAAATATCAAGAAAACGGACCAAAAGGACCATTGTGGGAAG 

AGATATCAGCGGGAATGAGAAGGTTAGGATTCAACAGGAACTCAAAGAGATGCAAAGAGA 

AATGGGAAAACATAAACAAATACTTCAAGAAAGTCAAAGAGAGCAACAAGAAACGTCCCG 

AAGATTCCAAGACTTGCCCTTACTTTCACCAGCTTGATGCTTTATATAGAGAGAGGAACA 

AATTCCACAGCAACAACAACATTGCAGCTTCTTCTTCATCTTCCGGTCTTGTTAAACCGG 

ATAATTCTGTTCCCTTGATGGTCCAACCAGAGCAGCAATGGCCTCCGGCTGTAACGACTG 

CGACAACTACTCCCGCAGCGGCTCAGCCTGATCAGCAATCTCAGCCGTCGGAGCAGAACT 

TTGATGATGAAGAAGGTACAGATGAAGAGTACGACGATGAAGATGAGGAAGAGGAGAATG 

AAGAAGAGGAAGGAGGTGAGTTCGAGCTTGTGCCTAGCAATAACAACAACAACAAGACGA 

CGAATAATCTGTAATGATGATGATTCGAGTTCGAACCGGTTTGGTGGTGAAAGATTAGTA 

ATCTTTTTTTAAGTTTTGATACAGAACATGAGAATTTAAATATTGGAGGGTTT 

>G636 Amino Acid Sequence (domain in AA coordinates: 55-145, 405-498) 

MQLGGGTPTTTAAATTVTTATAPPPQSNl^ 

WPRQETLALLKIRSDMGIAFRDASVKGPLWEEVSRKMAEHGYIRNAKKCKEKFENVYKYH 
KICTKEGRTGKSEGKTYRFFDQLEALESQSTTSLHHHQQQT^ 

FSTPPPVTTVMPTLPSSSIPPYTQQINVPSFPNISGDFLSDNSTSSSSSYSTSSDMEMGG

GTATTRKKRKRKWKVFFERLMKQVVDKQEELQRTFLEAVEKREHKRLVREESWR 

Il^EHEILAQERSMSAAKDAAVMAFLQKLSEKQPNQPQPQPQPQ 

PQRSPPPQPPAPLPQPIQAVVSTLDTTKTHNRGDQNMTPAASASSSRWPKVEIEALIKLR
TNLDSKYQENGPKGPLWEEISAGMRRLGFNRNSKRCKEKWENINKYFKKVKESNKKRPED
SKTCPYFHQLDALYRERNKFHSNNNIAASSSSSGLVKPDNSVPLMVQPEQQWPPAVTTAT
TTPAAAQPDQQSQPSEQNFDDEEGTDEEYDDEDEEEENEEEEGGEFELVPSNNNNNKTTN
NL*

>G878 (197.. 1738)

CAAAAAAAATCTCTCCCATTAAAAGACTGCCCAAAGAAATATTTTATACAAAATGAAAGA 
TAGCGGTGGCGTTGGATTTAGTCCTGGACCAATi



'TTT 



TCCTGATGAGTTCAAGTCTTTCTCTCAGCTTTT, 



GACTCTCGTCTCAAATTTATTCTCTGA 



'AGCTGGAGCTATGGCTTCTCCGGCGGC 



AGCTGCTGTTGCCGCCGCTGCTGTGGTTGCTACTGCTCATCATCAGACACCT 

AACGGGATTGATGATAACTCAACCACCGGGGATGTTTACTGTACCGCCGrrrTTa^arTn^ 
GGCTACTCTTTTGGATTCTCCGAGCTTCTTTGGTCTTTTCTCACCTCTTC^ 



^atcccaagcgaag^ 

tgatggatataggtggcgcaagtatggtcagaaagtagtcaaaggaaS^^^ 



^aagaagaatacgacggcgct^^ 



TGTACATGGGATAAACAAAATTTACA< 



>G1134 (61.. 849) 

TTGCTTACCGGGAACTCGAACGATTTAC ^^^AATCTTGGTCTC^CCGAT 



CCTGTTGAGCAAGGGTTGTATCAACAAi 



CCGACAAGTCGCGGCTCGTTCGAGTTCCCGATT 



^^GTTTCACCGACAGAATAGTACTCCGGCG 

AT 

CTCTTCTCTTCTCCTGAGTTTACTTCTCAAAT™ " * ^^^^ 1 u ^^^^^^^GCA 



TACTTATCGGGGAATATCGATGTTTCTCCGGGAAGT^GCGGTCT 
CTCTTCTCTTCTCCTGAGTTTACTTCTC FAAGCGGTCT, 

CCTACCGGAGTATCAAGCATGTCGGATA' 
GCTTTTAGGGTTCGGGCTAAACGTGGTTt 
GTACGAAGGACGCGGATTAGTGATCGGA* 



CCTACCGGAGTATCAAGCATGTCGGAT^^^ 

'GCGCAAC 

^TAAGGAAGCTACTUVGAGCTTGTACCTAACATG 
GACAAGCAAACCAACACTGCAGACATGTTAGAAGAAGCAGTAGAATACGTGAAAGTTCTT
CAAAGGCAGATCCAGGAGTTAACAGAAGAACAGAAGAGGTGCACATGCATACCTAAGGAA 
GAACAATAAGGTTTGCTCCTGATTTGTTTTATATTTGCTTAACGGCAATGATCTGATCGA 
AAAATTCGAAAGATGATCTTAGCTTGAATTTAGATGGATGTCATGTTGAAAAGTATATTA 
TTTGATAAATGGATGTAGGTGTAATATAAAATTTTTGTACAATAATGAAGAAAGTTAAAA 
AGAATTAATGAAAACATATATTCTTTATGATATAAAAAAAAAAA 

>G1134 Amino Acid Sequence (domain in AA coordinates: 198-247) 

MQPTSVGSSGGGDDGGGRGGGGGLSRSGLSRIRSAPATWLEALLEEDEEESLKPNLGLTD 

LLTGNSNDLPTSRGSFEFPIPVEQGLYQQGGFHRQNSTPADFLSGSDGFIQSFGIQANYD 

YLSGNIDVSPGSKRSREMEALFSSPEFTSQMKGEQSSGQVPTGVSSMSDMNMENLMEDSV 

AFRVRAKRGCATHPRSIAERVRRTRISDRIRKLQELVPNMDKQTNTADMLEEAVEYVKVL

QRQIQELTEEQKRCTCIPKEEQ* 

>G1008 (89.. 973) 

GCCTTTTTGACTCTTCTTTCTCTCTTCTACTTTTTTTCAGGCTCTCTCTCTATATCTCTA 

TCTTCTTCTCCGGTTAACTAAAAGAGAAATGAAAAGCCGAGTGAGAAAATCCAAGTACAC 

GGTTCACCGGAAAATCACATCCACACCGTTCGACGGTTTCCCGAAGATTGTCAAAATCAT 

AGTCACTGACCCATGCGCTACTGATTCTTCCAGCGATGAGGAAAACGACAACAAATCTGT 

TGCTCCGAGGGTGAAACGTTATGTGGATGAGATCAGGTTCTGTGACGAAGATGACGAACC 

TAAACCGGCGAGGAAAGCGAAGAAAAAGTCCCCGGCGGCTGCGGCGGAGAACGGTGGAGA 

TTTGGTAAAGTCTGTGGTGAAGTATAGAGGAGTGAGACAACGACCTTGGGGAAAATTTGC 

GGCGGAGATTCGTGATCCTTCGAGTCGTACTAGACTCTGGCTTGGGACTTTTGCGACGGC 

GGAGGAAGCTGCTATAGGTTACGATAGAGCCGCGATTCGAATCAAAGGTCATAACGCTCA 

GACGAATTTTCTCACTCCTCCTCCTAGTCCGACGACTGAGGTGTTACCGGAAACTCCGGT 

GATTGACCTTGAAACTGTCTCTGGTTGTGATTCGGCGAGGGAATCGCAAATCAGTCTGTG 

TTCTCCGACTTCTGTTCTCCGGTTTAGTCACAACGACGAAACAGAGTACAGAACAGAGCC 

AACGGAAGAACAAAATCCGTTTTTCTTGCCTGATTTGTTTCGCTCCGGAGATTATTTTTG 

GGATTCCGAAATTACCCCTGACCCTTTGTTTCTCGACGAATTCCACCAGTCCTTGTTACC 

AAACATCAACAACAACAACACAGTGTGTGATAAGGATACGAATCTGTCTGATAGTTTTCC 

GTTGGGAGTGATCGGAGATTTCAGCTCATGGGATGTTGATGAGTTTTTCCAAGATCATTT 

GTTGGATAAGTAATTTGATGAGTTCTTCCCCAGAATTTTTCTGGGTTTCTCTTTTTGGTT 

GTGTGAGTGAGATGAGTGGTTTGATGACAACGACGGGGATGAATCTTAGCCGTCCGTTTT 

CCATTTCGTGGACGGCTCCGATCAGCGGAAGAAGCGCAACGGAGTTTTTATTTATCTGTT 

TGAGAATTTTATAATTTAATTTGCGAGTAAATATAGTAATTAGTGTTAAGATTGTGAGAG 

TTTAAGTTAATTAGGGAGGGGTTTTGAATATTGGGGATTTTGGGAGGTTTTTGTTTGGTT 

TCTCTCCAAGTCTGTCACTATGCAAGGAAGCAGTATAAAGACCGTATATATATTTTATTA 

TTAATATTGATAAAAGTAAAAAAAAAAAAAAAAA 

>G1008 Amino Acid Sequence (domain in AA coordinates: 96-163) 
MKSRVRKSKYTVHRKITSTPFDGFPKIVKIIVTDPCATDSSSDEENDNKSVAPRVKRYVD
EIRFCDEDDEPKPARKAKKKSPAAAAENGGDLVKSVVKYRGVRQRPWGKFAAEIRDPSSR
TRLWLGTFATAEEAAIGYDRAAIRIKGHNAQTNFLTPPPSPTTEVLPETPVIDLETVSGC
DSARESQISLCSPTSVLRFSHNDETEYRTEPTEEQNPFFLPDLFRSGDYFWDSEITPDPL
FLDEFHQSLLPNINNNNTVCDKDTNLSDSFPLGVIGDFSSWDVDEFFQDHLLDK*

>G1020 (132.. 689) 

CTGTTCACAAGAAAGCTCCCCAAAAGGAGCGTTGCTTTACTCTCCTATAAAAAGAAGCTC 

TTCTACTTCTTCTCGTTACCACAAAACTCTTTCACCGATCTTCTCGTTCCATTCTTCTTC 

CTAATTACACCATGCCCAACATCACCATGGGTTTGAAACCCGACCCGGTTGCTCCAACGA 

ACCCGACTCATCATGAGAGTAATGCTGCCAAAGAGATTCGTTACAGAGGCGTTAGGAAAC 

GTCCATGGGGAAGATACGCCGCTGAGATCCGAGATCCGGTTAAGAAAACTCGAGTCTGGC

TCGGTACGTTCGACACCGCTCAGCAGGCGGCGCGTGCTTACGACGCAGCCGCGCGTGACT 

TTCGTGGTGTTAAGGCTAAGACCAATTTCGGTGTTATCGTTGGTAGTAGTCCTACTCAGA

GTAGCACCGTCGTCGACTCTCCCACGGCGGCACGGTTTATAACACCTCCGCACCTCGAGC 

TCAGCTTAGGCGGCGGCGGCGCGTGTCGTCGTAAGATCCCGCTTGTGCATCCGGTTTACT 

ACTATAACATGGCGACGTATCCAAAGATGACGACGTGTGGTGTCCAGAGCGAGTCTGAAA 

CGTCGTCGGTCGTTGATTTCGAAGGTGGAGCTGGGAAGATATCTCCGCCGTTAGATCTGG 

ATCTTAACTTAGCTCCTCCGGCGGAATAGGCCGTGAGTTTTTTTTTTCTTATGTCGTOT 

TTTAGACAAAAAAAAATAACGITTCCTTTTTTTTTCTGCCTAAGAAAA^ 

TTTTTTAGAAGAAAAAAAAAAAAAAAAAAAAAA 
>G1020 Amino Acid Sequence (domain in AA coordinates: 28-95)
MPNITMGLKPDPVAPTNPTHHESNAAKEIRYRGVRKRPWGRYAAEIRDPVKKTRVWLGTF
DTAQQAARAYDAAARDFRGVKAKTNFGVIVGSSPTQSSTVVDSPTAARFITPPHLELSLG
GGGACRRKIPLVHPVYYYNMATYPKMTTCGVQSESETSSVVDFEGGAGKISPPLDLDLNL

APPAE* 

>G1023 (252.. 1250)

TCGTCTTCTTAATCGCTTTCTGCTCTGTTTTTCTCGTTCATCAAGCTACATCTACTAGCT 

CTCTCAGTGATTGATTTCTCACAGTTTCATCGATTTCCATGCGTTTAAGACCTAAAAGGA 

CTTGTTCTGGGGTAAAGGACTTTTCTTGTTCTTGAGAGAGTTCATTTTGAGGCTTTTCTG 

GGAATTTTGAGAGGTTTTTTAGGGTTTAAGGGGGTTTGGTTTTGAATTTCGCACACCAAG 

TGTTCGATAAAATGGCTGAACGZ^AAGAAACGCTCTTCTATTCAAACCAATAAACCCAACA 

AAAAACCCATGAAGAAGAAACCTTTTCAGCTAAATCACCTCCCAGGTTTATCTGAAGATT 

TGAAGACTATGAGAAAACTCCGTTTCGTTGTGAATGATCCTTACGCTACTGACTACTCAT 

CAAGCGAAGAAGAAGAAAGGAGTCAGAGAAGGAAACGTTATGTCTGTGAGATCGATCTTC 

CTTTCGCTCAAGCTGCTACTCAAGCAGAATCTGAAAGCTCATATTGTCAGGAGAGTAACA 

ATAATGGTGTAAGCAAGACTAAAATCTCAGCTTGTAGCAAAAAGGTTTTACGCAGCAAAG 

CATCTCCGGTCGTTGGACGTTCTTCTACTACTGTCTCGAAGCCTGTTGGTGTTAGGCAGA 

GGAAATGGGGTAAATGGGCTGCTGAGATTAGACATCCAATCACCAAAGTAAGAACTTGGT 

TGGGTACTTACGAGACGCTTGAACAAGCAGCTGATGCTTATGCTACCAAGAAGCTTGAGT 

TTGATGCTCTGGCTGCAGCCACTTCTGCTGCTTCCTCTGTTTTGTCAAATGAGTCTGGTT 

CTATGATCTCAGCCTCAGGGTCAAGCATTGATCTTGACAAGAAGCTAGTTGATTCGACTC 

TTGATCAACAAGCTGGTGAATCGAAGAAAGCGAGTTTTGATTTCGACTTTGCAGATCTAC 

AGATTCCTGAAATGGGTTGCTTCATTGATGACTCATTCATCCCAAATGCTTGTGAGCTTG 

ATTTTCTCTTAACAGAAGAGAACAACAACCAAATGTTGGATGATTACTGTGGCATAGATG

ATCTGGACATCATTGGTCTTGAATGTGACGGTCCAAGCGAACTTCCAGACTATGATTTCT 

CAGATGTGGAGATCGATCTTGGTCTCATTGGAACCACCATTGACAAGTATGCTTTCGTTG 

ATCATATCGCAAC^ACTACTCCCACTCCTCTTAATATCGCGTGCCCATAAGTTTTGCAGC 

TAGGTGTTATTATTAGCTATAGGAGCAACGTAAAAAGCTCGTTGTTACTCGGTTTTGTCT 

TAAGTTATTAAAGTATAGCAGAGGCAGTTAATCTCAAGGGAAGCAAAAACCCTAAAGATA 

GAAGCAGATGC^GTTTTGTGTGTTGGTGTTACTAAAGAAAGTTTTGTTGACATAATGGTT 

TTGATGTTGTGGAGAAGATAGAGAGGTGTGATCGAAATTGTAAATCTCAGGTGGTTTTTT 

TTGAAGGCAATTGTTTCTCATTTAGGGTTTTTTTCTATATGAGGATTGTCTTTGAAAAGC 

CTTTAGATGTTTTCTAATTCGTAAGCTCTCTCAATCTTTGTAAGTTTTGCCTGTTGAGTT 

ATTGATACATATGTGAGACCTACTTTATTTGTTTTGTGCTACATACATTGTTGATGGTTT 

CGTCAAAAAAAAA 

>G1023 Amino Acid Sequence (conserved domain in AA coordinates: 128-195)
MAERKKRSSIQTNKPNKKPMKKKPFQLNHLP 

EERSQRRKRYVCEIDLPFAQAATQAESESSYCQESNNNGVSKTKISACSKKVLRSKASPV 

VGRSSTTVSKPVGVRQRKWGKWAAEIRHPITKVRTWLGTYETLEQAADAYATKKLEFDAL

AAATSAASSVLSNESGSMISASGSSIDLDKKLVDSTLDQQAGESKKASFDFDFADLQIPE

MGCFIDDSFIPNACELDFLLTEENNNQMLDDYCGIDDLDIIGLECDGPSELPDYDFSDVE

IDLGLIGTTIDKYAFVDHIATTTPTPLNIACP* 

>G1053 (38.. 538) 

GAAACTCTTACATACTCATATAAACCAAACTAAAACCATGATTCCGGCAGAAATCAACGG 

ATATTTCCAATATCTATCACCGGAATACAACGTAATAAACATGCCTTCATCTCCAACCTC 

TTCCTTAAACTACCTAAACGATTTGATCATCAACAACAACAACTATTCCTCATCATCCAA 

CAGTCAAGATCTCATGATAAGCAACAACTCAACTTCCGACGAAGATCATCATCAAAGCAT 

CATGGTACTCGACGAGAGGAAACAGAGAAGGATGCTTTCGAACAGAGAATCTGCAAGGAG 

GTCAAGGATGAGGAAACAGAGACATCTTGATGAACTCTGGTCTCAGGTAATAAGGCTTCG 

CAACGAGAACAACTGTCTTATCGATAAGCTGAACCGCGTATCGGAGACTCAAAATTGTGT 

ATTGAAGGAGAACTCTAAACTCAAAGAAGAAGCTTCTGATCTCCGACAGCTTGTTTGTGA 

ACTGAAATCTAACAAGAACAACAACAATAGTTTTC 

TTACTCAAA 

>G1053 Amino Acid Sequence (domain in AA coordinates: 74-120) 
MIPAEINGYFQYLSPEYNVINMPSSPTSSLNYLNDLIINNNNYSSSSNSQDLMISNNSTS
DEDHHQSIMVLDERKQRRMLSNRESARRSRMRK^^
VSETQNCVLKENSKLKEEASDLRQLVCELKSNKNNNNSFPREFEDN*
>G1137 (202.. 1248)

TACTTCAGACTTCTACTCAAACCAGTCACGTAGTTGGTTGGTGACATTTCGCTGCATTTT 

TCAATCTGTGATTGTTTTTCGTTCGTCTTTCTTTTACTATTTTCTCGAAAAGGACACAAG 

AAGTATTGCATTCACTCAGTTGAGCi^ACTTAACAATCGTGTTGTACTTTTTGAAGTTCCC 

TTGAGCTAAACTGCTAAGAGCATGCCTCTGGATAAGAGGCAACGGGATTTGCCTCTGGGC 

TTAAGTCCTCAAGCTTGCTTCAAGGATATAGTAGGTCGGTCTGTCCTTCCTAGAATTCCT 

CTCCCTGAGCTTGGGAAACTATATGCAGCTAAGCTTCAGGCTCGCTGTTTGCAGCCACCA 

CCATTCCAGTCTTTGCTGTGCAGTCATGATAAGGAGTCTTATGGAAAAAGATTCTCACGG 

TCTGACATGCGGTCTTGGTGCGCTGCTGCTACTACTACTACTACTCCACTTGGAGCATTA 

GAGTCTTCTCAGAAAAGACTTTTGATATTCGATCAGTCAGGAGACCAGACTCGTCTATTA 

CAATGTCCATTTCCTCTACGGTTTCCATCTCATGCGGCTGCAGAACCAGTGAAACTCTCT 

GAGTTACAAGGTATAGAGAAAGCTTTCAAAGAAGATGGTGAAGAGTTTCACAAGAGTGAT 

GGAACAGAGTCAGAAATGCATGAAGACACTGAGGAGATCAATGCATTGCTATATTCAGAT 

GATGATTATGATGATGATTGCGAGAGTGATGATGAAGTAATGAGCACTGGTCACTCTCCT 

TATCCAAATGAAGGAGTTTGCAACAAAAGGGAATTAGAAGAAATCGATGGTCCTTGTAAA 

AGGCAGAAACTACTGGATAAGGTCAACAACATCAGCGACTTATCATCACTTGTGGGCACT 

GAGAGCTCCACACAACTCAATGGATCTTCCTTTCTTAAGGACAAAAAGCTCCCTGAATCA 

AAAACCATATCGACCAAAGAGGACACTGGTTCTGGTCTGAGCAACGAGCAGTCGAAGAAA 

GACAAGATCCGCACAGCTCTGAAAATACTCGAGAGCGTAGTCCCTGGTGCAAAAGGAAAC 

GAAGCGCTCTTACTTCTGGACGAAGCAATTGATTACCTAAAGTTGCTGAAACGAGACTTA 

ATCTCCACAGAGGTTAAGAACCAAAGCTCCACCACTCACAAGTCACCAATCTTGTTGCTT 

AAAGAGACAACATGGGGAACAAGAAATCTGCAGACAGATAAGGCGTGAAAGATTCTGACG 

AGTTAAAACGTGTGAAGTGGGTTTTTGGGTACGTATCCTTGCACCAGCTTT 

>G1137 Amino Acid Sequence (domain in AA coordinates: 264-314)

MPLDKRQRDLPLGLSPQACFKDIVGRSVLPRIPLPELGKLYAAKLQARCLQPPPFQSLLC 

SHDKESYGKRFSRSDMRSWCAAATTTTTPLGALESSQKRLLIFDQSGDQTRLLQCPFPLR

FPSHAAAEPVKLSELQGIEKAFKEDGEEFHKSDGTESEMHEDTEEINALLYSDDDYDDDC 

ESDDEVMSTGHSPYPNEGVCNKRELEEIDGPCKRQKLLDKVNNISDLSSLVGTESSTQLN 

GSSFLKDKKLPESKTISTKEDTGSGLSNEQSKKDKIRTALKILESVVPGAKGNEALLLLD

EAIDYLKLLKRDLISTEVKNQSSTTHKSPILLLKETTWGTRNLQTDKA*

>G1181 (113..1012)

CTCGATCTTTTAACCCCGATTATTACATATTACTCCTTCCTAGATTATTCTTCTTCTGCT 
TTCGTGACTTTCAGGGGACACTTTTGTTTTTATAACTTACGCTTAAAATCCTATGAATTC 
GCCGCCGGTTGACGCAATGATTACCGGAGAATCATCGTCACAAAGATCTATCCCAACGCC 
GTTTCTCACAAAAACGTTTAACCTCGTTGAAGATAGTTCCATCGACGATGTTATCTCATG 
GAACGAAGATGGTTCCTCTTTCATCGTATGGAATCCGACAGATTTCGCTAAAGATTTGCT 
TCCTAAACACTTCAAACACAACAATTTCTCTAGTTTCGTTCGTCAGCTCAACACTTACGG 
ATTCAAAAAAGTTGTACCGGATCGATGGGAGTTTTCAAACGATTTCTTTAAGAGAGGAGA 
AAAACGTCTTCTCCGTGAGATCCAACGTCGGAAAATAACAACGACGCATCAAACAGTTGT 
TGCTCCTTCGTCGGAACAACGAAACCAGACGATGGTTGTATCACCGTCAAATTCCGGGGA 
AGATAATAATAATAATCAGGTGATGTCTTCGTCTCCGTCGTCGTGGTATTGTCATCAAAC 
GAAGACGACTGGGAATGGTGGTTTATCAGTGGAGTTATTGGAAGAGAACGAGAAGCTTCG 
GAGTCAAAACATTCAGCTAAACCGTGAGCTTACTCAGATGAAATCTATCTGCGATAATAT 
CTATAGTCTCATGTCGAATTACGTCGGATCTCAGCCCACTGATCGGAGTTATTCTCCCGG 
AGGTAGTAGTAGTCAACCGATGGAGTTTTTACCGGCGAAGCGGTTTTCGGAGATGGAGAT 
TGAAGAAGAAGAAGAAGCGAGTCCGAGGTTGTTTGGTGTTCCGATTGGGTTAAAACGGAC 
GAGAAGTGAAGGTGTTCAGGTGAAGACGACGGCGGTGGTTGGGGAAAATTCCGATGAGGA 
GACGCCGTGGTTGAGACATTATAATCGAACCAATCAGAGAGTTTGTAATTAAAAACGAAC 
GGTTTAGATTTGTGGTGTAGATATGTGCGCGAAGTAGACGATTAGAGCTTTTTAAGACAA 
GCAGAGGACGTGTCCeATCTGTTTCAAGAAGTTTCTGCAATCTTGACTTCTTCTTTTAAC 
ACTTTGTGTTTTTTATTATTTAATTAATAACAATAAATGTTCTTTTTCAGTTTTGTTTT^ 
TTCAAAAATAGTTCGGCTGTTTCTAGACTTTCCTTTTTT 

>G1181 Amino Acid Sequence (domain in AA coordinates: 24-114) 

MNSPPVDAMITGESSSQRSIPTPFLTKTFNLVEDSSIDDVISWNEDGSSFIVWNPTDFAK 

DLLPKHFKHNNFSSFVRQLNTYGFKKVVPDRWEFSNDFFKRGEKRLLREIQRRKITTTHQ

TVVAPSSEQRNQTMVVSPSNSGEDNNNNQVMSSSPSSWYCTQTKTTGNGGLSVELLEENE

KLRSQNIQLNRELTQMKSICDNIYSLMSNYVGSQPTDRSYSPGGSSSQPMEFLPAKRFSE
MEIEEEEEASPRLFGVPIGLKRTRSEGVQVKTTAVVGENSDEETPWLRHYNRTNQRVCN*
>G1228 (63..1139)

GCATTTATAATTACTCACTCATCTTCTTTTCATTACATTACATACCAAACAAGAGCTCTC 

AAATGGAAAGGTTTCAAGGACACATCAACCCCTGTTTCTTCGATCGAAAACCGGATGTGA 

GAAGCCTCGAGGTTCAAGGATTTGCAGAGGCTCAAAGCTTTGCTTTCAAAGAAAAAGAGG 

AAGAAAGCTTACAAGATACAGTTCCATTTCTACAGATGCTGCAAAGTGAAGACCCCTCAT 

CGTTTTTTTCAATCAAAGAGCCAAACTTTCTGACGCTACTGTCTCTTCAAACCCTCAAGG 

AGCCTTGGGAACTCGAAAGATATCTTTCACTTGAGGATTCACAATTTCATTCACCGGTCC 

AATCTGAGACCAACCGCTTCATGGAAGGAGCCAATCAAGCTGTGTCAAGCCAAGAAATTC

CCTTTAGCCAAGCAAACATGACACTCCCTTCTTCTACCTCATCACCACTCAGTGCACATT 

CAAGACGAAAGCGCAAAATCAACCACTTGCTGCCTCAAGAAATGACTAGAGAAAAGAGAA 

AGAGGAGGAAAACAAAACCAAGTAAAAACAATGAAGAGATTGAGAATCAAAGAATAAACC 

ACATTGCTGTTGAACGAAACAGAAGACGTCAAATGAACGAACATATCAACTCTCTCCGGG 

CCCTTCTCCCACCTTCCTACATCCAACGAGGAGACCAAGCTTCCATAGTAGGAGGAGCAA 

TAAACTACGTGAAGGTCCTCGAGCAAATCATACAATCTCTCGAATCGCAAAAGAGAACGC 

AACAACAAAGTAACAGTGAGGTAGTAGAAAACGCACTTAATCATCTCTCAGGCATTTCGT 

CGAACGACCTGTGGACAACTCTTGAAGATCAAACTTGTATCCCCAAAATCGAAGCTACAG 

TGATACAAAACCATGTCAGCCTTAAAGTTCAATGTGAGAAGAAACAAGGACAACTTCTCA 

AAGGAATCATATCACTTGAAAAGCTTAAACTCACTGTTCTTCATCTCAATATCACTACTT 

CGTCTCATTCCTCTGTTTCTTATTCCTTCAACCTCAAGATGGAAGATGAGTGCGACTTAG 

AGTCAGCCGACGAGATTACGGCGGCTGTTCATCGGATTTTCGATATTCCGACAATTTGAT 

TAAACACATATAATTCCAAAAATATTAACAGCTGACAAAATGGTATCTTTGCGGCC 

>G1228 Amino Acid Sequence (domain in AA coordinates: 179-233) 

MERFQGHINPCFFDRKPDVRSLEVQGFAEAQSFAFKEKEEESLQDTVPFLQMLQSEDPSS 

FFSIKEPNFLTLLSLQTLKEPWELERYLSLEDSQFHSPVQSETNRFMEGANQAVSSQEIP 

FSQANMTLPSSTSSPLSAHSRRKRKINHLLPQEMTREKRKRRKTKPSKNNEEIENQRINH

IAVERNRRRQMNEHINSLRALLPPSYIQRGDQASIVGGAINYVKVLEQIIQSLESQKRTQ 

QQSNSEVVENALNHLSGISSNDLWTTLEDQTCIPKIEATVIQNHVSLKVQCEKKQGQLLK

GIISLEKLKLTVLHLNITTSSHSSVSYSFNLKMEDECDLESADEITAAVHRIFDIPTI* 

>G1277 (51..512)

ATTCTAAAGTCCTCCTCTCGGAAAGTAAGAGACTCAACTTCCGAGCCGCCATGGACGCCG 
GAGTAGCAGTAAAAGCTGACGTGGCAGTCAAAATGAAGAGAGAAAGACCATTCAAAGGGA 
TCAGAATGAGAAAATGGGGGAAATGGGTTGCGGAGATTCGAGAACCCAACAAGCGTTCAA 
GACTTTGGCTCGGCTCTTACTCTACTCCCGAAGCGGCGGCGCGTGCATACGACACGGCTG 
TCTTTTACCTCAGAGGACCAACTGCTACGCTCAACTTCCCGGAGCTTCTGCCGTGTACCT 
CCGCCGAGGATATGTCAGCGGCAACGATCAGGAAAAAGGCGACGGAGGTGGGAGCTCAAG 
TAGATGCGATAGGGGCGACGGTGGTGCAGAACAACAAACGCCGCCGCGTTTTTAGTCAAA 
AGCGTGACTTTGGCGGCGGGTTATTAGAGCTTGTTGACTTGAACAAGTTACCTGACCCGG 
AAAATCTCGATGATGATTTGGTGGGAAAATAGACTGAAAAATAATAATAAAATATCTTAC 
AATGGTGGCTGTAGCTATCGTACGCGGAATGCTTGGGCTTGTGTTATATGACTACGTGGT 
TACGGAAAGATTCCTCTGTTTCGTCATTGTATTAAAATTTAATCCCACAAGTCAAACATA 
CTGTACATTATTCTTAATTTAGTATTTTCTTATTAATATCTATCATTTGTTTGGTGAACA 
CCAGAATATTAGACTATTAATGTAACGAGTTTTTAATATTTCGATCATAATAACACCAAG 
CTAGTTAAAGGTTAATATCTTGTTACGAAGTCTTGAGTAAGTTCAATTGTCATATATATG 
TAACGGAAGAGGTTCGTTCGGGTCCCAAGTGAAGTGGATCAAAGGTGACTTCACATAAAA 
AATAAAAAAAA 

>G1277 Amino Acid Sequence (domain in AA coordinates: 18-85) 

MDAGVAVKADVAVKMKRERPFKGIRMRKWGKWVAEIREPNKRSRLWLGSYSTPEAAARAY

DTAVFYLRGPTATLNFPELLPCTSAEDMSAATIRKKATEVGAQVDAIGATVVQNNKRRRV

FSQKRDFGGGLLELVDLNKLPDPENLDDDLVGK*

>G1309 (53..859)

CGTCGACCTCTTAATTAAGACGACTTGAGAGAGAAAGAAAGATACGTGGAAGATGACCAA 
ATCTGGAGAGAGACCAAAACAGAGACAGAGGAAAGGGTTATGGTCACCTGAAGAAGACCA
GAAGCTCAAGAGTTTCATCCTCTCTCGTGGCCATGCTTGCTGGACCACTGTTCCCATCCT
AGCTGGATTGGAAAGGAATGGGAAAAGCTGCAGATTAAGGT 

AGGACTAAAGAGGGGGTCGTTTAGTGAAGAAGAAGAAGAGACCATCTTGACTTTACATTC 
TTCCTTGGGTAACAAGTGGTCTCGGATTGCAAAATATTTACCGGGAAGAACAGACAACGA 
GATTAAGAACTATTGGCATTCCTATCTGAAGAAGAGATGGCTCAAATCTCAACCACAACT
CAAAAGCCAAATATCAGACCTCACAGAATCTCCTTCTTCACTACTTTCTTGCGGGAAAAG 
AAATCTGGAAACCGAAACCCTAGATCACGTGATCTCCTTCCAGAAATTTTCAGAGAATCC 
AACTTCATCACCATCCAAAGAAAGCAACAACAACATGATCATGAACAACAGTAATAACTT 
GCCTAAACTGTTCTTCTCTGAGTGGATCAGTTCTTCAAATCCACACATCGATTACTCCTC 
TGCTTTTACAGATTCCAAGCACATTAATGAAACTCAAGATCAAATCAATGAAGAGGAAGT 
GATGATGATCAATAACAACAACTACTCTTCACTTGAGGATGTCATGCTCCGTACAGATTT 
TTTGCAGCCTGATCATGAATATGCAAATTATTATTCTTCTGGAGATTTCTTCATCAACAG 
TGACCAAAATTATGTCTAAGAAGAGTGAATATGATCGTAAGAGGAACATAAGCTAGTTAC 
TTGTGTTACAGC 

>G1309 Amino Acid Sequence (domain in AA coordinates: 9-114) 

mtksgerpkqrqrkglwspeedqklksfilsrghacwttvpilaglqrngkscrlrwiny
lrpglkrgsfseeeeetiltlhsslgnkwsriakylpgrtdneiknywhsylkkrwlksq
pqlksqisdltespssllscgkrnletetldhvisfqkfsenptsspskesnnnmimnns
nnlpklffsewisssnphidyssaftdskhinetqdqineeevmminnnnyssledvmlr
tdflqpdheyanyyssgdffinsdqnyv*

>G1314 (1..990) 

atgggaagagctccgtgttgcgacaagacaaaagtgaagcgagggccttggtcgcctgaa 
gaagactctaaacttagagattacattgaaaagtatggtaatggtggaaattggatctct 
ttccccctcaaagccggtttgaggagatgtgggaagagttgtagactgaggtggctaaac 
tatttgagaccaaacataaagcatggtgacttctctgaggaagaagacaggatcattttt 
agtctcttcgctgccataggaagcaggtggtcaataatagcagctcatctaccgggacga 
acagacaacgacataaaaaactattggaacacaaagctaaggaagaaactcttgtcttct 
tcctctgattcatcatcatcagccatggcttctccttatctaaaccctatttctcaggat 
gtgaaaagaccaacctcaccaacaacaatcccatcttcttcttacaatccgtatgctgaa 
aaccctaatcaatacccaacaaaatccctcatctccagcatcaatggcttcgaagctggt 
gacaaacagataatttcctatattaaccctaattatcctcaagatctctatctctcggac 
agcaacaacaacacctcgaacgcaaatggtttcttgctcaaccacaatatgtgtgatcag 
tacaagaaccacaccagtttttcttcagacgtcaatgggataagatcagagattatgatg 
aagcaagaagagataatgatgatgatgatgatagaccaccacattgaccagaggacaaaa 
gggtacaatggggaattcacacaagggtattataattactacaatgggcatggggatttg 
aagcaaatgattagtggaacaggcactaattctaacataaacatgggtggttcaggttca 
tcttctagttcgataagcaacctagctgagaacaaaagcagtggtagcctcctactagaa 

TACAAATGCTTGCCCTATTTCTACTCCTAG 

>G1314 Amino Acid Sequence (domain in AA coordinates: 14-116) 
MGRAPCCDKTKVKRGPWSPEEDSKLRDYIEKYGNGGNWISFPLKAGLRRCGKSCRLRWLN
YLRPNIKHGDFSEEEDRIIFSLFAAIGSRWSIIAAHLPGRTDNDIKNYWNTKLRKKLLSS
SSDSSSSAMASPYLNPISQDVKRPTSPTTIPSSSYNPYAENPNQYPTKSLISSINGFEAG
DKQIISYINPNYPQDLYLSDSNNNTSNANGFLLNHNMCDQYKNHTSFSSDVNGIRSEIMM
KQEEIMMMMMIDHHIDQRTKGYNGEFTQGYYNYYNGHGDLKQMISGTGTNSNINMGGSGS
SSSSISNLAENKSSGSLLLEYKCLPYFYS* 
>G1317 (1..849) 

ATGGGAAGATCACCTTGTTGTGATAAAAATGGAGTGAAGAAGGGACCATGGACTGCTGAG 

GAGGATCAGAAACTCATCGATTATATTCGATTTCATGGTCCTGGCAATTGGCGTACGCTC 

CCCAAAAATGCTGGACTCCATAGATGTGGAAAAAGCTGCCGTCTTCGATGGACCAATTAT 

CTAAGACCGGACATCAAGAGAGGAAGATTCTCGTTCGAGGAAGAAGAAACTATCATTCAG 

CTACACAGTGTTATGGGAAACAAGTGGTCAGCAATAGCCGCTCGTCTACCAGGGAGGACC 

GATAACGAAATAAAAAACCATTGGAACACTCACATCCGCAAGAGACTTGTAAGGAGTGGT 

ATCGACCCTGTTACTCATTCTCCACGCCTTGATCTTCTTGATTTGTCCTCACTTTTGAGT 

GCACTTTTCAACCAGCCAAACTTTTCAGCAGTTGCAACACATGCGTCTTCTCT

CCTGATGTATTGAGGTTGGCCTCTCTACTACTGCCACTTCAAAACCCTAATCCAGTTTAC 

CCATCGAACCTCGACCAAAATCTTCAAACTCCAAATACATCATCAGAATCGTCTCAAC^ 

CAAGCTGAGACTAGTACAGTCCCAACAAACTATGAAACTTCATCATTGGAGCCTATGAAC 

GCAAGACTCGACGACGTTGGTCTTGCAGATGTATTACCACCTTTGTC^GAGAGTTTTGAC 

TTAGACTCGCTCATGTCAACGCCAATGTCTTCTCCACGACAAAATAGCATTGAAGCAGAA 

ACCAACTCCAGCACTTTCTTCGACTTTGGAATTC 

ATGTTTTAA 
>G1317 Amino Acid Sequence (conserved domain in AA coordinates: 13-118)

MGRSPCCDKNGVKKGPWTAEEDQKLIDYIRFHGPGNWRTLPKNAGLHRCGKSCRLRWTNY

LRPDIKRGRFSFEEEETIIQLHSVMGNKWSAIAARLPGRTDNEIKNHWNTHIRKRLVRSG 

IDPVTHSPRLDLLDLSSLLSALFNQPNFSAVATHASSLLNPDVLRLASLLLPLQNPNPVY

PSNLDQNLQTPNTSSESSQPQAETSTVPTNYETSSLEPMNARLDDVGLADVLPPLSESFD 

LDSLMSTPMSSPRQNSIEAETNSSTFFDFGIPEDFILDDFMF* 

>G1323 (49..870)

AAGAGGGAATCTCAAAAGTGTGTGTCTGTGAGAGAGGAGAGAGAGAATATGGGCAAAGGA 
AGAGCACCATGTTGTGACAAAACCAAAGTGAAGAGAGGACCATGGAGCCATGATGAAGAC 
TTGAAACTCATCTCTTTCATTCACAAGAATGGTCATGAGAATTGGAGATCTCTCCCAAAG 
CAAGCTGGATTGTTGAGGTGTGGCAAGAGTTGTCGTCTGCGATGGATTAATTACCTCAGA 
CCTGATGTGAAACGTGGCAATTTCAGTGCAGAGGAAGAAGACACCATCATCAAACTTCAC 
CAGAGCTTTGGTAACAAGTGGTCGAAGATTGCTTCTAAGCTGCCTGGAAGAACAGACAAT 
GAGATCAAGAATGTGTGGCATACACATCTCAAGAAAAGATTGAGCTCGGAAACTAACCTT 
AATGCCGATGAAGCGGGTTCAAAAGGTTCTTTGAATGAAGAAGAGAACTCTCAAGAGTCA 
TCTCCAAATGCTTCAATGTCTTTTGCTGGTTCCAACATTTCAAGCAAAGACGATGATGCA 
CAGATAAGTCAAATGTTTGAGCACATTCTAACTTATAGCGAGTTTACGGGGATGTTACAA 
GAGGTAGACAAACCAGAGCTGCTGGAGATGCCTTTTGATTTAGATCCTGACATTTGGAGT 
TTCATAGATGGTTCAGACTCATTCCAACAACCAGAGAACAGAGCTCTTCAAGAGTCTGAA 
GAAGATGAAGTTGATAAATGGTTTAAGCACCTGGAAAGCGAACTCGGGTTAGAAGAAAAC 
GATAACCAACAACAACAACAGCATAAACAGGGAACAGAAGATGAACATTCATCATCACTC 
TTGGAGAGTTACGAGCTCCTCATACATTAATGAAGCCATAAAGCAAGTCATTTTCACCTT 
GAAAATGGAATTATTAGCTAACTTATTGGCATTATTAGTATATAAGCAAGATCAGATAGG 
CGCATGTAGTAGCAACAACGAAGAAACGTCGAATTGTAGACAAAATGTAGATATTACAGA 
GTTGAAAGATTGTATTTTGCAAATGATTGCTTTGTAGTGAAATGAAGTTATCACAAAAAA 
AAAAAAAA 

>G1323 Amino Acid Sequence (domain in AA coordinates: 15-116) 

MGKGRAPCCDKTKVKRGPWSHDEDLKLISFIHKNGHENWRSLPKQAGLLRCGKSCRLRWI

NYLRPDVKRGNFSAEEEDTIIKLHQSFGNKWSKIASKLPGRTDNEIKNVWHTHLKKRLSS

ETNLNADEAGSKGSLNEEENSQESSPNASMSFAGSNISSKDDDAQISQMFEHILTYSEFT
GMLQEVDKPELLEMPFDLDPDIWSFIDGSDSFQQPENRALQESEEDEVDKWFKHLESELG
LEENDNQQQQQHKQGTEDEHSSSLLESYELLIH* 
>G1332 (1..606) 

ATGGAATGCAAAAGAGAAGAAGGGAAGTCTTACGTGAAGAGAGGGTTGTGGAAACCAGAA 

GAAGATATGATATTAAAAAGCTATGTTGAGACTCATGGTGAAGGAAACTGGGCAGACATT 

TCTCGTAGATCCGGGTTGAAGAGAGGAGGAAAAAGCTGTAGGCTGAGATGGAAGAACTAT 

CTAAGACCAAATATCAAAAGAGGAAGCATGTCACCACAAGAACAAGACCTTATCATCCGC 

ATGCATAAGCTTCTTGGAAACAGATGGTCGTTGATCGCTGGTCGCCTTCCAGGTCGTACT 

GACAATGAAGTGAAGAACTACTGGAATACTCATTTGAACAAGAAACCTAATTCCCGAAAA 

CAGAATGCACCTGAATCAATCGTCGGCGCCACTCCTTTCACGGATAAGCCAGTTATGTCT 

ACAGAACTGAGAAGAAGCCATGGAGAAGGAGGAGAAGAGGAGAGCAATACCTGGATGGAG 

GAGACCAACCACTTTGGCTATGACGTCCACGTAGGATCTCCCTTGCCACTTATTTCCCAC 

TACCCAGACAACACTCTCGTGTTTGACCC 

CTTTAG 

>G1332 Amino Acid Sequence (conserved domain in AA coordinates: 13-116)

MECKREEGKSYVKRGLWKPEEDMILKSYVETHGEGNWADISRRSGLKRGGKSCRLRWKNY

LRPNIKRGSMSPQEQDLIIRMHKLLGNRWSLIAGRLPGRTDNEVKNYWNTHLNKKPNSRK

QNAPESIVGATPFTDKPVMSTELRRSHGEGGEEESNTWMEETNHFGYDVHVGSPLPLISH

YPDNTLVFDPCFSFTDFFPLL* 

>G1334 (76..885)

ATAGCTCCCAACTAATAGGAATCTCAAGCTTCTCACTCTCTCTTGTTT^ 

TTTGGAACATAAGCTATGCAAACTGAGGAGCTTTTGTCGCCACCACAGACTCCTTGGTGG 

AATGCTTTTGGATCTCAGCCGTTGACTAC^GAGAGCCITTCCGGCGAAGCTTCT 

TTCACCGGAGTTAAGGCAGTTACTACGGAGGCAGAACAAGGTGTGGTGGATAAACAAACT 

TCTACAACTCTCTTCACTTTCTCACCTGGTGGTGAAAAGAGTTCAAGAGATGTGCCAAA 

CCTCATGTTGCTTTCGCGATGCAATCAGCTTGCT^ 

ATGTACACAAAGCATCCTCATGTTGAACAATACTATGGAGTTGTTTCAGCATACGGAT^ 
CAGAGGTCTTCGGGCCGAGTAATGATTCCACTGAAGATGGAGACAGAAGAAGATGGTACC
ATCTATGTGAACTCAAAGCAGTACCATGGAATTATCAGGCGACGCCAGTCCCGAGCAAAG 
GCTGAAAAACTGAGTAGATGCCGTAAGCCATATATGCATCACTCACGCCATCTCCATGCT 
ATGCGCCGTCCTAGAGGATCTGGCGGGCGTTTCTTGAACACCAAGACAGCTGATGCGGCT 
AAGCAGTCTAAGCCGAGTAATTCTCAGAGTTCTGAAGTCTTTCATCCGGAAAATGAGACC 
ATAAACTCATCGAGGGAAGCAAATGAGTCAAATCTCTCGGATTCTGCAGTTACAAGTATG 
GATTACTTTCTAAGTTCGTCGGCTTATTCTCCTGGTGGCATGGTCATGCCTATCAAGTGG 
AATGCAGCAGCAATGGATATTGGCTGCTGCAAACTTAATATATGATCAGCAGATAGGGGA 
CAAGACATGATTGGTCACCAGTCCTTTTGTCTTGTCCCTTATCTTTCAGCCAAACGGAAA 
GAGAACTTGTGTCTTGGAAAAAAGACATTGAGTTTCCTTGGTTTATAAGATTGGTCCTTT 
TACCATCCGTTTGGCTGTAAACAGGCAAATCATCTTTGGCTCATGCTTCATCAAGTTCTT 
ATCTTCGTCTGTTTTCTTCTACGCATCTTCATAAGATCTCTGAACTAGTGAATAACATTT 
CCTAGCATCATGTTTCAACTAGTGTGTGTTGTAAGAAACTCTGCCTTATTTCCAGATGAT 
GTATTGTGTGTAACGTGTTTATGAAACAAACGTAAGACTTTCAAGTTAAAAAAAAAAAAA 
AAAAAAAAAAAAAA 

>G1334 Amino Acid Sequence (domain in AA coordinates: 18-190) 

MQTEELLSPPQTPWWNAFGSQPLTTESLSGEASDSFTGVKAVTTEAEQGVVDKQTSTTLF

TFSPGGEKSSRDVPKPHVAFAMQSACFEFGFAQPMMYTKHPHVEQYYGVVSAYGSQRSSG

RVMIPLKMETEEDGTIYVNSKQYHGIIRRRQSRAKAEKLSRCRKPYMHHSRHLHAMRRPR

GSGGRFLNTKTADAAKQSKPSNSQSSEVFHPENETINSSREANESNLSDSAVTSMDYFLS 

SSAYSPGGMVMPIKWNAAAMDIGCCKLNI* 

>G1381 (32..802)

CAGCTTTAACACTACTCTCTCTCTCTCTCAAATGGGAAAACAAATCAACATAGAGAGTAG 

TGCTACTCATCATCAAGACAATATTGTTTCCGTTATAACAGCCACGATATCCTCCTCCTC 

CGTCGTAACGTCTTCGTCAGACTCTTGGTCTACCTCCAAAAGATCGTTAGTGCAAGACAA 

TGACTCCGGAGGGAAACGGCGGAAGAGCAACGTTAGTGATGATAACAAGAATCCGACGTC 

GTATAGAGGAGTGAGGATGAGGAGTTGGGGAAAATGGGTGTCGGAGATTAGAGAGCCGAG 

GAAGAAATCAAGAATATGGCTTGGCACTTATCCAACGGCAGAGATGGCAGCTCGTGCTCA 

TGATGTGGCGGCTTTAGCTATTAAAGGCAACTCCGGTTTTCTTAATTTCCCTGAATTATC 

CGGTTTGCTTCCTCGTCCGGTTAGCTGCTCTCCTAAGGATATACAAGCTGCAGCTACCAA 

AGCCGCCGAAGCAACCACGTGGCACAAACCGGTTATCGATAAGAAATTAGCTGATGAGCT 

AAGCCACTCTGAGTTGTTGTCTACCGCTCAGTCTTCGACTTCTAGTAGTTTCGTGTTTTC 

TTCGGACACGTCGGAGACTTCTAGTACGGACAAGGAAAGCAACGAAGAGACGGTGTTTGA 

TTTGCCGGACCTTTTCACGGACGGGCTTATGAACCCAAACGATGCGTTTTGTTTATGCAA 

CGGCACCTTTACGTGGCAGCTTTACGGAGAGGAGGATGTAGGGTTCAGGTTTGAAGAGCC 

GTTTAATTGGCAAAATGACTAAACCGCCCTCCACTTGCTTACTGTAATTACTAACATATA 

ATTTTCTTGATAAAGAAC^TATATTTCC^TTACGGTATTAACTAATCTTTTCTATCCTTT 

TCTCTTTTCTTGTTTCTACATCTGAGTATATTGTCACTATGTGAAAAAATTGATCTCGTT 

TTGAATATTTACTTTTCAAAATTGAAGTAACGCAAGTGATTGATAAAAAAAAAAAAA 

>G1381 Amino Acid Sequence (domain in AA coordinates: TBD) 

MGKQINIESSATHHQDNIVSVITATISSSSVVTSSSDSWSTSKRSLVQDNDSGGKRRKSN

VSDDNKNPTSYRGVRMRSWGKWVSEIREPRKKSRIWLGTYPTAEMAARAHDVAALAIKGN

SGFLNFPELSGLLPRPVSCSPKDIQAAATKAAEATTWHKPVIDKKLADELSHSELLSTAQ 

SSTSSSFVFSSDTSETSSTDKESNEETVFDLPDLFTDGLMNPNDAFCLCNGTFTWQLYGE

EDVGFRFEEPFNWQND* 

>G1382 (90..1763)

CTCTCATTTCGCCATAGCTGAGAGCTTCTTCTACTTTCCCTTAGCTTCTTTTTTCCTTCA 

TTTTTGTTCTACCC^TGCGAATCTCTGAAATGAACCCTCAAGCTAATGACCGGAAGGAGT 

TTCAGGGAGATTGTTCGGCGACGGGAGATCTCACGGCAAAGCACGATTCAGCTGGAGGAA 

ACGGAGGTGGAGGTGCTAGGTATAAGCTGATGTCACCGGCCAAGCTTCCGATCTCGAGGT

CGACTGATATCACGATTCCTCCTGGGTTGAGTCCGACTTCGTTTTTGGAATCTCCTGTTT 

TCATCTCG^CATCAAGCCAGAACCTTCCCCTACTACTGGTTCTTTGTTCAAGCCTCG 

CAGTGCACATTTCTGCTAGCTCAAGTTCTTATAC^ 

TTACTGAGCAGAAGTCCAGTGAATTTGAGTTCAGACCTCCTGCATCAAATA 
GAGAGCTTGKSCAAGATTAGAAGTGAGCGACC&GTACA^ 

CCTCACACTCACCrTCTTCGATCAGTGATGCTGCAGGTTCCTCAAGTGAGCTAAGCCGGC 
CAACTCCTCCI^GTCAGATGACACCAACGAGCTCAGATATTCCGGCTGGATCTGATCAAG 
AGGAATCAATCCAGACTTCCCAAAATGACTCCAGAGGAAGCACTCCATCCATCTTGGCTG
ATGATGGTTATAACTGGAGAAAATATGGTCAAAAGCATGTCAAAGGGAGTGAATTTCCCC 
GGAGCTATTATAAATGTACACATCCTAATTGTGAAGTGAAAAAGTTATTTGAAAGATCTC
ATGATGGGCAGATCACCGATATTATATACAAGGGTACACATGACCATCCTAAACCTCAAC 
CTGGTCGCCGAAACTCTGGTGGTATGGCTGCACAAGAAGAAAGGCTAGACAAGTATCCTT 
CTTCAACTGGCCGAGATGAGAAGGGATCTGGCGTCTACAACTTGTCTAACCCCAATGAAC 
AAACTGGTAACCCTGAAGTACCTCCTATCTCAGCATCTGACGATGGTGGAGAAGCGGCAG 
CGTCAAATAGGAATAAAGATGAGCCGGACGATGATGATCCATTCTCAAAACGGAGGAGGA 
TGGAGGGTGCGATGGAAATAACTCCACTAGTGAAACCCATCCGGGAGCCTCGGGTTGTTG 
TTCAAACTCTGAGTGAGGTTGACATTCTGGATGATGGTTATAGATGGCGCAAATATGGGC 
AGAAAGTCGTAAGGGGGAACCCAAATCCCAGGAGCTACTACAAATGCACAGCTCATGGAT 
GCCCAGTGAGAAAACACGTGGAGAGAGCATCACATGATCCAAAAGCTGTAATAACAACAT 
ACGAAGGCAAACACGATCATGATGTTCCCACTTCAAAGTCTAGCAGCAATCACGAAATCC 
AGCCTCGGTTCAGACCAGATGAAACAGACACCATCAGCCTCAATCTTGGTGTTGGAATCT 
CATCTGATGGACCTAACCACGCTTCCAACGAACATCAGCACCAGAATCAACAACTTGTCA 
ACCAAACTCACCCAAATGGAGTCAATTTCAGGTTTGTTCATGCTAGTCCCATGTCATCCT 
ACTATGCTAGCTTAAATAGCGGTATGAATCAGTACGGCCAGAGAGAAACAAAGAACGAGA 
CTCAAAATGGTGACATCTCGTCCTTGAACAATTCATCTTACCCATATCCGCCCAACATGG 
GGAGAGTACAATCGGGTCCGTAAAACAAAAAGTAAGCAACATTATGTACGGGATCTTCTT 
AGGTTAGGAATGGGACGAGGCCTTGTTCTATATAATTCCTATTTCTTCACAGAGAGCTGA 
TCTTGATTCAAACTATCTCCACCATATATATTTGTTTGTGTCACCTGTATTGAGTTCCAA 
AAATGTTATGTAAAAATACACAACAAGATGTTAATGCTTTTATTTAAACAAGAAACAGCA 
ATATTACTACAAAAAAAAAAAAAAAAAA 

>G1382 Amino Acid Sequence (domain in AA coordinates: 210-266, 385-437) 

MNPQANDRKSFQGDCSATGDLTAKHDSAGGNGGGGARYKLMSPAKLPISRSTDITIPPGL 

SPTSFLESPVFISNIKPEPSPTTGSLFKPRPVHISASSSSYTGRGFHQNTFTEQKSSEFE 

FRPPASNMVYAELGKIRSEPPVHFQGQGHGSSHSPSSISDAAGSSSELSRPTPPCQMTPT 

SSDIPAGSDQEESIQTSQNDSRGSTPSILADDGYNWRKYGQKHVKGSEFPRSYYKCTHPN

CEVKKLFERSHDGQITDIIYKGTHDHPKPQPGRRNSGGMAAQEERLDKYPSSTGRDEKGS

GVYNLSNPNEQTGNPEVPPISASDDGGEAAASNRNKDEPDDDDPFSKRRRMEGAMEITPL

VKPIREPRVVVQTLSEVDILDDGYRWRKYGQKVVRGNPNPRSYYKCTAHGCPVRKHVERA

SHDPKAVITTYEGKHDHDVPTSKSSSNHEIQPRFRPDETDTISLNLGVGISSDGPNHASN

EHQHQNQQLVNQTHPNGVNFRFVHASPMSSYYASLNSGMNQYGQRETKNETQNGDISSLN 

NSSYPYPPNMGRVQSGP* 

>G1435 (8..904)

GTGAAACATGGGGAAGGAAGTTATGGTGAGCGATTACGGTGACGACGACGGAGAAGACGC 

CGGCGGCGGCGATGAATATAGGATTCCGGAATGGGAAATTGGTTTACCCAACGGAGATGA 

TTTGACTCCGTTATCTCAATATCTAGTCCCGTCGATTCTCGCGTTAGCTTTCAGCATGAT 

CCCAGAACGAAGCCGTACAATTCACGACGTCAATCGCGCGTCGCAAATCACGCTCTCTTC 

GTTGAGAAGCAGTACCAATGCTTCGTCTGTGATGGAGGAGGTCGTGGATCGAGTTGAATC 

GAGTGTTCCAGGATCAGATCCGAAGAAACAGAAGAAATCGGATGGTGGTGAAGCAGCGGC 

GGTGGAGGATTCCACGGCGGAGGAAGGAGACTCCGGGCCTGAAGACGCGTCTGGGAAGAC 

ATCGAAACGACCGCGTTTAGTGTGGACACCGCAGCTACACAAGAGATTTGTGGACGTTGT 

GGCTCATCTAGGGATTAAAAACGCAGTGCCGAAGACGATTATGCAGCTGATGAACGTGGA 

AGGACTTACTCGTGAGAACGTTGCGTCTCATTTGCAGAAATATAGGCTTTACCTTAAACG 

GATTCAAGGATTGACGACGGAAGAAGATCCTTATTCGTCGTCGGATCAGCTCTTCTCTTC 

AACGCCGGTTCCTCCACAGAGCTTTCAAGACGGCGGAGGAAGTAACGGAAAGTTGGGGGT 

TCCGGTTCCGGTTG€GTCGATGGTGCCTATTCCAGGCTATGGGAATCAAATGGGTATGCA 

AGGATATTATCAACAGTATAGTAACCATGGCAATGAATCAAACCAATATATGATGCAGCA 

GAATAAGTTTGGAACAATGGTGACATATCCTTCTGTTGGTGGTGGTGACGTGAATGACAA 

GTAAATGGATCTTAAAGGTCTATAATTTGCTCTACAGAGAGATACTGGTTCTTGGCTTAT 

GGTTTATTTTCCC^CTTCATGAGGTTGTTGTGACTTTTAATTCTCCA^ 

AGTCl^TATTGCCTTTGTATAGAAAATGATTTCGAGAAAATCACT 

GTTGGAGGATGAAGCCTTCTATGAATGATTTAGTTTCCT 

GTAATAAAGCCITCTTTTGCTCATCGCTTGTAGTCTTCTTAAATTCAAGACAGCGTCA^ 

TGTTTGTTCGGTTATGTTAATTGTTTCTTTCTTTGGATAATGAAGATA^ 

ATGTCTCCTCACTTTGATAAA 
>G1435 Amino Acid Sequence (domain in AA coordinates: 146-194)

MGKEVMVSDYGDDDGEDAGGGDEYRIPEWEIGLPNGDDLTPLSQYLVPSILALAFSMIPE

RSRTIHDVNRASQITLSSLRSSTNASSVMEEVVDRVESSVPGSDPKKQKKSDGGEAAAVE

DSTAEEGDSGPEDASGKTSKRPRLVWTPQLHKRFVDVVAHLGIKNAVPKTIMQLMNVEGL

TRENVASHLQKYRLYLKRIQGLTTEEDPYSSSDQLFSSTPVPPQSFQDGGGSNGKLGVPV 

PVPSMVPIPGYGNQMGMQGYYQQYSNHGNESNQYMMQQNKFGTMVTYPSVGGGDVNDK* 

>G1537 (1..783) 

ATGGAAAACGAAGTAAACGCAGGAACAGCAAGCAGTTCAAGATGGAACCCAACGAAAGAT 
CAGATCACGCTACTGGAAAATCTTTACAAGGAAGGAATACGAACTCCGAGCGCCGATCAG 
ATTCAGCAGATCACCGGTAGGCTTCGTGCGTACGGCCATATCGAAGGTAAAAACGTCTTT 
TACTGGTTCCAGAACCATAAGGCTAGGCAACGCCAAAAGCAGAAACAGGAGCGCATGGCT 
TACTTCAATCGCCTCCTCCACAAAACCTCCCGTTTCTTCTACCCCCCTCCTTGCTCAAAC 
GTGGGTTGTGTCAGTCCGTACTATTTACAGCAAGCAAGTGATCATCATATGAATCAACAT 
GGAAGTGTATACACAAACGATCTTCTTCACAGAAACAATGTGATGATTCCAAGTGGTGGC 
TACGAGAAACGGACAGTCACACAACATCAGAAACAACTTTCAGACATAAGAACAACAGCA 
GCCACAAGAATGCCAATTTCTCCGAGTTCACTCAGATTTGACAGATTTGCCCTCCGTGAT 
AACTGTTATGCCGGTGAGGACATTAACGTCAATTCCAGTGGACGGAAAACACTCCCTCTT 
TTTCCTCTTCAGCCTTTGAATGCAAGTAATGCTGATGGTATGGGAAGTTCCAGTTTTGCC 
CTTGGTAGTGATTCTCCGGTGGATTGTTCTAGCGATGGAGCCGGCCGAGAGCAGCCGTTT 
ATTGATTTCTTTTCTGGTGGTTCTACTTCTACTCGTTTCGATAGTAATGGTAATGGGTTG 
TAA 

>G1537 Amino Acid Sequence (domain in AA coordinates: 14-74) 

MENEVNAGTASSSRWNPTKDQITLLENLYKEGIRTPSADQIQQITGRLRAYGHIEGKNVF

YWFQNHKARQRQKQKQERMAYFNRLLHKTSRFFYPPPCSNVGCVSPYYLQQASDHHMNQH

GSVYTNDLLHRNNVMIPSGGYEKRTVTQHQKQLSDIRTTAATRMPISPSSLRFDRFALRD 

NCYAGEDINVNSSGRKTLPLFPLQPLNASNADGMGSSSFALGSDSPVDCSSDGAGREQPF 

IDFFSGGSTSTRFDSNGNGL* 

>G1545 (67..729)

(^TCACCAATCTTTTGAATCTAAGAGAGAGAAGAAGAAGAAGGTCTAGAGAACGAAAAGA 

AGAAACATGAATAACCAGAATGTAGATGATCATAATCTTCTACTCATTTCTCAATTGTAC 

CCTAATGTCTATACTCCATTAGTACCACAACAAGGAGGAGAAGCAAAACCAACACGGCGG 

AGGAAAAGGAAGAGCAAGAGTGTTGTGGTGGCAGAGGAGGGTGAAAACGAAGGCAATGGG 

TGGTTTAGAAAGAGAAAATTGAGTGATGAGCAAGTAAGAATGTTGGAGATTAGCTTTGAA 

GACGATCATAAGCTTGAATCCGAGAGGAAAGATCGGCTTGCTTCTGAGTTAGGGCTTGAT 

CCTCGTCAAGTCGCCGTCTGGTTCCAAAACCGCCGTGCACGGTGGAAGAACAAACGAGTC 

GAGGATGAATACACTAAACTCAAGAATGCATACGAAACCACCGTCGTTGAGAAATGTCGT 

CTTGATTCTGAGGTTATTCACCTAAAGGAACAACTTTACGAGGCTGAAAGAGAGATCCAA 

CGGCTTGCAAAAAGAGTTGAAGGAACTTTAAGTAACAGTCCTATCTCATCCTCTGTGACC 

ATTGAAGCCAATCATACGACACCGTTTTTTGGAGATTACGACATCGGATTTGACGGTGAG 

GCTGACGAGAACTTGCTCTACTCGCCAGATTACATTGATGGATTAGACTGGATGAGCCAA 

TTTATGTAAAAAACTATAAGCTAATCTATTTTCAGTCGTAGTATAG 

>G1545 Amino Acid Sequence (domain in AA coordinates: 54-117) 

MNNQNVDDHNLLLISQLYPNVYTPLVPQQGGEAKPTRRRKRKSKSVVVAEEGENEGNGWF

RKRKLSDEQVRMLEISFEDDHKLESERKDRLASELGLDPRQVAVWFQNRRARWKNKRVED

EYTKLKNAYETTVVEKCRLDSEVIHLKEQLYEAEREIQRLAKRVEGTLSNSPISSSVTIE

ANHTTPFFGDYDIGFDGEADENLLYSPDYIDGLDWMSQFM* 

>G1641 (1..867) 

ATGGAGGTTATGAGACCGTCGACGTCACACGTGTCAGGTGGGAACTGGCTCATGGAGGAA 
ACTAAGAGCGGCGTCGCAGCTTCTGGTGAAGGTGCCACGTGGACGGCGGCAGAGAACAAG 
GGATTCGAGAATGCTTTGGCGGTTTACGACGACAACACTCCTGATCGGTGGCAGAAGGTG 
GCTGCGGTGATTCCGGGGAAGACAGTGAGTGACGTAATTAGACAGTATAACGATTTGGAA 
GCTGATGTCAGCAGCATCGAGGCCGGTTTAATCCCGGTCCCCGGTTACATCACCTCGCCG 
CCTTTCACTCTAGATTGGGCCGGCGGCGGTGGCGGATGTAACGGGTTTAAACCGGGTCAT 
CAGGTTTGTAATAAACGGTCGCAGGCCGGTAGATCGCCGGAGCTGGAGCGGAAGAAAGGC 
GTTCCTTGGACGGAGGAAGAACACAAGCTATTTCTAATGGGTTTGAAGAAATATGGGAAA 
GGAGATTGGAGAAACATATCTCGGAACTTTGTGATAACGCGA^ 

AGCCACGCCCAAAAGTACTTCATCCGGCAACTTTCCGGCGGCAAGGACAAGAGACGAGCA
AGCATTCACGACATAACCACCGTAAATCTCGAAGAGGAGGCTTCTTTGGAGACCAATAAG
AGCTCCATTGTTGTTGGAGATCAGCGTTCAAGGCTAACCGCGTTTCCTTGGAACCAAACG 
GACAACAATGGAACACAGGCAGACGCTTTCAATATAACGATTGGAAACGCTATTAGTGGC 
GTTCATTCATACGGCCAGGTTATGATTGGAGGGTATAACAATGCAGATTCTTGCTATGAC 
GCCCAAAACACAATGTTTCAACTATAG 

>G1641 Amino Acid Sequence (domain in AA coordinates: 139-200) 

MEVMRPSTSHVSGGNWLMEETKSGVAASGEGATWTAAENKAFENALAVYDDNTPDRWQKV 

AAVIPGKTVSDVIRQYNDLEADVSSIEAGLIPVPGYITSPPFTLDWAGGGGGCNGFKPGH 

QVCNKRSQAGRSPELERKKGVPWTEEEHKLFLMGLKKYGKGDWRNISRNFVITRTPTQVA 

SHAQKYFIRQLSGGKDKRRASIHDITTVNLEEEASLETNKSSIVVGDQRSRLTAFPWNQT

DNNGTQADAFNITIGNAISGVHSYGQVMIGGYNNADSCYDAQNTMFQL*

>G165 (19..699)

CTTCAAAACATCTAAAAAATGGTGAAAAAAACTCTTGGTCGTAGAAAGGTAGAGATAGTG 
AAAATGACTAAGGAATCAAACCTTCAAGTCACATTTTCCAAGAGAAAAGCTGGTCTTTTT 
AAGAAGGCTAGTGAATTTTGCACATTATGTGATGCAAAAATTGCGATGATCGTGTTTTCA 
CCAGCTGGAAAAGTATTTTCTTTTGGTCATCCAAATGTTGATGTTCTGCTTGACCACTTT 
CGAGGGTGTGTTGTAGGACACAACAACACAAACCTTGATGAAAGCTACACAAAGCTTCAT 
GTTCAAATGCTCAACAAATCCTACACTGAGGTGAAGGCGGAAGTAGAAAAAGAACAAAAG 
AATAAGCAGTCGCGGGCTCAAAATGAAAGAGAAAACGAAAACGCTGAGGAGTGGTGGAGT 
AAGTCTCCATTAGAACTCAACTTAAGTCAATCAACCTGTATGATACGTGTTCTTAAAGAT 
TTGAAGAAGATAGTTGATGAAAAAGCAATTCAATTAATCCATCAAACAAACCCAAACTTC 
TATGTTGGAAGTTCTAGCAATGCTGCTGCTCCAGCAACTGTTAGTGGTGGTAATATCTCC 
ACAAACCAGGGGTTCTTTGATCAAAACGGAATGACGACTAATCCTACTCAAACACTTCTG 
TTTGGATTTGATATTATGAATCGCACACCAGGAGTTTAAATAAGTCTATCCTCATTATGG 
GTCTTGGTACTATAAGTTCATCTCTCTCGTTGTTGACTTTTTAAGTCTCCAATAGTTTGT 
TGTG 

>G165 Amino Acid Sequence (conserved domain in AA coordinates: 7-62)

MVKKTLGRRKVEIVKMTKESNLQVTFSKRKAGLFKKASEFCTLCDAKIAMIVFSPAGKVF

SFGHPNVDVLLDHFRGCVVGHNNTNLDESYTKLHVQMLNKSYTEVKAEVEKEQKNKQSRA

QNERENENAEEWWSKSPLELNLSQSTCMIRVLKDLKKIVDEKAIQLIHQTNPNFYVGSSS
NAAAPATVSGGNISTNQGFFDQNGMTTNPTQTLLFGFDIMNRTPGV*
>G1652 (77..1078)

AGCAAGTCCAAATCTCCCTCTCTCTCTCTCTATCTATCTCTCTATAGAAGATTTTTTAAC 

TAAGAAGCTAGCGATCATGGCCACAGCGATGAACGTTTTCTCTACCAAATGGTCCTCCGA

ATTGGATATAGAAGAATATAGTATCATCCACCAATTCCACATGAACTCACTCGTCGGAGA 

TGTTCCACAGTCTCTCTCATCTCTTGATGATACCACCACTTGTTATAACCTTGATGCTTC 

TTGTAATAAAAGTTTGGTAGAAGAAAGACCTTCAAAGATCCTCAAGACCACTCACATATC 

ACCAAACTTACATCCTTTTTCTTCTTCTAATCCTCCTCCTCCAAAGCACCAGCCCTOT 

TAGGATTCTTTCTTTTGAAAAGACAGGTTTACAT^^ 

AATATTTAGCCCCAAGGACGAAGAAATTGGATTACCAGAGCATAAGAAAGCCGAGCTGAT 

AATAAGAGGGACAAAGAGAGCTCAATCCTTGACTCGAAGCCAATCAAATGCTCAAGATCA 

CATACTGGCAGAGAGAAAACGGAGAGAGAAGCTTACTCAAAGATTTGTAGCTCTTTCCGC 

GCTAATTCCTGGCCTAAAGAAGATGGACAAGGCTTCTGTGTTGGGAGATGCAATAAAGCA 

TATAAAGTACCTCCAAGAGAGTGTGAAAGAGTATGAGGAACAAAAGAAGGAAAAGACAAT 

GGAATCAGTGGTTCTTGTAAAGAAGTCTAGTCTGGTTTTAGATGAAAATCATCAACCATC 

ATCATCATCTTCCTCAGATGGAAATCGCAATAGCTCGAGCTCAAATCTTCCAGAAATAGA 

AGTTAGGGTTTCAGGAAAAGATGTTCTTATTAAGATCCTATGCGAGAAGCAAAAGGGTAA 

TGTGATCAAGATTATGGGGGAGATTGAAAAGCTTGGTTTGTCTATCACCAACAGCAATGT

CTTGCCCTTTGGACCCACTTTTGACATCTCTATTATCGCTCAGAAGAATAACAATTTTGA 

TATGAAAATCGAGGATGTTGTGAAGAACTTGAGTTTTGGCTTATCAAAGCTCACTTAATT 

GGTTTCACGTTACATACATATACACATCCAT^ 

ATCAGTTTTTCCATGAAAGTGGTTTTTTAGTTGTTAAGTTTGTTG 

GTCATTTAAAGATCCTTGTTCTTGTGTTGTTAAGTGTGCTTTAAGATGCATATCATCAAA 

TGTTTAGTAATTATTTCTCTCC^GTTTCATTTGGGACGGAAT 

ATATATATTTCCTGCGATGTAAAGCATTTCGOT 

TTGAAAA 

>G1652 Amino Acid Sequence (domain in AA coordinates: 143-215)
MATAMNVFSTKWSSELDIEEYSIIHQFHMNSLVGDVPQSLSSLDDTTTCYNLDASCNKSL

VEERPSKILKTTHISPNLHPFSSSNPPPPKHQPSSRILSFEKTGLHVMNHNSPNLIFSPK 

DEEIGLPEHKKAELIIRGTKRAQSLTRSQSNAQDHILAERKRREKLTQRFVALSALIPGL 

KKMDKASVLGDAIKHIKYLQESVKEYEEQKKEKTMESVVLVKKSSLVLDENHQPSSSSSS

DGNRNSSSSNLPEIEVRVSGKDVLIKILCEKQKGNVIKIMGEIEKLGLSITNSNVLPFGP 

TFDISIIAQKNNNFDMKIEDVVKNLSFGLSKLT*

>G1655 (132..755)

TTTCTAACTAGTCACATTGAGAGAGAGAGAGAGAGAAAGAGAGACTCTCAGAATCTGAAG 

AAGAAGAAGAGATTGTTGTTTTTGCCTTTTATCATCGGTTTCTTTGAATCTCTGGTTTTA 

AATCGGATTTAATGGTGGAGTCTCTGTTCCCGAGCATCGAAAACACAGGTGAATCGTCTC 

GAAGAAAGAAGCCGAGGATATCAGAGACGGCGGAGGCGGAGATAGAGGCACGACGTGTCA 

ACGAAGAAAGCTTGAAGAGATGGAAAACGAATCGTGTGCAACAGATCTACGCTTGTAAGC 

TCGTCGAAGCTTTACGCCGAGTTCGTCAGAGATCTTCCACCACCAGCAACAACGAGACCG 

ATAAACTCGTCTCCGGCGCGGCGAGGGAGATACGTGATACGGCGGATCGAGTTCTAGCTG 

CGTCCGCTCGTGGTACGACTCGGTGGAGCAGAGCGATTTTAGCGAGTCGCGTCCGAGCGA 

AGCTGAAGAAACATAGAAAGGCGAAAAAGTCAACGGGAAATTGTAAATCGAGAAAAGGTC 

TCACGGAGACGAATCGGATTAAGTTACCGGCGGTTGAGAGAAAACTGAAGATTCTTGGCC 

GTTTGGTTCCTGGTTGCCGGAAAGTCTCTGTACCGAATCTTTTAGATGAAGCGACCGATT 

ACATCGCAGCGTTAGAGATGCAGGTTCGAGCCATGGAGGCTCTCGCCGAACTTTTAACCG 

CAGCCGCACCACGGACGACGTTGACCGGAACTTAACGGCGGCAGTTAGTTTGTCAGTTGT 

TAATTAGCTTTTCTTTTACCTTTTTACC 

TCGTCGACGCGATTTTAATTTATTAAATTCA 

>G1655 Amino Acid Sequence (domain in AA coordinates: 134-192) 

MVESLFPSIENTGESSRRKKPRISETAEAEIEARRVNEESLKRWKTNRVQQIYACKLVEA 

LRRVRQRSSTTSNNETDKLVSGAAREIRDTADRVLAASARGTTRWSRAILASRVRAKLKK

HRKAKKSTGNCKSRKGLTETNRIKLPAVERKLKILGRLVPGCRKVSVPNLLDEATDYIAA
LEMQVRAMEALAELLTAAAPRTTLTGT*
>G1671 (188..751)

TCCCAGTATCCTTCGCAAGACCCTTCCTCTATATAAGGAAGTTCATTTCATTTGGAGAGG 

ACACGCTGACAAGCTGACTCTAGCAGATCTGGTACCGTCGACCCTCTCTATATAATCTTC 

TTCTACACACACACACACACGCAACCATATACGTACATGTGAAGTAGTGAGATCAATATC 

GTTAGCAATGAATCTACCACCGGGATTTAGGTTTTTTCCGACCGATGAAGAGCTCGTCGT 

TCACTTCCTCCACCGGAAAGCTTCCCTCTTGCCTTGTCACCCTGATGTCATCCCCGACCT 

TGATCTTTACCATTACGATCCTTGGGACCTTCCCGGGAAAGCTTTGGGAGAAGGGAGGCA 

ATGGTACTTCTATAGTAGAAAGACACAAGAGAGAGTGACAAGCAATGGGTATTGGGGATC 

AATGGGAATGGACGAGCCAATCTACACAAGCTCCACACACAAGAAAGTGGGAATCAAAAA 

GTATCTAACTTTCTATCTCGGAGATTCTCAGACTAATTGGATCATGCAAGAATATTCCCT 

CCCGGATTCCTCTTCTTCATCTAGTCGATCTTCTAAGAGATCAAGCCGTGCTTCTAGTTC 

TAGTCACAAACCCGATTATAGCAAGTGGGTGATATGCAGAGTGTATGAGCAAAATTGCAG 

TGAGGAGGAAGACGATGATGGGACAGAACTCTCATGTTTGGATGAAGTGTTTTTGTCTTT 

AGATGATCTTGACGAAGTAAGCTTACCGTAATAAAGACAGAAGCACCCAAGAAGAGAAAA 

AAAAAAAAAGGGTTTAGTGGGCAATTATTTCTAAGCGACCGCTCTAGACAGGCCTAGTAC 

CGGATCCTCTAGCTAGAGCTTTCGTTCGTATCATCGGTTTCGACAACGTTCGTCAAGT 

>G1671 Amino Acid Sequence (domain in AA coordinates: TBD) 

MNLPPGFRFFPTDEELVVHFLHRKASLLPCHPDVIPDLDLYHYDPWDLPGKALGEGRQWY 

FYSRKTQERVTSNGYWGSMGMDEPIYTSSTHKKVGIKKYLTFYLGDSQTNWIMQEYSLPD

SSSSSSRSSKRSSRASSSSHKPDYSKWVICRVYEQNCSEEEDDDGTELSCLDEVFLSLDD 
LDEVSLP*

>G1756 (71..1003)

ATATGTACTTGTACACCAACCCACCAAAAGAGATAAAAGAGGTAACAAAAACTCGAAAAG
AGAGAGATATATGGGTGAGGTGGCTTATATGGACGAAGGAGACCTAGAAGCAATAGTCAG 
AGGCTACTCCGGCTCCGGAGACGCGTTTTCCGGCG^ 

GTTTTGCCTACCGATGGAGACGTCTAGTTTCTACGAACCGGAGATGGAGACAAGTGGCTT 
AGATGAGCTCGGTGAACTTTACAAACCCTTT^^ 

AAGCTCGGTCTCTCTCCCTGAAGATTCAAAACCTTTCCGAGATGACAAGAAACAACGATC 
ACATGGTTGTCTTTTATCCAACGGATCAAGAGCTGATCATATCCGAATTTCAGAATCC^ 
ATCAAAGAAAAGCAAGAAGAATCAACAGAAGAGAGTTGTTGAGCAAGTGAAAGAAGAGAA 
TCTGTTGTCGGACGCATGGGCGTGGCGTAAATACGGGCAGAAACCCATCAAAGGATCTCC
ATACCCAAGGAGTTATTACAGATGCAGTAGCTCAAAAGGGTGTTTGGCAAGAAAACAAGT
CGAAAGAAATCCTCAAAACCCGGAGAAATTCACCATAACATACACTAATGAGCACAATCA 
TGAACTACCAACCCGGAGAAACTCATTAGCCGGTTCGACTCGAGCAAAAACTTCCCAACC 
CAAACCAACCTTAACCAAAAAATCCGAAAAAGAAGTTGTTTCTTCCCCTACAAGTAATCC 
TATGATCCCATCCGCTGATGAATCTTCTGTTGCGGTTCAAGAAATGAGCGTTGCGGAAAC 
GAGTACGCACCAAGCGGCTGGAGCAATCGAGGGCCGCCGCTTGAGTAACGGTTTACCATC 
GGATTTGATGTCCGGGAGCGGAACTTTTCCAAGTTTTACCGGTGACTTCGATGAACTATT 
GAATAGCCAAGAGTTCTTCAGTGGGTATTTATGGAATTACTAGAGAGCATTAGGTGTATG 

TATATATATAT 

>G1756 Amino Acid Sequence (domain in AA coordinates: TBD) 

MGEVAYMDEGDLEAIVRGYSGSGDAFSGESSGTFSPSFCLPMETSSFYEPEMETSGLDEL 

GELYKPFYPFSTQTILTSSVSLPEDSKPFRDDKKQRSHGCLLSNGSRADHIRISESKSKK 

SKKNQQKRVVEQVKEENLLSDAWAWRKYGQKPIKGSPYPRSYYRCSSSKGCLARKQVERN

PQNPEKFTITYTNEHNHELPTRRNSLAGSTRAKTSQPKPTLTKKSEKEVVSSPTSNPMIP

SADESSVAVQEMSVAETSTHQAAGAIEGRRLSNGLPSDLMSGSGTFPSFTGDFDELLNSQ 

EFFSGYLWNY* 
>G1757 (250..1224)

ATCACCAATCCTATAACACTCTCATTCTCATCATATCATTCTTCAATCTATATAACCCAT 

TCTTAATTATACTCAACACACATTATATTTTTCTGATCATATCATTCTTTCAGTCCATCT 

ATATAACCAATTCTTGATTTATACTTAAAACACACATTATACATCTTTCTCATCATAGTT 

TGTATCAATTTCCTAGAGTAAACTACCTAAAGGAAAAAAAAAATCTATTTTGGGAATCAT 

ATACTAAAAATGGAAGGAAGAGATATGTTAAGTTGGGAGCAAAAGACATTGCTAAGCGAG 

CTTATCAATGGATTTGATGCGGCCAAAAAGCTTCAGGCACGACTTAGAGAAGCTCCGTCG 

CCGTCGTCATCATTTTCATCACCGGCGACGGCTGTTGCTGAGACTAACGAGATTCTGGTG 

AAGCAGATAGTTTCTTCCTACGAGAGATCTCTTCTTCTGCTAAACTGGTCATCCTCACCG 

AGCGTAGAACTTATTCCGACGCCGGTTACTGTAGTCCCGGTGGCAAATCCCGGCAGTGTT 

CCAGAATCTCCGGCATCGATAAACGGAAGTCCGAGAAGTGAAGAGTTTGCCGATGGAGGA 

GGTTCTAGCGAGAGTCATCATCGCCAAGATTACATTTTCAATTCAAAGAAAAGAAAGATG 

TTACCAAAGTGGTCAGAAAAAGTGAGAATAAGCCCAGAGAGAGGCTTAGAAGGACCTCAA

GATGATGTCTTTAGCTGGAGAAAATATGGTCAAAAAGACATTTTAGGCGCCAAATTCCCA 

AGGAGTTATTACAGATGCACACATCGTAGCACACAAAACTGTTGGGCAACGAAACAAGTC 

CAGAGATCAGACGGGGATGCTACGGTTTTCGAAGTGACGTACAGAGGAACACACACTTGT 

TCGCAGGCGATCACAAGAACACCACCATTAGCCTCGCCGGAGAAGCGACAAGACACCAGA 

GTCAAACCAGCCATTACCCAAAAGCCAAAGGATATTCTCGAGAGTCTTAAATCCAACTTA 

ACCGTTCGAACCGATGGGCTTGATGATGGTAAAGACGTTTTCTCGTTCCCTGATACGCCG 

CCGTTTTACAATTACGGAACTATCAACGGCGAGTTCGGCCACGTGGAGAGTTCTCCGATC 

TTCGACGTTGTTGACTGGTTCAATCCAACGGTCGAGATTGACACAACTTTCCCCGCGTTT 

TTACACGAGTCGATTTATTATTAATTAAAATTTGTAACAGAGAAATAGATAGTAACTAGT 

AAGTAATGATCAGCGAGAGTTAAAACATAAAAGTACTTAGAGTAATCTAACGATGCATAA 

TAAGGAATGTTCAACAGGACTTGAACATGATTTCAATACTAAGAGAGATTTATCTAGCTA 

CTGGTAGTAGCCGCAGACTTCTTGTTGTAGCTTCACTTNCTTTTTGTTGCTT 

>G1757 Amino Acid Sequence (domain in AA coordinates: 158-218) 

MEGRDMLSWEQKTLLSELINGFDAAKKLQARLREAPSPSSSFSSPATAVAETNEILVKQI

VSSYERSLLLLNWSSSPSVQLIPTPVTVVPVANPGSVPESPASINGSPRSEEFADGGGSS 

ESHHRQDYIFNSKKRKMLPKWSEKVRISPERGLEGPQDDVFSWRKYGQKDILGAKFPRSY

YRCIIDISTQNCWATKQVQRSDGDATVFEVTYRGTHTCSQAITRTPPLASPEKRQDTRV^ 

AITQKPKDILESLKSNLTVRTDGLDDGKDVFSFPDTPPFYNYGTINGEFGHVESSPIFDV 

VDWFNPTVEIDTTFPAFLHESIYY* 

>G1782 (1..927)

ATGCAAGTGTTTCAAAGGAAAGAAGATTCATCTTGGGGAAACT 

TCAAATATTC^GGATCTGAATCTTTCAGCTTC 

CAATTACCCGCGATGAAACATTCGGGTTTGCA^ 

CAATCTACTGAAGAAGAATCAGGCGGCGGTGAAGTTGCAAGCTTTGGAGAATATAAG 
TATGGATGCAGCATTGTTAATAACAATCTCTCAGGTTACATCGAAAAC^ 
ATTGAAAATTATACTAAGTCAATTACTACCTCGTCGATGGTGTCTCAAGACTCTGTGTCT 
CCTGCTCCTACTTCTGGTCAAATATCTTGGTCTCTTCAATGTGCTGAAACGTCA 
AATGGTTTCTTGGCTCCTGAATATGCATCAACACCAACGGCGCTGCCACATTTAGAGATG
ATGGGTTTGGTTTCTTCAAGAGTGCCATTGCCTCATCACATTCAAGAGAATGAACCAATA 
TTTGTCAATGCGAAACAGTATCATGCGATTCTCCGTCGCAGGAAGCACCGTGCTAAACTC 
GAAGCTCAGAACAAACTCATCAAATGCCGTAAACCGTACCTTCATGAGTCTCGCCATCTT 
CATGCTTTAAAGAGAGCTAGAGGCTCCGGTGGACGTTTCCTCAATACAAAGAAGCTTCAA 
GAATCATCAAACTCACTGTGTTCTTCTCAAATGGCAAATGGACAAAATTTCTCTATGAGC 
CCTCACGGTGGTGGAAGCGGAATCGGGTCTAGTTCGATCTCACCGAGCTCCAATTCAAAC 
TGTATCAACATGTTCCAAAACCCGCAGTTCAGATTCTCAGGTTATCCGTCAACACACCAT 

GCCTCAGCTCTCATGTCAGGGACTTGA 

>G1782 Amino Acid Sequence (domain in AA coordinates: 166-238) 

MQVFQRKEDSSWGNSMPTTNSNIQGSESFSLTKDMIMSTTQLPAMKHSGLQLQNQDSTSS 

QSTEEESGGGEVASFGEYKRYGCSIVNNNLSGYIENLGKPIENYTKSITTSSMVSQDSVF 

PAPTSGQISWSLQCAETSHFNGFLAPEYASTPTALPHLEMMGLVSSRVPLPHHIQENEPI 

FVNAKQYHAILRRRKHRAKLEAQNKLIKCRKPYLHESRHLHALKRARGSGGRFLNTKKLQ

ESSNSLCSSQMANGQNFSMSPHGGGSGIGSSSISPSSNSNCINMFQNPQFRFSGYPSTHH 

ASALMSGT* 

>G184 (327..1937)

TGAATTCTAGCCTTTTTGTAGGCGAATCATCTGGACCGGTAAGAGACTCTCTCATCGATA 

ATAACCACATAATTTAATCAAACTCTTTCTCTCTCTTTCTAAGATCTTTTGCTTTGCTCT 

TTTCCTTTTTGATCTTCCTATATATGGAGAAGCACCAAAACGGTACTTACTATACGATAC 

TGTACGGATCCATCAAACTGGATTAATTATCAAAACGTACATTTTTATCTTACCTGGCAA 

GTTACATTCCTAGGGTTTTGGAGAATCCAATCAACAACAAAGAAAATAATCATCGTTACA 

ATAATCAGTATCACGCACAGACTTAGATGTTCCGGTTTCCAGTGAGTCTAGGCGGTTCAC 

GTGACGAAGACCGTCACGATCAGATCACACCGTTGGATGACCATCGTGTGGTGGTTGATG 

AGGTTGACTTCTTCTCAGAGAAGAGAGATAGGGTTTCACGTGAGAACATCAACGACGACG 

ACGACGAAGGCAATAAGGTTCTCATCAAAATGGAGGGTTCACGAGTTGAAGAAAACGATC 

GTTCCAGAGATGTCAATATCGGTCTGAATCTTCTGACCGCGAATACGGGAAGCGATGAGT 

CAACGGTGGATGATGGACTATCAATGGATATGGAAGATAAACGTGCAAAGATTGAGAACG 

CACAACTACAAGAAGAGCTCAAGAAGATGAAAATAGAGAATCAAAGGCTAAGAGATATGT 

TGAGCCAAGCGACGACCAACTTCAATGCCTTACAAATGCAACTTGTTGCCGTCATGAGGC 

AACAAGAACAACGTAACTCTTCACAAGATCATCTCCTGGAGAGCAAAGCAGAAGGAAGGA 

AACGGCAGGAACTGCAAATCATGGTGCCAAGGCAGTTCATGGACCTTGGGCCGTCGTCTG 

GAGCAGCAGAGCATGGAGCCGAAGTGTCATCTGAAGAGAGGACAACGGTTCGTTCAGGTT 

CTCCTCCTTCGCTTCTAGAAAGTTCCAATCCCCGAGAGAACGGAAAGAGGTTGCTTGGAA 

GAGAAGAAAGCTCAGAGGAATCAGAGTCTAACGCCTGGGGAAACCCTAACAAAGTCCCCA 

AACATAATCCATCCTCTAGCAATAGCAATGGAAACAGAAACGGAAATGTTATTGATCAGT 

CGGCCGCAGAAGCCACCATGCGGAAAGCCCGTGTCTCAGTTCGTGCCCGATCTGAAGCTG 

CCATGATAAGCGATGGATGTCAATGGAGAAAGTACGGACAAAAAATGGCTAAAGGAAACC 

CGTGTCCGCGGGCTTATTATCGTTGCACAATGGCCGGTGGATGTCCAGTTCGCAAGCAAG 

TGCAGCGTTGCGCAGAAGACAGATCTATTCTCATAACCACCTACGAAGGAAACCACAACC 

ATCCACTCCCACCAGCCGCTACGGCCATGGCCTCAACAACCACCGCAGCTGCAAGCATGC 

TCCTCTCGGGCTCAATGTCGAGTCAAGACGGTTTAATGAACCCAACAAACCTCCTAGCTC 

GAGCTATCTTGCCTTGCTCCTCAAGCATGGCTACAATCTCAGCCTCCGCACCATTCCCAA 

CCATCACATTGGACCTCACCAATTCACCCAACGGTAACAACCCTAATATGACCACTAATA 

ACCCGTTGATGCAGTTCGCTCAACGGCCCGGTTTCAACCCGGCAGTTTTGCCTCAAGTGG 

TTGGTCAAGCTATGTACAATAACCAACAACAGTCCAAGTTTTCTGGTTTACAGTTACCGG 

CTCAGCCACTGCAGATCGCGGCCACTTCCTCGGTGGCCGAGAGCGTTAGTGCTGCCAGTG 

CAGCAATTGCGTCCGATCCAAACTTTGCGGCGGCTCTAGCGGCAGCGATCACGTCCATTA 

TGAACGGTTCCAGTCATCAAAATAATAACACCAATAATAATAATGTGGCTACGAGCAACA 

ATGACAGTAGGCAATAAGAGTTTTCATTTTGATGGTCGATTTTTTTTTTTGGGG 

>G184 Amino Acid Sequence (domain in AA coordinates: 295-352) 
MFRFPVSLGGSRDEDRHDQITPLDDHRVVVDEVDFFSEKRDRVSRENINDDDDEGNKVLI 
KMEGSRVEENDRSRDVNIGLNLLTANTGSDESTVDDGLSMDMEDKRAKIENAQLQEELKK
MKIENQRLRDMLSQATTNFNALQMQLVAVI^ 

PRQFMDLGPSSGAAEHGAEVSSEERTTVRSGSPPSLLESSNPRENGKRLLGREESSEESE
SNAWGNPNKVPKHNPSSSNSNGNRNGNVIDQSAAEATMRK^VSVRARSEAAMISDGCQW 

RKTOQKMAKONPC^ 
MASTTTAAASMLLSGSMSSQDGLMNPTNLLARAILPCSSSMATISASAPFPTITLDLTNS
PNGNNPNMTTNNPLMQFAQRPGFNPAVLPQVVGQAMYNWQQQSKFSGLQLPAQPLQIAAT 
SSVAESVSAASAAIASDPNFAAALAAAITSIMNGSSHQNNN^ 
>G1845 (111..989)

AAGACATAATTTTCTCTGTTTTCCTAGCTCTCTCCTCTCAAATTCTTCCATTGCTCTCTG
TTTTGGCAAATCGTGAACTGCCACGTCTTTAAGGCATCAGTGAAGCAAAGATGGACTTTG
ACGAGGAGCTAAATCTTTGTATTACGAAAGGTAAAAATGTTGATCATTCTTTTGGAGGAG 
AAGCTTCTTCCACGTCCCCAAGATCTATGAAGAAAATGAAGAGTCCTAGTCGTCCTAAAC 
CCTATTTCCAATCCTCTTCTTCTCCTTATTCGTTAGAGGCTTTCCCTTTTTCTCTCGATC 
CAACACTTCAGAATCAGCAACAACAACTCGGATCATACGTTCCGGTACTTGAGCAACGAC 
AAGACCCGACAATGCAAGGCCAGAAGCAAATGATCTCCTTTAGTCCTCAACAACAACAAC 
AGCAGCAGCAGTATATGGCCCAGTACTGGAGTGACACATTGAATCTGAGTCCAAGAGGAA 
GAATGATGATGATGATGAGCCAAGAAGCTGTTCAACCTTACATCGCAACGAAGCTGTACA 
GAGGAGTGAGACAACGTCAATGGGGAAAATGGGTCGCAGAGATCCGTAAGCCACGAAGCA 
GGGCACGTCTTTGGCTTGGTACCTTTGATACAGCTGAAGAAGCTGCCATGGCCTACGACC 
GCCAAGCCTTGAAATTACGAGGCCACAGCGCAACACTGAATTTCCCGGAGCATTTTGTGA 
ATAAGGAAAGCGAGCTGCATGATTCAAACTCGTCGGATCAGAAAGAACCTGAAACGCCAC 
AGCCAAGCGAGGTTAACTTGGAGAGCAAGGAACTACCGGTGATTGATGTTGGGAGAGAGG 
AAGGTATGGCTGAGGCATGGTACAATGCCATTACATCGGGATGGGGTCCTGAAAGTCCTC 
TTTGGGATGATTTGGATAGTTCTCATCAGTTTTCATCAGAAAGCTCATCTTCTTCTCCTC 
TCTCTTGTCCTATGAGGCCTTTCTTTTGAAAAAGTTTATAAACCCACATTGTGTTGTAGG 
TTATAGTTTAGGGTTATGCTCATTGGCATTTGGATGGAGGCAATTTTTGTGATCTCCCAT 
TCCACCACATATCAGTCATTATATGTGTCTACCTTTTCTCTGTATTTCTATCATTATCAT 
TGTTTTTATTATGTGTCTGTATGTGTTTCCCTATTGCTACATACATAGATGTCCTCTTTG 
TTCAAAAAAAAAAAAAAAAAAAAAAA 

>G1845 Amino Acid Sequence (domain in AA coordinates: 140-207) 

MDFDEELNLCITKGKNVDHSFGGEASSTSPRSMKKMKSPSRPKPYFQSSSSPYSLEAFPF

SLDPTLQNQQQQLGSYVPVLEQRQDPTMQGQKQMISFSPQQQQQQQQYMAQYWSDTLNLS

PRGRMMMMMSQEAVQPYIATKLYRGVRQRQWGKWVAEIRKPRSRARLWLGTFDTAEEAAM

AYDRQAFKLRGHSATLNFPEHFVWKESELHDSNSSDQKEPETPQPSEVNLESKELPVIDV 

GREEGMAEAWYNAITSGWGPESPLWDDLDSSHQFSSESSSSSPLSCPMRPFF* 

>G1879 (3..917)

AAATGCCCTTAGAGGCTGTCGTATACCCGCAAGATCCATTCGGATATCTCTCCAATTGCA 
AAGATTTTATGTTCCACGACTTATACTCTCAAGAAGAGTTCGTAGCTCAAGATACGAAGA 
ACAACATTGATAAGTTAGGGCATGAACAGAGCTTTGTGGAACAAGGTAAGGAGGACGATC 
ATCAATGGCGAGACTATCATCAGTATCCTTTGTTGATCCCTTCGTTGGGAGAAGAGCTTG 
GTCTTACCGCCATTGATGTGGAGAGTCATCCTCCTCCACAGCACCGGAGGAAGAGGAGGA 
GAACGAGAAACTGCAAGAACAAGGAAGAGATCGAGAACCAGAGAATGACTCACATCGCCG 
TCGAGAGAAATCGCCGGAAACAGATGAACGAGTATCTGGCTGTGCTCCGTTCTCTAATGC 
CGTCGTCGTATGCTCAAAGAGGAGATCAAGCGTCGATAGTAGGAGGAGCTATAAACTACG 
TGAAGGAGTTAGAGCATATTTTACAATCTATGGAGCCGAAGAGAACTAGGACTCATGATC 
CCAAAGGAGACAAGACTAGCACTAGCTCGTTAGTGGGTCCATTCACAGATTTTTTCAGCT 
TCCCACAATATTCTACAAAGTCATCATCAGATGTACCGGAAAGCTCATCTTCACCGGCGG 
AGATAGAGGTTACGGTGGCAGAAAGCCATGCGAACATCAAGATAATGACGAAGAAGAAAC 
CGAGGCAGCTTCTTAAGCTCATAACTTCTTTACAAAGCCTAAGGCTCACTCTTCTTCATC 
TCAATGTCACCACTCTCCACAACTCCATTCTCTACTCCATCAGCGTCAGGGTTGAAGAAG 
GAAGCCAACTGAATACCGTGGACGACATTGCAACAGCTTTGAATCAAACCATAAGGAGGA
TTCAAGAAGAGACA^AATTCAGCAAATAGATTATAATTAACTTGTTTTATl'Tri'Ari'T'i'A 
TTTTGAAATAACTGAAATCAGTTTTCTAATTTTTTTTTTTTO 

TCCCTATGTAAGTTGCATTTTTGTCTCTTGTAATGAATCAATGGTCATAAAGATCTGAAC 
AAAAAAATTGAATAAAAGAAAATGGTT 

>G1879 Amino Acid Sequence (domain in AA coordinates: 107-176) 
MPLEAVVYPQDPFGYLSNCKDFMFHDLYSQEEFVAQDTKNNIDKLGHEQSFVEQGKEDDH 
QWRDYHQYPLLIPSLGEELGLTAIDVESHPPPQHRRKRRRTRNCKNKEEIENQRMTHIAV 
ERNRRKQMNEYLAVLRSLMPSSYAQRGDQASIVGGAINYVKELEHILQSMEPKRTRTHDP
KGDKTSTSSLVGPFTDFFSFPQYSTKSSSDVPESSSSPAEIEVTVAESHANIKIMTKKKP 
RQLLKLITSLQSLRLTLLHLNVTTLHNS ILYS K 
QEET*

>G1888 (1..729) 

ATGAAGATTTGGTGTGCTGTTTGTGATAAAGAAGAAGCTTCGGTGTTTTGTTGTGCGGAT 
GAAGCAGCTCTTTGTAATGGTTGCGATCGCCATGTTCATTTCGCCAATAAACTAGCCGGG 
AAACATCTCCGGTTCTCTCTCACTTCTCCTACTTTCAAAGATGCTCCTCTTTGTGATATT 
TGCGGGGAGAGGCGTGCATTATTATTTTGCCAAGAAGACAGAGCAATACTATGCAGAGAA 
TGTGACATTCCAATACATCAAGCTAATGAGCACACTAAGAAACACAATAGATTCCTCCTT 
ACCGGCGTTAAGATCTCTGCCTCCCCGTCAGCCTACCCAAGAGCCTCCAATTCCAACTCT 
GCTGCTGCATTTGGTCGAGCCAAAACCCGACCAAAATCAGTATCGAGCGAGGTCCCGAGC 
TCGGCCTCCAATGAGGTATTTACGAGCTCTTCTTCGACGACCACGAGCAATTGCTATTAT 
GGGATAGAAGAAAACTACCATCACGTGAGCGATTCGGGGTCGGGATCGGGTTGTACAGGT 
AGTATATCCGAGTATTTGATGGAGACATTACCGGGTTGGAGAGTGGAGGATTTGCTTGAA 
CACCCTTCTTGTGTCTCCTATGAGGATAACATTATTACTAATAACAATAACAGTGAGTCT 
TATAGGGTTTATGATGGTTCTTCACAATTCCATCATCAAGGGTTTTGGGATCACAAACCC 
TTCTCTTGA 

>G1888 Amino Acid Sequence (domain in AA coordinates: 5-50)
MKIWCAVCDKEEASVFCCADEAALCNGCDRHVHFANKLAGKHLRFSLTSPTFKDAPLCDI 
CGERRALLFCQEDRAILCRECDIPIHQANEHTKKHNRFLLTGVKISASPSAYPRASNSNS
AAAFGRAKTRPKSVSSEVPSSASNEVFTSSSSTTTSNCYYGIEENYHHVSDSGSGSGCTG 
SISEYLMETLPGWRVEDLLEHPSCVSYEDNIITNNNNSESYRVYDGSSQFHHQGFWDHKP

FS* 

>G189 (34..987) 

CCACAACTCTCTCCTTGTAGAGAGAGAGATTTTATGGCGGTGGAGCTCATGACTCGGAAT 
TACATCTCCGGCGTCGGAGCTGATAGCTTCGCCGTTCAAGAAGCAGCTGCTTCAGGACTC 
AAAAGTATCGAAAATTTCATCGGTTTAATGTCTCGTGATAGCTTTAACTCTGATCAGCCA 
TCTTCTTCTTCCGCCTCCGCCTCCGCCTCCGCCGCCGCAGATCTTGAATCAGCTCGTAAC 
ACAACGGCGGACGCGGCTGTTTCAAAGTTTAAAAGAGTCATATCTCTCTTAGATCGAACT 
CGAACCGGACACGCCCGGTTTAGACGTGCTCCGGTTCATGTTATTTCTCCGGTTCTTTTA 
CAAGAAGAACCAAAAACGACGCCGTTTCAGTCTCCTGTTCCTCCTCCGCCGCAAATGATC 
CGAAAAGGTTCGTTTTCTTCATCGATGAAAACGATTGATTTCTCATCTCTCTCCTCTGTA
ACAACGGAATCAGACAACCAGAAGAAGATTCATCATCATCAACGTCCCTCTGAAACGGCG
CCGTTTGCGTCTCAAACTCAAAGCCTCTCCACGACGGTCTCGTCTTTCTCAAAATCAACA
AAGAGAAAATGTAACTCTGAGAATCTTCTCACCGGAAAATGCGCTTCCGCTTCTTCCTCC 
GGTCGTTGTCATTGCTCGAAGAAAAGAAAGATAAAACAGAGGAGAATAATTAGGGTTCCG 
GCGATAAGTGCAAAAATGTCCGATGTACCACCGGACGATTATTCATGGAGGAAATACGGA 
CAAAAACCAATTAAAGGATCTCCACATCCAAGAGGATATTATAAGTGTAGTAGCGTAAGA 
GGTTGTCCAGCACGTAAACATGTTGAGAGAGCAGCTGATGATTCGTCCATGTTGATTGTT 
ACTTATGAAGGAGATCATAATCATTCTCTCTCCGCCGCTGATCTCGCCGGAGCCGCCGTT 
GCTGATCTTATTTTGGAATCGTCTTGAAAAGAACAAATCTTTATTTAAGGCTTTTATAAT 
ATAAATTTAGATCCTTACTTAGTGAAGTACTCAAACTATGAATGAAATCAATGTAATCAA 
AATCAAAAAGCTTTTGCTAAAAAAAAAAAAAAAAA 

>G189 Amino Acid Sequence (domain in AA coordinates: 240-297) 
MAVELMTRNYISGVGADSFAVQEAAASGLKSIENFIGLMSRDSFNSDQPSSSSASASASA 
AADLESARNTTADAAVSKFKRVISLLDRTRTGHARFRRAPVHVISPVLLQEEPKTTPFQS
PLPPPPQMIRKGSFSSSMKTIDFSSLSSVTTESDNQKKIHHHQRPSETAPFASQTQSLST 
TVSSFSKSTKRKCNSENLLTGKCASASSSGRCHCSKKRKIKQRRIIRVPAISAKMSDVPP 
DDYSWRKYGQKPIKGSPHPRGYYKCSSVRGCPARKHVERAADDSSMLIVTYEGDHNHSLS
AADLAGAAVADLILESS*
>G1939 (92..844)

AATCATTAGCTTCTTCTCTTCTCTTCTCTCACAGAGAGAGTAATCACAAGCCAAGTGAGA 
AAAAGAAAACACTAAACCCAGATCGAAAACCATGTCTATTAACAACAACAACAACAACAA 
CAACAATAACAACGATGGTCTTATGATCTCATCTUyVCGGAGCTTTAATCGAACAAC^ACC 
ATCAGTCGTTGTGAAGAAACCACCGGCGAAAGATCGACATAGCAAAGTCGATGGAAGAGG 
GAGAAGAATCCGTATGCCGATTATATGTGCTGCTCGTGTTTTTCAGCTAACGAGAGAGCT 
TGGTCATAAGTCAGATGGCCAAACAATTGAATGGTTACTTCGTCAAGCAGAGCCTTCTAT 
TATAGCTGCAACAGGAACTGGTACAACTCCAGCGAGTTTCTCAACTGCTTCTGTCTCTAT 
CCGTGGAGCCACCAATTCTACTTCTTTAGATGATAAACCCACTTCTTTACTTGGTC 
GTCACCGTTTATACTTGGGAAACGTGTTAGAGCTGATGAGGATAGTAATAATAGTCATAA
TCATAGTTCTGTTGGTAAAGATGAGACCTTTACGACAACACCAGCTGGGTTTTGGGCTGT 
TCCGGCGAGGCCGGATTTTGGACAAGTTTGGAGTTTTGCTGGAGCTCCACAAGAGATGTT 
TTTACAACAACAACATCATCATCAGCAACCATTGTTTGTTCATCAGCAACAGCAACAACA 
AGCTGCAATGGGTGAAGCTTCTGCTGCTAGAGTTGGGAATTATCTTCCGGGTCATCTTAA 
TTTGCTTGCTTCTTTATCCGGTGGATCTCCCGGGTCGGATCGAAGAGAGGAAGATCCACG 
TTAATGGTTTAAGCCCTTTTAGGTTTGAGGGCAAAATTTGGTATATATATTTATTATCTT 
CTCTTCTCTATTGTTGTCATTGTTTCTCTATGTGTGTGTTTTAGTGTTGTTAGAGATTGA 
TTTGGTTTCAGAATCTCTGCAAGTGATTTGAGAGTTTTCGTTAGCTTTAAGTAAGTTAAA 
GACGGTTGTTTTTGATTAGGGTTAAATTAGGGTTTAAGAATCTGTTGTTTTTTTGGAGGG 
AGATCGATTTCTTATCGGATCCAAGATTACTTTTAGGAAAAAAGGGAAAATTTCAGAAAC 
CACGGTGGTTTCTTTTCCTCTTTTTTTTTTTG 

>G1939 Amino Acid Sequence (domain in AA coordinates: 40-102)
MS ItfNNNNNNNNX^ 

ARVFQLTRELGHKSDGQTIEWLLRQAEPS I IAATGTGTTPASFSTASVS IRGATNSTSLD 

HKPTSLLGGTSPFILGKKVRADEDSNNSHNHSSVGKDETFTTTPAGFWAVPARPDFGQVW 

SFAGAPQEMFLQQQHHHQQPLFVHQQQQQQAAMGEASAARVGl^LPGHLNLLASLSGGSP 

GSDRREEDPR* 

>G194 (192..1205)

TCTTTCTTCTCTCTCTATCTCTCCTCTTTGAACCCTAAAAACTCTTTCTTTACAAGGATT 
GATCTTTTTGTATTTTTGATTTTGACATTTGCTTTGTGTTCGATCTCTGTTTTGATGCGA 
TTTCTCTGTTTTTAAAGCCATTTGATAGATTGTTTCCGGTAAAGCTCAGCGAGAGAAGAA 
GAAGAACAACAATGGAGTTTACAGATTTCTCAAAGACGAGTTTTTACTACCCGTCGTCAC 
AAAGCGTTTGGGATTTCGGAGATTTAGCGGCGGCGGAGAGGCATTCTTTAGGGTTCATGG 
AGTTATTAAGTTCTCAGCAGCATCAAGACTTTGCTACTGTTTCTCCTCATTCCTTCCTTC 
TCCAAACGTCTCAACCGCAAACGCAAACGCAACCATCGGCGAAGCTGTCTTCAAGTATCA 
TTCAAGCTCCACCGTCAGAGCAATTAGTGACGTCAAAGGTGGAGTCTTTGTGTTCGGATC 
ATTTGTTGATAAACCCACCGGCGACTCCTAACTCGTCATCGATTTCGTCTGCTTCAAGCG 
AGGCTCTAAATGAAGAGAAACCGAAAACAGAAGACAATGAAGAAGAAGGAGGTGAAGATC 
AACAAGAGAAGAGTCATACTAAGAAACAGTTGAAAGCAAAGAAGAATAATCAGAAGAGAC 
AGAGAGAGGCAAGAGTCGCATTCATGACAAAGAGTGAAGTTGATCATCTCGAAGATGGTT 
ATCGCTGGCGAAAATATGGTCAAAAAGCTGTCAAAAACAGTCCTTTTCCCAGGAGTTACT 
ACCGTTGCACAACGGCTTCATGTAACGTGAAGAAGAGAGTGGAGAGATCATTCAGAGATC 
CAAGCACTGTGGTTACAACCTACGAAGGTCAACACACTCACATTAGTCCACTCACGTCTC 
GTCCTATTTCCACTGGAGGTTTCTTCGGATCGTCAGGAGCTGCTTCGAGTCTCGGTAATG 
GTTGCTTTGGGTTTCCTATTGATGGCTCCACGTTAATCTCTCCTCAGTTCCAACAGCTTG 
TCCAATACCATCACCAACAGCAGCAACAAGAACTCATGTCI^GTTTTGGAGGAGTCAACG 
AGTACCTTAATAGCCACGCTAATGAGTATGGTGATGATAATCGTGTGAAGAAGAGTCGAG 
TTTTGGTTAAAGATAATGGACTTCTGCAAGATGTTGTTCCGTCTCATATGTTGAAGGAAG 
AGTAGTAGTATATATATAGTCTTATAGTTTTAATCTAGTTTTTTTTTGTATAATTGTCTA 
AAAGAAACGGATCTTTTGTTCTGATGAAGAAGATGTTTTCTTATGGTTCTGAAATCGTAA 
GGTAATGATGATTGTACCAAGCCGAGAAAGTACTTGTGATTTTCACCATTGAATCACTAT 
AAATGTAATTTTTATTTACTGTGAAAAAAAAAAAAAAA 

>G194 Amino Acid Sequence (domain in AA coordinates: 174-230) 

MEFTDFSKTSFYYPSSQSVWDFGDLAAAERHSLGFMELLSSQQHQDFATVSPHSFLLQTS 

QPQTQTQPSAKLSSSIIQAPPSEQLVTSKVESLCSDHLLINPPATPNSSSISSASSEALN 

EEKPKTEDNEEEGGEDQQEKSHTKKQLKAKKNWQKRQREARVAFMTKSEVDHLEDGYRWR 

KYGQKAVKNSPFPRSYYRCTTASCNVKKRVERSFRDPSTVVTTYEGQHTHIS 

TGGFFGSSGAASSLGNGCFGFPIDGSTLISPQFQQLVQYHHQQQQQELMSCFGGVNEYLN 

SHANEYGDDNRVKKSRVLVKDNGLLQDVVPSHMLKEE*

>G1943 (137..1858)

A(^TTTGTTTCTAATCTCAGACATAAATAATTTTTGTTCCCGACTTCAAAAC 

ATTATATGATTCCACATTCATTTTCTTCTACTTCrrTCCTTCTCCTTG 

AGAAAATCCATCTATCATGGGTGAAGATGATATAGTGGAGCTCTTATGGAAGAGTGGCCA 

AGTCGTTAGAACCAGTCAAACACAGAGACCCTCCTCCAATACACCACCATCTCTTCCTCC

ACCACCCATTCTTCGTGGTAGCGGAAGCGGCAACGGAGAAGAAAATGCCCCGCTTCCACT 

TCCACAGCCTTCACCTCCCCTCCATCATCAGAATCTTTTCATTCTGGAAGACGAAATGTC
TTCTTGGCTTCACCATTCTCACCCCGGCGTTACGTCCACCCCGGCTTCTTCTGTCTCCCT

GCCACCACCACCCAATGCTCCGCGTGAAGATGATATAGTGGAGCTTTTATGGCAAAGCGG 

CCAAGTAGTTGGAACCAACCAAACACATAGACAATCCTACGATCCTCCTCCCATTCTCCG 

CGGCAGCGGAAGTGGCAGAGGAGAAGAAAATGCTCCCCTTTCACAACCTCCGCCTCACCT 

GCATCAGCAAAATCTCTTCATTCAAGAAGGCGAAATGTATTCGTGGCTACACCATTCTTA 

CCGCCAAAACTATTTCTGCTCAGAACTTCTCAACTCCACTCCGGCTACTCACCCGCAAAG 

TTCCATCTCTCTGGCACCACGTCAGACTATCGCCACGAGAAGGGCGGAAAACTTTATGAA 

CTTCTCGTGGCTAAGAGGGAACATATTTACCGGCGGTAGAGTTGATGAAGCTGGACCGTC 

GTTTTCGGTGGTAAGAGAATCGATGCAGGTAGGCTCGAACACGACCCCCCCTTCTTCTTC 

TGCCACTGAATCATGTGTAATACCAGCTACAGAGGGCACCGCGAGTCGAGTGTCGGGAAC

TTTGGCAGCTCATGATCTTGGTCGGAAGGGAAAGGCGGTGGCGGTTGAGGCGGCCGGAAC 

ACCATCTTCAGGAGTGTGCAAGGCCGAAACAGAGCCGGTTCAGATACAACCAGCAACGGA 

GTCGAAGCTAAAAGCGAGAGAAGAAACCCATGGAACTGAAGAAGCTCGTGGTTCAACGTC 

TAGAAAGAGATCACGAACTGCAGAAATGCATAACCTCGCCGAAAGGAGAAGGAGAGAAAA 

GATCAACGAGAAGATGAAGACTCTGCAACAACTCATTCCTCGCTGCAACAAGGTTGAATC 

TGATTCTGTTTCTACTCTGATCAGTCTACTAAAGTTTCAACGCTGGATGATGCTATCGAG 

TACGTCAAATCGTTACAGAGCCAAATACAAGTATGCTCTTCAAAACAGAATGTGTTTTAA 

ACCAATGGTTCAACATGGAAAGAGTTCATATGTATCTAGTTTTGTTGAGATGATGTCGAC 

GGGACAGGGTATGATGTCGCCAATGATGAATGCCGGGAATACGCAACAGTTCATGCCCCA 

TATGGCCATGGATATGAACCGACCTCCTCCATTCATACCTTTCCCCGGCACATCTTTTCC 

TATGCCGGCTCAAATGGCAGGTGTAGGTCCATCATATCCAGCACCGCGCTACCCTTTTCC 

CAACATTCAGACCTTTGACCCATCCAGAGTCCGTTTACCAAGCCCGCAGCCTAACCCGGT 

GTCGAACCAGCCTCAGTTTCCGGCTTACATGAATCCCTATAGCCAGTTTGCTGGTCCCCA 

CCAGTTGCAACAACCTCCTCCTCCTCCATTTCAGGGTCAAACAACATCACAACTGAGTTC 

CGGGCAGGCAAGTAGTAGCAAGGAACCTGAGGATCAGGAGAACCAACCAACAGCTTAGTT 

AAAGTGTGGAGCTGAAACGGATCAGTTCTTCAAGCAAATTACAACTTTGAAGATAAACCA 

GAGTTGTAACATGTAGATTTTGTCTGTTAAGTTTAATGTAAGTACTTTTTAGTTAATGGG 

AAAGATACTGACAGGTTGCAAGGTGGTCAGTATTTGTGCATCACGCTTAAGATTCCTCGA 

TGTGGCCAGTATCTCCCTTTTCTAGCATGTGAGGTCGCTACTCTCTGGTTCTACGGAGAC 

CAAATGTTCGACTGATTAAACACACAATGACTTACCAAAAGTACACGCGGCCCATCCTCG 

TCTTTATGTTCCAAGTGCGACTGTTTGTTTATTTGTAAGCATTTTTCTTATAATAATAAA 

ACAGCTCTATCTTCGTTAAAAAAAA 

>G1943 Amino Acid Sequence (domain in AA coordinates: 335-406) 

MGEDDIVELLWKSGQVVRTSQTQRPSSNTPPSLPPPPILRGSGSGNGEENAPLPLPQPSP

PLHHQNLFILEDEMSSWLHHSHPGVTSTPASSVSLPPPPNAPREDDIVELLWQSGQVVGT

NQTHRQSYDPPPILRGSGSGRGEENAPLSQPPPHLHQQNLFIQEGEMYSWLHHSYRQNYF 

CSELLNSTPATHPQSSISLAPRQTIATRRAENFMNFSWLRGNIFTGGRVDEAGPSPSVVR 

ESMQVGSNTTPPSSSATESCVIPATEGTASRVSGTLAAHDLGRKGKAVAVEAAGTPSSGV

CKAETEPVQIQPATESKLKAREETHGTEEARGSTSRKRSRTAEMHNLAERRRREKINEKM 

KTLQQLIPRCNKVESDSVSTLISLLKFQRWMMLSSTSNRYRAKYKYALQNRMCFKPMVQH

GKSSYVSSFVEMMSTGQGMMSPMMNAGNTQQFMPHMAMDMNRPPPFIPFPGTSFPMPAQM 

AGVGPSYPAPRYPFPNIQTFDPSRVRLPSPQBNPVSNQPQFPAYMNPYSQFAGPHQLQQP 

PPPPFQGQTTSQLSSGQASSSKEPEDQENQPTA* 

>G21 (79..966)

TGTGGAGGAATATTAATACAGCCCACTTCACATCTATTTTTGTGCAACCATCTCTCTAAA 
GCTTCTTCTCTCATAACAATGGCAAGACAAATCAACAT^ 

ACCTTTATCTCCTCCGCCATCCCCGCCGTATCTTCCTCCTCCTCCATCACCGCTTCCGCC 
TCATTGTCCTCTTGACCTACTACATCTTCCTCTTCTTCGTCATCAACAAATTCTAACTTC 
ATTGAGGAAGACAACTCTAAAAGAAAAGCATCTCGAAGATCATTGTCATCGTTAGTCTCC 
GTTGAAGACGATGATGATCAAAACGGTGGAGGTGGGAAACGGCGAAAGACCAACGGTGGA 
GATAAACATCCGACGTATAGAGGAGTGAGGATGAGGAGTTGGGGAAAATGGGTGTCGGAG
ATTAGAGAGCCGAGAAAGAAATCAAGAATCTGGCTCGGGACTTATCCAACGGCTGAGATG 
GC^GCTCGAGCTCATGACGTAGCGGCTTTAGCCATTAAAGGTACAACGGCTTACCTO^T 
TTTCCCAAGTTAGCCGGCGAGCTTCCTCGTCCAGTC^ 

GCCGCCGCCTCTTTAGCGGCCGTTAACTGGCAAGATTCGGTCAACGATGTGAGTAATTCT 
GAAGTGGCTGAAATAGTTGAAGCCGAGCCGAGTCGAGCCGTGGTGGCTCAGTTGTTTTCT 
TCGGACACAAGCACGACGACGACGACTCAGAGTCAAGAGTATTCGGAAGCTTCGTGTGCT 
TCGACTTCGGCGTGTACGGACAAAGACAGTGAGGAAGAGAAGCTGTTTGATTTGCCGGAT
TTGTTTACCGATGAGAATGAGATGATGATACGAAACGATGCGTTTTGCTACTACTCGTCC 

GAATGACTAAAGTACCCCTCTCGAGAGAGCTCTCACTAACACT 

>G21 Amino Acid Sequence (domain in AA coordinates: 97-164) 

MARQINIESSVSQVTFISSAIPAVSSSSSITASASLSSSPTTSSSSSSSTNSNFIEEDNS 

KRKASRRSLSSLVSVEDDDDQNGGGGKRRKTNGGDKHPTYRGVRMRSWGKWVSEIREPRK 

KSRIWLGTYPTAEMAARAHDVAALAIKGTTAYLNFPKLAGELPRPVTNSPKDIQAAASLA

AVNWQDSVNDVSNSEVAEIVEAEPSRAVVAQLFSSDTSTTTTTQSQEYSEASCASTSACT

DKDSEEEKLFDLPDLFTDENEMMIRNDAFCYYSSTWQLCGADAGFRLEEPFFLSE* 

>G2132 (42..1031)

ATTCTGTTACTTAGTACCGGAGTTTAGTCGGAGAGAGAACAATGATCAGTTTCAGAGAAG 

AGAACATCGATCTCAACTTGATTAAAACAATTAGTGTAATCTGTAATGATCCAGACGCCA 

CCGATTCCTCTAGCGACGATGAATCTATCTCCGGCAATAATCCTCGCCGTCAGATCAAAC 

CAAAACCACCGAAACGTTACGTCTCAAAGATCTGTGTCCCGACGCTGATCAAAAGGTATG 

AGAACGTTTCGAATTCTACAGGGAATAAAGCAGCCGGAAACCGGAAAACGTCGTCGGGTT 

TCAAAGGCGTACGACGGAGGCCGTGGGGGAAATTTGCGGCGGAGATAAGAAATCCGTTTG 

AGAAGAAGAGAAAGTGGCTTGGAACGTTTCCTACTGAAGAAGAAGCAGCAGAAGCTTACC 

AAAAGAGTAAAAGAGAGTTTGATGAACGATTGGGTTTAGTTAAACAGGAAAAAGACCTAG 

TAGATTTGACCAAGCCGTGCGGTGTACGTAAACCAGAAGAGAAGGAAGTTACTGAGAAGT 

CGAATTGCAAAAAGGTAAATAAGAGAATTGTTACTGATCAGAAGCCATTTGGTTGTGGTT 

ATAACGCTGATCATGAAGAAGAGGGAGTGATTAGTAAAATGTTGGAAGATCCGTTGATGA 

CATCGTCAATTGCTGATATTTTTGGTGATTCGGCTGTTGAAGCAAATGATATTTGGGTGG 

ATTACAATTCAGTGGAATTTATTTCCATTGTAGATGATTTCAAGTTTGATTTTGTGGAGA

ATGATAGAGTAGGAAAGGAGAAAACATTTGGATTTAAGATTGGGGATCACACTAAAGTTA 

ATCAACATGCCAAAATCGTATCGACCAATGGGGACTTATTCGTCGATGATTTACTTGATT 

TTGATCCGTTGATAGATGATTTTAAGTTAGAAGATTTTCCTATGGATGATCTTGGATTAT 

TAGGAGATCCAGAGGATGATGATTTTAGTTGGTTTAATGGTACTACTGATTGGATCGATA 

AGTTTTTATGAATACTTTCTTGACACGGCCAACGGTATTAGTAC 

>G2132 Amino Acid Sequence (domain in AA coordinates: TBD) 

MISFREENIDLNLIKTISVICNDPDATDSSSDDESISGNN^ 

TLIKRYENVSNSTGNKAAGimKTSSGFKGV^ 

EAAEAYQKSKREFDERLGLVKQEKDLVDLTKPCGVRK^^ 

KPFGCGYNADHEEEGVISKMLEDPLMTSSIADIFGDSAVEANDIWVDYNSVEFISIVDDF
KFDFVENDRVGKEKTFGFKIGDHTKVNQHAKIVSTNGDLFVDDLLDFDPLIDDFKLEDFP 
MDDLGLLGDPEDDDFSWFNGTTDWIDKFL* 
>G2145 (1..777) 

ATGGACGTTTTTGTTGATGGTGAATTGGAGTCTCTCTTGGGGATGTTCAACTTTGATCAA 
TGTTCATCATCTAAAGAGGAGAGACCGCGAGACGAGTTGCTTGGCCTCTCTAGCCTTTAC 
AATGGTCATCTTGATCAACATCAACACCATM 

TTCTTGCTCCCTGATATGTTCCCATTTGGTGCAATGCCGGGAGGAAATCTTCCGGCCAT^ 

CTTGATTCTTGGGATCAAAGTCAT<^CCTCCAAGAAACGTCTTCTCTTAAGAGGAAACTA 

CTTGACGTGGAGAATCTATGCAAAACTAACTCTAACTGTGACGTCACAAGACAAGAGCTT 

GCGAAATCCAAGAAAAAACAGAGGGTAAGCTCGGAAAGCAATACAGTTGACGAGAGCAAC 

ACTAATTGGGTAGATGGTCAGAGTTTAAGCAACAGTTCAGATGATGAGAAAGCTTCGGTC 

ACAAGTGTTAAAGGCAAAACTAGAGCCACCAAAGGGACAGCC^CTGATCCTCT^AAGCCTO 

TATGCTCGGAAACGAAGAGAGAAGATTAACGAAAGGCTCAAGACACTACAAAACCTTGTG

CCAAACGGGACAAAAGTCGATATAAGCACGATGCTTGAAGAAGCGGTCCATTACGTGAAG 

TTCTTGCAGCTTCAGATTAAGTTGTTGAGCTCGGATGATCTATGGATGTACGCACCATTG 

GCTTACAACGGCCTGGACATGGGGTTCCATCACAACCTTTTGTCTCGGCTTATGTGA 

>G2145 Amino Acid Sequence (domain in AA coordinates: 166-243)

MDVFVDGELESLLGMFNFDQCSSSKEERPRDELLGLSSLYNGHLHQHQHHNNVLSSDm 

FLLPDMFPFGAMPGGNLPAMLDSWDQSHHLQETSSLKRKL^ 

AKSKKKQRVSSESNTVDESNTNWVDGQSLSNSSDD^ 

YARKRREKINERLKTLQNLVPNGTKVDISTMLEEAVHYVKFLQLQIKLLSSDDLWMYAPL

AYNGLDMGFHHNLLSRLM* 

>G23 (22..732)
TATCAAACGAGAGTACAAAAGATGACGTCACTCAACAGCTCTGCATCACCAACATCATCG
TCATCAGACCAATCTGATGCAACTACTACAACAAGCACCCACTTGTCTGAAGAAGAAGCT 
CCACCCAGAAACAACAACACAAGAAAGAGAAGGAGAGATTCTTCTTCTGCTTCTTCATCT 
TCTTCAATGCAACATCCTGTTTACAGAGGTGTGCGGATGAGAAGTTGGGGCAAATGGGTC 
TCCGAGATCCGACAACCTCGTAAGAAAACTCGTATTTGGCTCGGCACTTTTGTCACCGCT 
GATATGGCTGCTCGTGCTCACGACGTCGCTGCTCTCACCATCAAAGGCTCCTCCGCCGTC 
TTAAATTTCCCTGAGCTTGCTTCTCTCTTCCCTCGTCCGGCGTCATCATCGCCGCATGAT 
ATCCAGACAGCCGCCGCAGAAGCCGCCGCCATGGTGGTCGAAGAAAAACTGTTAGAGAAG
GATGAGGCTCCGGAGGCCCCACCTTCGTCGGAATCTTCTTACGTGGCGGCGGAGTCAGAG 
GATGAGGAGAGGTTGGAGAAAATTGTGGAGCTGCCTAACATTGAAGAAGGAAGTTATGAC 
GAGAGTGTGACATCACGTGCTGATCTGGCTTATTCTGAGCCGTTCGATTGTTGGGTGTAT 
CCTCCGGTTATGGATTTTTATGAAGAAATATCGGAGTTTAATTTCGTGGAATTGTGGAGC 
TTTAATCACTAATTAAGTTAGGAAAGTGCATTATATTGCAATATTGCATCATAGATAACA 
TTTGTATTTCTTTTCTTTTTGTACGGATACGTAGCATATGCTACTATACTAGGGCTAGTG 
TACCAAATATTGTAAAATATACTTATTAATATTTATGTAAATGTGTAATATATATAACAT 
ACAATTATTGTAAGTTTGGAAATTGGAAACTATCGTTACGCAATGTTCTTGTAAAAAAAA 

AAAAAAAAAA 

>G23 Amino Acid Sequence (domain in AA coordinates: 61-117) 

MTSLNSSASPTSSSSDQSDATTTTSTHLSEEEAPPRNNNTRKRRRDSSSASSSSSMQHPV 

YRGVimRSWGK^SEIRQPRKKTRlWLGTFVTADMAARAHDVAALTI 

SLFPRPASSSPHDIQTAAAEAAAMVVEEKLLEKDEAPEAPPSSESSYVAAESEDEERLEK

IVELPNIEEGSYDESVTSRADLAYSEPFDCWVYPPVMDFYEEISEFNFVELWSFNH* 

>G2313 (104..724)

CGTCGACACAATCGCTCTTCCGTAACATATTCCACAAAACGATCTTCTTGTTTCTTGAAT 
TTTTAGCCATCTCTTTTTTTTTTTTCTCATTTTCTCGGATACTATGGCTTCGAGTCCACG 
CTGGACGGAGGACGACAACAGGCGTTTTAAGTCAGCTCTGTCGCAATTCCCTCCGGATAA 
CAAGCGTTTGGTGAATGTCGCCCAGCATCTGCCGAAACCTTTGGAGGAGGTGAAGTACTA 
CTACGAAAAGTTGGTCAACGATGTTTATCTGCCGAAACCTTTAGAGAATGTCACCCAGCA 
TCTGCAGAAACCTATGGAAATGGAGGAGATGAAGTACATGTACGAAAAGATGGCCAACGA 
TGTTAATCAGATGCCCGAGTACGTACCACTGGCGGAATCGAGTCAGTCCAAACGCAGGAA 
GAAGGATACGCCAAATCCTTGGACAGAAGAGGAACACAGATTGTTTCTGCAAGGATTGAA 
AAAGTATGGGGAAGGAGCTTCGACGTTGACATCAACGAATTTTGTGAAGACAAAGACTCC 
ACGGCAAGTGTCAAGCCATGCACAGTATTACAAAAGGCAAAAATCGGACAATAAGAAGGA 
GAAACGCCGGAGTATTTTTGACATAACTTTGGAGTCTACCGAGGGCAATCCAGATTCTGG 
AAATCAGAACCCTCCGGATGATGATGATCCGTCCCAAGGTCAAGGCACTTGTCTTGGAGT 
TTAGATGTTGGAAGATAGAAGAATGGTGTGAAAGC 

>G2313 Amino Acid Sequence (domain in AA coordinates: TBD)

MASSPRWTEDDNRRFKSALSQFPPDNKRLVNVAQHLPKPLEEVKYYYEKLVNDVYLPKPL 

E^^\F^QHLQKPMEMEEMKYWYEKMANDW 

FLQGLKKYGEGASTLTSTNFVKTKTPRQVSSHAQYYKRQKSDNKKEKRRSIFDITLES 

GNPDSGNQNPPDDDDPSQGQGTCLGV* 

>G2344 (1..573) 

ATGACTTCTTCAATCCATGAGCTTTCTGATAACATTGGAAGTCATGAGAAGCAAGAACAG 
AGAGATTCTCATTTCCAACCACCAATCCCTTCTGCAAGAAATTATGAATCAATTGTTACA 
AGTTTAGTCTACTCAGACCCGGGGACTACAAATTCCATGGCACCTGGACAATATCCATAT 
CCAGATCCTTACTACAGAAGCATATTTGCACCGCCTCCACAACCGTATACCGGGGTACAT 
CTACAGTTGATGGGAGTGCAGCAACAAGGCGTTCCTTTACCATCTGATGCAGTCGAGGAA 
CCTGTTTTTGTTAAGGCAAAGCAATACCACGGTATACTAAGGCGCAGACAATCAAGAGCA 
AGACTTGAGTCTCAGAATAAAGTCATCAAGTCACGTAAGCCGTATTTGCATGAATCTCGG 
CATTTGCATGCGATAAGACGACCAAGAGGATGTGGCGGGCGGTTTCTAAATGCCAAGAAG 
GAGGATGAGCATCACGAAGACAGTAGTCATGAAGAAAAATCCAACCTTAGCGCTGGTAAA 
TCCGCCATGGCTGCTTCTAGTGGTACATCTTGA 

>G2344 Amino Acid Sequence (domain in AA coordinates: TBD) 
MTSSIHELSDNIGSHEKQEQRDSHFQPPIPSARNYESIVTSLVYSDPGTTNSMAPGQYPY 
PDPYYRSIFAPPPQPYTGVHLQLMGVQQQGVPLPSDAVEEPVFVNAKQYHGILRRRQSRA 
RLESQNKVIKSRKPYLHESRHLHAIRRPRGCGGRFLNAKKEDEHHEDSSHEEKSNLSAGK
SAMAASSGTS* 
>G2430 (69..1907)

AACTTCAACATACACATAATCTCTCACTTAAAAATATCTCTCTCTCTCTCTCTACAAAAT 
CAATTCCAATGTTGGTGGGAAAGATAAGTGGATATGAAGATAATACTCGCTCTTTGGAGC 

gagaaacatctgaaatcacttctcttctcagccaatttccggggaatS 

TTGTTGACACCAATTTCACCACTCTACTCAACATGAAACAAATCATGAAACAATACGCTT 

ATCAAGTGTCTATTOAGACAGATGCAGAAAAAGCTCTTGCGTTTTTGA^ 

ATGAAATCAATATTGTGATTTGGGA^^ 

TCAA^CRTTACTTC!ASAGTTGGATTIMCTGTAGTW 

CGGAATCTGTGATGAAAGCAACATTTTACGGTGCTTGTGACTATGTTGTGAAArcGGTTA 
AAGAAGAGGTAATGGCCAATATATGGCAACACATTGTACGGAAGAGG?^ 

CGGATGTTGCTCCACCGGTTCAATCAGATCCGGCTCGCTCTGACCGTTTAGACCAAGTCA 
AAGCTGATTTCAAGATCGTAGAAGATGAACCA^^^ 

TGAATGGCTATTCCCCAATCATGAACC^\AGATAACATGTTCAACAAA^ACCACCTAAAC 

SS^ GACGTGGACAGAAGTTA " CAACC ^ 
^ CGGCCAACTC ^ TGACTATTCCCAAATCATG ^ C ^ 

CAGCAACCAAACCACAATTGACGTGGACCGAAGAAATTCAACCGGTTCAATCAGGTCTGG 
^AGCCAACGAGTTCAGCAAAGTGAATGGA^^ 

TGTTCAACAAATCAGCAACCAACCCGCGATTGACATGGAACGAATTAOTTcS 

aatcagatctggttcaatccaatgagtttagccaattcagtgaS^ot^Stca 

™ CC ^ GTTCAATCA ^^^ 

ccataaccataaacggaggtaacggcatacaaaacatggaaaagaaacaaggaa^^ 

CACGGAAGCCGCGGATGACGTGGACCGAAGAGCTTCACCAAAAArTTCTGGAAGC^G 

aaataattggtggtatcgaaaaagctaacccaaaggtacttgtcgaatS 

TGAGGATAGAAGGAATTACTAGAAGCAATGTGGCAAGTCATCTTCAGAAACACCGTATCA 

atcttgaagaaaaccaaattcctcaacaaacacaagggaatgS^ 

GTACACTAGCTCCCTCTCTCCAAGGTTCAGACAATGTCAACACAACAATAC^CATCOT 

^aatggtccagccactttgaaccaaa^ 
^aatgaacaacaaccagatcmaaccaa^ 

ATCATCACCAACAGCT^CATCAGTCTTCTCCTCAATTTAATTACCTGATGAACAAT 
AA ^^ CAAGCCTCTGGC ^^ 

CATATGATCCACAAGAGTATCTAATCTUVTGGCTACAATTATAATTAGTCAT^ 



>G2430 Amino Acid Sequence (domain in AA coordinates: 425-478)

QV 



>G2517 (66..895)

^ C ^ CA ^ GTCTCTCTOTTOCTCTAACCAT ^TCTCTTTGATCTCTTTCTCTGTGTTT 

^^^^^^^^^^^ 

CGAA^Spa^^ CAAAGAAGCGGm ^ TOCACT 

CG ^ ARGA ^ GTGCCA ^ 

ATGG ™TAAGTC3GAGAAAATACGGTCAA^ 

ATTATTACCGTTGCACAACAACTTGGTGTGACGTGAAGAA^AGAGTA^ 
GTGATCCAAGCAGTGTAATCACCACTTACGAAGGTCAACATACTCATCCTCGTCCACTAC

TCATCATGCCCAAAGAAGGCAGCTCTCCATCCAATGGCTCAGCTTCTAGGGCCCACATTG 

GCCTCCCTACACTCCCTCCTCAGCTTTTAGATTACAACAACCAACAACAACAAGCGCCGT 

CTTCTTTTGGAACCGAGTACATTAACAGGCAAGAAAAAGGAATTAATCATGATGATGATG 

ACGATCATGTTGTGAAGAAGAGTCGAACTCGGGATCTGCTGGATGGAGCTGGTTTAGTCA 

AAGATCATGGCCTTCTTCAGGATGTTGTTCCCTCTCATATCATTAAGGAAGAGTATTAGT 
TAATCGCATAATTATGTAGCTAGCTAGCTAG 

>G2517 Amino Acid Sequence (domain in AA coordinates: TBD)
MENVGVGMPFYDLGQTRVYPLLSDFHDLSAERYPVGFMDLLGVHRHTPTHTPLMHFPTTP 
NSSSSEAVNGDDEEEEDGEBQQHKTKKRFKFTKMSRKQTKKKVPKVSFITRSEVLHLDDG 
YKWRKYGQKPVKDSPFPRNYYRCTTTWCDVKKRVERSFSDPSSVITTYEGQHTHPRPLLI 

MPKEGSSPSNGSASRAHIGLPTLPPQLLDYNNQQQQAPSSFGTEYINRQEKGINHDDDDD 

HVVKKSRTRDLLDGAGLVKDHGLLQDVVPSHIIKEEY*

>G2521 (103..768)

ATTCTCCACAATTTCATAACTTTCTTCCGCTCAACTTCAGATAAATTCGGATTCTGTAGC 

TCTTTCAATACGACTGCGGAGATCAGAGCCAATTATTTGGTTATGGCGTCTCTGATCTCA 

GATATTGAACCGCCGACGAGTACTACTTCAGATCTCGTTCGGAGAAAGAAGAGATCCTCT 

GCTTCATCCGCCGCATCGTCTCGTTCAAGCGCATCTTCCGTCTCCGGTGAGATTCACGCG 

CGATGGCGATCGGAGAAGCAACAACGGATCTACTCAGCCAAACTGTTCCAAGCGCTCCAA 

CAAGTCCGCCTCAACTCTTCCGCCTCAACATCATCATCTCCAACGGCTCAGAAACGAGGA 

AAGGCCGTCCGTGAAGCCGCCGATCGAGCTCTTGCCGTTTCCGCTCGGGGAAGAACACTC 

TGGAGCAGAGCGATCTTAGCTAATCGGATCAAACTGAAATTTCGTAAACAGAGACGTCCT 

CGAGCTACGATGGCGATTCCGGCCATGACTACGGTGGTTAGTAGCAGCAGCAACAGATCG 

AGAAAACGGAGAGTGTCGGTGTTGAGATTGAATAAGAAGAGTATACCGGATGTTAACCGG 

AAAGTACGTGTTCTAGGCCGGTTAGTTCCCGGTTGCGGTAAACAATCCGTACCGGTGATT 

CTAGAAGAAGCAACTGATTATATTCAGGCTCTGGAGATGCAAGTGAGAGCCATGAACTCT 

TTAGTTCAGCTTCTCTCCTCCTACGGCTCAGCTCCTCCACCGATTTGATGAGGTTAAAAT 

CGTCTTTTTAATTCTACCATCTCTCGATCTTTCACAGCTTATGTGTATATAGAAGATTCG 

GTTTGATTATAATCTGTAACTACTCTTCCCAACCGCTGATTCTTCTCTGCTACAAGTAAA 

AGTAAATTTTGAACCGAGTCTTCCCATTTTTACGATCCTCAAGTCTAAATTAAGTATATG 
ATTGATTAATAAAGTCTTTACCATTAGGGTTC 

>G2521 Amino Acid Sequence (domain in AA coordinates: 145-213)

MASLISDIEPPTSTTSDLVRRKKRSSASSAASSRSSASSVSGEIHARWRSEKQQRIYSAK 

LFQALQQTOLNSSASTSSSPTAQKRGKAVREAADRAIAVSARGRTLWSRAIIA^IKLKF 

^08^ miP ^ TTWSSSSNRSRKRRVSVLRLNKKSIp DVNRKVRVLGRLVPGCGK 

QSVPVILEEATDYIQALEMQVRAMNSLVQLLSSYGSAPPPI* 

>G258 (60..983)

^™ CCACCCT6CTGGTOAAT ^ 

TGAGAGAGAAGTGGGAAATGAAAAGAGATGAAATGGGACATCGATGTTGTGGAAAACACA 
AAGTGAAGAGAGGTCTTTGGTCTCCAGAGGAAGACGAGAAGCTTCTTCGTTATATCACCA 
CTCATGGTCATCCTAGTTGGAGTTCCGTTCCAAAGCTTGCCGGGTTGCAGAGATGTOGGA 
AGAGTTGCAGATTAAGGTGGATAAACTATCTAAGGCCTGATCTGAGGAGAGGTTCGTTTA 
ATGAGGAAGAAGAGCAGATTATCATCGACGTACATCGTATTCTTGGTAACAAATGGGCTC 
AGATTGCTAAGCACTTACCTGGACGCACTGATAATGAAGTCAAGAACTTTTGGAACTCAT 
GCAOTAAGAAGAAACTTCTTTCTCAAGGCTTAGATCCTTCTACACATAATCrTATGCCTT 
CACACAAAAGATCTTCTTCTTCAAACAATAATAATATCCCCAAGCCAAACAAAACGACGT 
CCATCATGAAGAACCCTACTGATCTTGATGAATCAACCACTGCTTTTTCAATCACAAACA 
TCAATCCACCCACTTCCACTAAACCAAACAAACTTAAATCTCCTAACCAGACTACAATCC 
CATCTCAAACCGTGATCCCTATCAATGATAACATGTCAAGTACTCAAACCATGATCCCTA 
TCAATGATCCCATGTeAAGTCTTTTAGATGATGAGAATATGATTCCTCACTGGTCAGATG 
TTGATGGAATGGCGATCCACGAAGCTCCGATGTTGCCTAGTGATAAGGCAGTAGTGGGAG 
TGGATGATGATGATCTGAACATGGACATTTTGTTTAACACTCCTTCTTCTTCTGCTTTTG 

ATCCTGATTTTGCTTCCATTTTCTCCTCTGCAATGTCTATCGATTTCAATCCCATGGATG 
ATCTTGGCAGCTGGACCTTTTAGCTTTTACTCTACAGC 

>G258 Amino Acid Sequence (domain in AA coordinates: 24-124)
MREKWEMKRDEMGHRCCGKHK^GLWSPEEDEKLLRYITTHGHPSWSSVPKLAGLQRCG 
KSCRLRWIHYLRPDLRRGSFNEEEEQIIIDVHRILGNKWAQIAKHLPGRTDNEVKNFWNS 
CIKKKLLSQGLDPSTHNLMPSHKRSSSSNWNNIPKPNKTTSIMI^NPTDLDQSTTAFSITN

INPPTSTKPNKLKSPNQTTIPSQTVIPINDNMSSTQTMIPINDPMSSLLDDENMIPHWSD 

VDGMAIHEAPMLPSDKAWGVBDDDLNMDILFNTPSSSAFDPDFASIFSSAMSIDFNPMD 

DLGSWTF* 

>G280 (108..722)

AAGTTAATATGAGAATAATGAGAAAACCACTTTCCCAAATTGCTTTTTAAAATCCCTCCT 
CACACAGATTCCTTCCTTCATCACCTCACACACTCTCTACGCTTGACATGGCCTTCGATC 
TCCACCATGGCTCAGCTTCAGATACGCATTCATCAGAACTTCCGTCGTTTTCTCTCCCAC 
CTTATCCTCAGATGATAATGGAAGCGATTGAGTCCTTGAACGATAAGAACGGCTGCAACA 
AAACGACGATTGCTAAGCACATCGAGTCGACTCAACAAACTCTACCGCCGTCACACATGA 
CGCTGCTCAGCTACCATCTCAACCAGATGAAGAAAACCGGTCAGCTAATCATGGTGAAGA 
ACAATTATATGAAACCAGATCCAGATGCTCCTCCTAAGCGTGGTCGTGGCCGTCCTCCGA 
AGCAGAAGACTCAGGCCGAATCTGACGCCGCTGCTGCTGCTGTTGTTGCTGCCACCGTCG 
TCTCTACAGATCCGCCTAGATCTCGTGGCCGTCCACCGAAGCCGAAAGATCCATCGGAGC 
CTCCCCAGGAGAAGGTCATTACCGGATCTGGAAGGCCACGAGGACGACCACCGAAGAGAC 
CGAGAACAGATTCGGAGACGGTTGCTGCGCCGGAACCGGCAGCTCAGGCGACAGGTGAGC 
GTAGGGGACGTGGGAGACCTCCGAAGGTGAAGCCGACGGTGGTTGCTCCGGTTGGGTGCT 
GAATTAATCGGTACTTATGCAATTTCGGAATCTTTAGTTACTGAAAAATGGAATCTCTTA 
GAGAGTAAGAGAGTGCTTTAATTTAGCTTAATTAGATTTATTTGGATTTCTTTCAGTATT 
TGGATTGTAAACTTTAGAATTTGTGTGTGTGTTGTTGCTTAGTCCTGAGATAAGATATAA 
CATTAGCGACTGTGTATTATTATTATTACTGCATTGTGTTATGTGAAACTTTGTTCTCTT 
GTTGAAAAAAAAAAAAAAAAAAA 

>G280 Amino Acid Sequence (domain in AA coordinates: 97-104, 130-137, 155-162, 185-192)

MAFDLHHGSASDTHSSELPSFSLPPYPQMIMEAIESLNDKNGCNKTTIAKHIESTQQTLP 

PSHMTLLSYHLNQMKKTGQLIMVKNNYMKPDPDAPPKRGRGRPPKQKTQAESDAAAAAVV

AATVVSTDPPRSRGRPPKPKDPSEPPQEKVITGSGRPRGRPPKRPRTDSETVAAPEPAAQ

ATGERRGRGRPPKVKPTVVAPVGC*

>G3 (16..477)

GTTTGTCTTTTATCAATGGAAAGAGAACAAGAAGAGTCTACGATGAGAAAGAGAAGGCAG 
CCACCTCAAGAAGAAGTGCCTAACCACGTGGCTACAAGGAAGCCGTACAGAGGGATACGG 
AGGAGGAAGTGGGGCAAGTGGGTGGCTGAGATTCGTGAGCCTAACAAACGCTCACGGCTT 
TGGCTTGGCTCTTACACAACCGATATCGCCGCCGCTAGAGCCTACGACGTGGCCGTCTTC 
TACCTCCGTGGCCCCTCCGCACGTCTCAACTTCCCTGATCTTCTCTTGCAAGAAGAGGAC 
CATCTCTCAGCCGCCACCACCGCTGACATGCCCGCAGCTCTTATAAGGGAAAAAGCGGCG 
GAGGTCGGCGCCAGAGTCGACGCTCTTCTAGCTTCTGCCGCTCCTTCGATGGCTCACTCC 
ACTCCGCCGGTAATAAAACCCGACTTGAATCAAATACCCGAATCCGGAGATATATAGTCA 
ATTTATATACATGTAGTTTGTTTTGTTTGATTAGAAGATTACATTTACATACAAGATACA 
CATAGATACTGGAAAATATAGGTATGTATACATTCATAAATTATCTTATGTATCAAAGAA 
TTTTATAGATTCTGATTAGCTTTTTGTTTTTGTTTTTG 

CGGAGACAAAACCGGCTAAGAGCAATCCATGAGAAGCTAGCGAGTGTTTTTTAGTTCAAG 
TTGTAATATAAATGCATATTAATTCTTTAGTAATTTTGT 

>G3 Amino Acid Sequence (domain in AA coordinates: 28-95) 
MEREQEESTMRKRRQPPQEEVPNHVATRKPYRGIRRRKWGKWVAEIREPNKRSRLWLGSY
TTDIAAARAYDVAVFYLRGPSARLNFPDLLLQEEDHLSAATTADMPAALIREKAAEVGAR
VDALLASAAPSMAHSTPPVIKPDLNQIPESGDI*
>G343 (1..795) 

ATGGACGTCTATGGeTTATCTTC^CCAGACTTACTTCGAATCGACGACCTTCTTGATTTC 

TCCAACGAAGACATCTTOTCCGCTTCTTCrTCCGGTGGTTCCACCGCCGCTA^ 

TCTTCTTTCCCTCCTCCTCAAAACCCTAGTTTCCACCACCACCATCTCCCTTCCTCCGCC 

GATC^TCACTCCTTCCTCCACGACATTTGCGTTCCCAGTGATGACGCAGCTCATCTTGAA 

TGGCTTTCGCAATTCGTGGACGATTCTTTCGCT^ 

ACTATGACTTCTGTCAAAACTGAAA(m , CCTTTCCGGGGAAACCAAGAAGCAAACGAT^ 
AGAGCTCCTH3CTCCTTTOTCCGGAA(^TGGTCT 

CAGCTTCAOTCCGCCGCCAAATTCAAGCCT^AGAAAGAACAATCCGGCGGAGGAGGAGGA 
GGAGGAGGAAGACATCAGTCATCGTCATCGGAGACTACGGAAGGAGGAGGAATGAGGAGA 
TGTACTCACTGTGCATCGGAGAAAACGCCACAGTGGAGGACAGGACCACTTGGACCTAAA 
ACACTATGTAACGCTTGTGGAGTCCGGTTTAAATCCGGTAGACTTGTACCGGAATATAGA
CCGGCTTCGAGTCCTACTTTTGTTTTGACTCAGCATTCAAACTCTCACCGGAAAGTGATG 
GAGCTTCGACGGCAGAAAGAAGTTATGAGACAACCACAACAAGTTCAACTTCATCACCAC 

CACCACCCGTTTTAG 

>G343 Amino Acid Sequence (domain in AA coordinates: 178-214) 

MDVYGLSSPDLLRIDDLLDFSNEDIFSASSSGGSTAATSSSSFPPPQNPSFHHHHLPSSA 

DHHSFLHDICVPSDDAAHLEWLSQFVDDSFADFPANPLGGTMTSVKTETSFPGKPRSKRS 

RAPAPFAGTWSPMPLESEHQQLHSAAKFKPKKEQSGGGGGGGGRHQSSSSETTEGGGMRR 

CTHCASEKTPQWRTGPLGPKTLCNACGVRFKSGRLVPEYRPASSPTFVLTQHSNSHRKVM

ELRRQKEVMRQPQQVQLHHHHHPF*

>G363 (1..780) 

ATGAGACCAATATTAGACCTCGAAATTGAAGCTTCATCGGGCAGTAGTAGCAGCCAAGTG 
GCCTCAAACTTGTCTCCGGTTGGGGAAGATTACAAACCAATCTCGCTGAATCTTAGCCTC 
AGTTTCAACAACAACAACAACAATAATCTGGATCTTGAATCATCGTCTTTGACGCTGCCA 
CTTTCGAGCACGAGTGAGAGTAGTAACCCGGAGCAGCAGCAGCAACAACAACCATCTGTA 
TCAAAGAGAGTCTTCTCTTGTAACTACTGCCAAAGGAAGTTCTATAGCTCTCAAGCGCTA 
GGTGGTCACCAAAACGCTCACAAACGTGAGAGAACACTCGCCAAACGCGCTATGCTATGG 
GTCTTGCTGGGGTCTTCCCCGGTAGAGGATCAAGTAGCAATTATGCGGCTGCTGCCACAG 
CAGCCGCTCTCGTGTTTGCCGCTTCACGGAAGCGGAAACGGGAACATGACATCGTTCAGG 
ACTTTGGGAATCCGGGCACATTCCTCGGCGCACGACGTCAGCATGACAAGGCAGACACCA 
GAAACACTTATTAGAAACATTGCCAGGTTCAACCAGGGGTATTTCGGTAATTGTATACCT 
TTTTACGTGGAGGACGACGAGGCCGAGATGCTCTGGCCGGGGAGTTTCCGGCAAGCTACG 
AATGCGGTTGCGGTTGAAGCGGGTAATGATAATTTAGGTGAAAGAAAAATGGATTTCTTG 
GACGTCAAGCAAGCGATGGATATGGAAAGTTCTCTTCCAGATCTAACCTTGAAGCTTTGA 
>G363 Amino Acid Sequence (domain in AA coordinates: 87-108) 
MRPILDLEIEASSGSSSSQVASNLSPVGEDYKPISLNLSLSFNNNNNNNLDLESSSLTLP

LSSTSESSNPEQQQQQQPSVSKRVFSCNYCQRKFYSSQALGGHQNAHKRERTLAKRAMLW 
VLLGSSPVEDQVAIMRLLPQQPLSCLPLHGSGNGNMTSFRTLGIRAHSSAHDVSMTRQTP 
ETLIRNIARFNQGYFGNCIPFYVEDDEAEMLWPGSFRQATNAVAVEAGNDNLGERKMDFL
DVKQAMDMESSLPDLTLKL* 
>G370 (1..774) 

ATGGACGAAACCAACGGACGAAGAGAAACTCACGATTTCATGAACGTCAACGTTGAATCC

TTCTCTCAGCTTCCTTTCATCCGCCGTACTCCTCCCAAAGAAAAAGCCGCCATTATTCGT 

CTCTTCGGCCAAGAGCTCGTCGGTGATAACTCCGACAACTTATCCGCAGAACCTTCTGAT

CATCAAACCACTACCAAGAACGATGAGAGCTCTGAGAATATCAAGGACAAAGACAAAGAA 

AAAGATAAGGACAAAGACAAAGATAACAACAACAACAGGAGATTCGAGTGTCACTACTGC 

TTCAGAAACTTCCCAACTTCTCAAGCCCTAGGTGGACATCAAAACGCTCACAAACGTGAA 

CGTCAACACGCCAAACGCGGTTCCATGACATCATACCTTCATCATCATCAGCCTCATGAC 

CCTCACCACATCTACGGCTTCCTCAACAACCACCACCACCGTCACTATCCGTCTTGGACG 

ACGGAAGCTAGATCATACTACGGCGGAGGGGGACATCAAACGCCGTCGTACTACTCAAGG 

AATACTCTTGCTCCTCCTTCTTCTAACCCACCGACAATCAACGGAAGTCCTTTAGGTTTG 

TGGCGTGTACCGCCTTCC^CGTCAACAAATACTATTCAAGGCGTTTACTCATCTTCACCA 

GCTTCAGCGTTTAGGTCGCATGAGCAAGAGACTAATAAGGAGCCTAATAACTGGCCGTAC 

AGATTGATGAAACCCAATGTGCAAGATCATGTGAGTCTCGATCTTCATCTCTGA 

>G370 Amino Acid Sequence (domain in AA coordinates: 97-117)

MDETNGRRETHDFMNVNVESFSQLPFIRRTPPKEKAAIIRLFGQELVGDNSDNLSAEPSD 

HQTTTKNDESSENIKDKDKEKDKDKDKDNNNNRRFECHYCFRNFPTSQALGGHQNAHKRE

RQHAKRGSMTSYLHHHQPHDPHHIYGFLNNHHHRHYPSWTTEARSYYGGGGHQTPSYYSR
NTLAPPSSNPPTINGSPLGLWRVPPSTSTNTIQGVYSSSPASAFRSHEQETNKEPNNWPY

RLMKPNVQDHVSLDLHL*

>G385 (37..2202)

TAGGGTTTGCITTCAGTTTCCGGAGTA^ 

GCGGCTATGAACAACGCAGACAGCAATAACCACAACTACAACCACGAA^ 

GAAGGATTTCTTCGGGACGATGAATTCGACAGTCCGAATACTAAATCGGGAAGTGAGAAT 

CAAGAAGGAGGATCAGGAAACGACCAAGATCCTCTTCATCCTAACAAGAAGAAACGATAT 

CATCGACACACCCAACTTCAGATCCAGGAGATGGAAGCGTTCTTCAAAGAGTGTCCTCAC 

CCAGATGACAAGCAAAGGAAACAGCTAAGCCGTGAATTGAATTTGGAACCTCTTCAGGTC 
AAATTCTGGTTCCAAAACAAACGTACCCAAATGAAGAATCATCACGAGCGGCATGAGAAC

TCACATCTTCGGGCGGAGAACGAAAAGCTTCGAAACGACAACCTAAGATATCGAGAGGCT 

CTTGCAAATGCTTCGTGTCCTAATTGTGGTGGTCCAACAGCTATCGGAGAAATGTCATTC 

GACGAACACCAACTCCGTCTCGAAAATGCTCGATTAAGGGAAGAGATCGACCGTATATCC

GCAATCGCAGCTAAATACGTAGGCAAGCCAGTCTCAAACTATCCACTTATGTCTCCTCCT 

CCTCTTCCTCCACGTCCACTAGAACTCGCCATGGGAAATATTGGAGGAGAAGCTTATGGA 

AACAATCCAAACGATCTCCTTAAGTCCATCACTGCACCAACAGAATCTGACAAACCTGTC 

ATCATCGACTTATCCGTGGCTGCAATGGAAGAGCTCATGAGGATGGTTCAAGTAGACGAG 

CCTCTGTGGAAGAGTTTGGCTTTAGACGAAGAAGAATATGCAAGGACCTTTCCTAGAGGG 

ATCGGACCTAGACCGGCTGGATATAGATCAGAAGCTTCGCGAGAAAGCGCGGTTGTGATC 

ATGAATCATGTTAACATCGTTGAGATTCTCATGGATGTGAATCAATGGTCGACGATTTTC 

GCGGGGATGGTTTCTAGAGCAATGACATTAGCGGTTTTATCGACAGGAGTTGCAGGAAAC 

TATAATGGAGCTCTTCAAGTGATGAGCGCAGAGTTTCAAGTTCCATCTCCATTAGTCCCA 

ACACGTGAAACCTATTTCGCACGTTACTGTAAACAACAAGGAGATGGTTCGTGGGCGGTT 

GTCGATATTTCGTTGGATAGTCTCCAACCAAATCCCCCGGCTAGATGCAGGCGGCGAGCT 

TCAGGATGTTTGATTCAAGAATTGCCAAATGGATATTCTAAGGTGACTTGGGTGGAGCAT 

GTGGAAGTTGATGACAGAGGAGTTCATAACTTATACAAACACATGGTTAGTACTGGTCAT 

GCCTTCGGTGCTAAACGCTGGGTAGCCATTCTTGACCGCCAATGCGAGCGGTTAGCTAGT 

GTCATGGCTACAAACATTTCCTCTGGAGAAGTTGGCGTGATAACCAACCAAGAAGGGAGG 

AGGAGTATGCTGAAATTGGCAGAGCGGATGGTTATAAGCTTTTGTGCAGGAGTGAGTGCT 

TCAACCGCTCACACGTGGACTACATTGTCCGGTACAGGAGCTGAAGATGTTAGAGTGATG 

ACTAGGAAGAGTGTGGATGATCCAGGAAGGTCTCCTGGTATTGTTCTTAGTGCAGCCACT 

TCTTTTTGGATCCCTGTTCCTCCAAAGCGAGTCTTTGACTTCCTCAGAGACGAGAATTCA 

AGAAATGAGTGGGATATTCTGTCTAATGGAGGAGTTGTGCAAGAAATGGCACATATTGCT 

AACGGGAGGGATACCGGAAACTGTGTTTCTCTTCTTCGGGTAAATAGTGCAAACTCTAGC 

CAGAGCAATATGCTGATCCTACAAGAGAGCTGCATTGATCCTACAGCTTCCTTTGTGATC 

TATGCTCCAGTCGATATTGTAGCTATGAACATAGTGCTTAATGGAGGTGATCCAGACTAT 

GTGGCTCTGCTTCCATCAGGTTTTGCTATTCTTCCTGATGGTAATGCCAATAGTGGAGCC 

CCTGGAGGAGATGGAGGGTCGCTCTTGACTGTTGCTTTTCAGATTCTGGTTGACTCAGTT 

CCTACGGCTAAGCTGTCTCTTGGCTCTGTTGCAACTGTCAATAATCTAATAGCTTGCACT 

GTTGAGAGAATCAAAGCTTCAATGTCTTGTGAGACTGCTTGAAAACCATCCATTAGC 

>G385 Amino Acid Sequence (domain in AA coordinates: 60-123) 

MFEPlMLLAAMimADSNiraNY^ 

HPNKKKRYHRHTQLQIQEMEAFFKECPHPDDKQRKQLSRELNLEPLQVKFWFQNKRTQMK
NHHERHENSHLRAENEKLRNDNLRYREALANASCPNCGGPTAIGEMSFDEHQLRLENARL
REEIDRISAIAAKYVGKPVSNYPLMSPPPLPPRPLELAMGNIGGEAYGNNPNDLLKSITA
PTESDKPVIIDLSVAAMEELMRMVQVDEPLWKSLALDEEEYARTFPRGIGPRPAGYRSEA
SRESAVVIMNHVNIVEILMDVNQWSTIFAGMVSRAMTLAVLSTGVAGNYNGALQVMSAEF
QVPSPLVPTRETYFARYCKQQGDGSWAVVDISLDSLQPNPPARCRRRASGCLIQELPNGY
SKVTWVEHVEVDDRGVHNLYKHMVSTGHAFGAKRWVAILDRQCERLASVMATNISSGEVG
VITNQEGRRSMLKLAERMVISFCAGVSASTAHTWTTLSGTGAEDVRVMTRKSVDDPGRSP
GIVLSAATSFWIPVPPKRVFDFLRDENSRNEWDILSNGGVVQEMAHIANGRDTGNCVSLL
RVNSANSSQSNMLILQESCIDPTASFVIYAPVDIVAMNIVLNGGDPDYVALLPSGFAILP
DGNANSGAPGGDGGSLLTVAFQILVDSVPTAKLSLGSVATVNNLIACTVERIKASMSCET
A* 

>G439 (128..967)

TATAAATCTTCGTTTCTACTTTTTTTTCTTCCATAATATAGTCAATTCGTTTT 

AGGGCTTCTTCTCTTTGTTTCTCCAATCTTTATTAGTTTATTTATT^ 

TATACAAATGGCAATGGCTTTAAACATGAATGCTTACGTAGACGAGTTCATGGAAGCTCT

TGAACCATTCATGAAGGTAACTTCATCTTCTTCTACTTCGAATTCATCAAATCCAAAACC 

ATTAACTCCTAATTTCATCCCTAATAATGACC^^ 

TCCGATTGGGCTAAACCAGCTCACTCCAACAC^^ 

TCTCCGGCAAAACCAATCTCGTCGTCGCGCTGGTAGTCATCTTCTC^ 

CTCAATGAAGAAAATCGACGTAGCAACTAAACCGGTTAAACTATACCGAGGCGTAAGACA 

GAGGCAATGGGGTAAATGGGTAGCTGAGATTCGGCTACCTAAAAACCGAACCCGGTTATG 

GCTCGGTACGTTCGAAACGGCTCAAGAAGCTGCATTAGCTTACGATCAAGCAGCTCATAA 

GATCAGAGGAGACAACGCTCGTCTCAATTTCCC&GACATTC 
ACAGATATTGTCTCCGTCTATCAACGCAAAGATCGAATCCATCTGCAATAGTTCTGATCT
TCCACTGCCTCAGATCGAGAAACAGAACAAAACAGAGGAGGTGCTCTCTGGTTTTTCCAA 
ACCGGAGAAAGAACCGGAATTTGGGGAGATATACGGATGCGGATACTCGGGCTCATCTCC 
TGAGTCGGATATAACGTTGTTGGATTTCTCAAGCGACTGTGTGAAAGAAGATGAGAGTTT 
CTTGATGGGTTTGCACAAGTATCCTTCTTTGGAGATTGATTGGGACGCTATAGAGAAACT 
CTTCTGAATCCATTTTATCTTTTTGATTCATTTGTCTCTAAATTGTAGAATTTTATTTTC 
AGAGCTTTGTAAGGGAAGTTCTTGAATGAGAGTTGCAGAGGACTAGTGGAACCTAACTCT 
GTTTTCTTTTGTAAGTATTGTTTATAATGGGCCGTTGAATGGGCCTTATTGATTTAAACA 
GCCCAAGTTTTTAAAAAAAAAAAAAAAAAAAAAAAAA 

>G439 Amino Acid Sequence (domain in AA coordinates: 110-177) 

MAHALNMNAYVDEFMEALEPFMKVTSSSSTSNSSNPKPLTPNFIPNNDQVLPVSNQTGPI

GLNQLTPTQILQIQTELHLRQNQSRRRAGSHLLTAKPTSMKKIDVATKPVKLYRGVRQRQ 

WGKWVAEIRLPKNRTRLWLGTFETAQEAALAYDQAAHKIRGDNARLNFPDIVRQGHYKQI

LSPSINAKIESICNSSDLPLPQIEKQNKTEEVLSGFSKPEKEPEFGEIYGCGYSGSSPES 

DITLLDFSSDCVKEDESFLMGLHKYPSLEIDWDAIEKLF*

>G440 (237..1301)

AAAAAATCACTGTTTCATAACACGTTTTTCTCTCTCACCCACCAAAAAAAAATCTTTTGT 

TCTTGTTACCAAAAAATCTCGTGATAAATCTCTTCAAACTTTGTTTTATTTTCTTCTTGA 

TTCTCTCGAAATCTCTCTCAACAAACCCAGAAACTTTCCTTGATTCGCAAGCTTTTCTTC 

CTTTTATATTCTTCATTTTGATGCGAATATAGAGAGAGTCCATAAAAGAAACAGTAATGG 

ACGAATATATTGATTTCCGACCATTGAAGTACACAGAGCACAAGACTTCAATGACTAAAT 

ACACCAAAAAGTCATCGGAAAAACTTTCCGGTGGTAAGTCATTGAAAAAGGTTAGTATTT 

GTTATACTGATCCTGACGCAACAGATTCATCAAGTGACGAAGACGAAGAAGATTTCTTGT 

TTCCTCGCCGGAGAGTCAAAAGATTCGTTAACGAGATCACTGTTGAGCCTAGCTGTAACA 

ACGTCGTCACCGGAGTTTCGATGAAAGATAGAAAGAGACTCTCTTCTTCCTCCGATGAAA 

CTCAATCTCCGGCGTCGAGTCGTCAACGTCCTAATAACAAAGTTTCAGTCTCCGGTCAGA 

TAAAGAAGTTCCGTGGTGTTAGACAACGGCCATGGGGGAAATGGGCGGCGGAGATTAGAG 

ATCCGGAGCAACGTCGGAGGATTTGGCTCGGGACTTTTGAGACGGCGGAGGAAGCTGCCG 

TGGTTTATGATAACGCCGCTATAAGACTCCGTGGACCGGACGCTTTAACTAATTTCTCCA

TACCGCCTCAAGAAGAGGAAGAAGAAGAAGAACCGGAACCGGTTATTGAGGAGAAACCGG 

TTATTATGACGACGCCAACACCAACAACATCGAGTTCTGAATCAACTGAAGAAGATTTAC 

AACATCTCTCATCTCCTACTTCGGTTCTCAATCACCGGTCAGAAGAGATTCAACAAGTAC 

AACAACCGTTTAAATCAGCTAAACCCGAACCGGGGGTTTCAAATGCACCATGGTGGCATA 

CCGGGTTTAATACCGGTTTAGGTGAATCAGACGATTCATTTCCTTTGGATACTCCGTTTC 

TTGACAACTATTTCAATGAATCACCACCAGAGATGTCAATATTTGACCAACCAATGGATC 

AAATTTTCTGTGAAAATGATGATATCTTCAATGATATGTTGTTCTTGGGTGGTGAAACTA

TGAACATTGAAGATGAGTTAACAAGTTCTAGTATCAAAGATATGGGTTCAACGTTTAGTG 

ATTTTGATGATTCATTGATATCAGATCTATTAGTTGCTTAATATGATGATGAGAGTGAAG 

AAGAAACCATCAAGCAAATATCTATGGTGTGACTGATUU^TTTTGGTGTTACTTTTTTTT 

CTTTCATAAGTTCATGAGCTTTTTTGTTTCTTTTTTTTAATAATTTATTTAGTTTTGTCA 

GGAGCTTGTAAAACAGTTTTGGAGAAATAGTGGAAAAATAGTTTAATTAAAAAAAAAAAA 

AAAAAAA 

>G440 Amino Acid Sequence (domain in AA coordinates: 122-189) 
MDEYIDFRPLKYTEHKTSMTKYTKKSSEKLSGGKSLKKVSICYTDPDATDSSSDEDEEDF 

LFPRRRVKRFVNEITVEPSCNNVVTGVSMKDRKRLSSSSDETQSPASSRQRPNNKVSVSG

QIKKFRGVRQRPWGKWAAEIRDPEQRRRIWLGTFETAEEAAVVYDNAAIRLRGPDALTNF 

SIPPQEEEEEEEPEPVIEEKPVIMTTPTPTTSSSESTEEDLQHLSSPTSVLNHRSEEIQQ

VQQPFKSAKPEPGVSNAPWWHTGFNTGLGESDDSFPLDTPFLDNYFNESPPEMSIFDQPM 

DQIFCENDDIFNDMLFLGGETMNIEDELTSSSIKDMGSTFSDFDDSLISDLLVA* 

>G5 (417..1421)

TTTTTTTTTTGCAATCTCCCCCTAATCTGTTGTTTCTCGCTTCTTCTTCTGTTAATCA 

TGTCTTTCAAAAAGAAAGAAAAAAGAAAAATTCGATTTCT 
GAAAAAAATCAAGCTTATGAATTTGTGTTTAATC^ 

TTTTCAGAACGAGATCGTTTTTTCAAATTTCTTCTGATTTTACCTCTTm 
GATTTTAGTGAATCGAGGGTGAAATTTTTGATTCCCTCTTTTCGGATCTACAC^GAGGTT 
GCTTATTTCAAACCTTTTAGATCCATTTTTTTTTAATTTTCTCGGAAAAATCCCTC 
TTTACTTTTTTATAAGTCTCAGGTTCAATTTTTTCGGATTCAAATTTTTAT^ 
TGGACGCGCTTGTACCTTTTATCAAAAGCGTTTCCGATTOTCCTTCTTCTOCTOOTGCAO

CGTCTGCGTCTGCGTTTCTTCACCCCTCTGCGTTTTCTCTCCCTCCTCTCCCC^^ATT 

ACCCGGATTCAACGTTCTTGACCCAACCGTTTTCATACGGGTC^^ 

GGTCATTAATCGGACTCAACAACCTCTCTTCrrCTCAGATCCACCAGATC^^ 

TCCATCATCCTCTTCCTCCGACGCATCACAACAACAACAACTCTTOCTCG^TC^CTCA 

gcccaaagccgttactgatgaagcaatctggagtcgctggatSgStc^^ 

CAGGTGTTCCTTCGAAGCCGACGAAGCITTACAGAGGTGTGAGGCAAcS 

acacggcggaggaagctgcgttggcctatgataaggcggcgtacaagctgcgcggcIS? 
tcgcccggcttaacttccctaacctacgtcataacggatttcacS 

GTGAATATAAACCTCTTCACTCCTCAGTCGACGCTAAGCTTGAAGCTATTOCTAMArra 
TCGCGGAGACTCAGAAACAGGACAAATCGACGAAA^^ 

™* GGCCAGATC ^^ 

gatctccaccggtgacggagtttgaagagtccaccgctggatcttcgccgSgtcgS^^ 

TGACGTTTOCTGACCCGGAGGAGCCGCCGCAGTGGAACGAGACGTTCTCGTTC 

GCCGCTTGCAATGGAGTTTTTGTGAAATTGCATGACTGGCCCAAGAGTAATTAATTAAAT 

SI G ™ TCCAGCTTGCGGTOTTOTCTCAGGCTCGACG ATGCCACAG^^^^ 

yypdstfltqpfsygsdlqqtgsligl^sssqihqiqsS^pSSS 

>G550 (1..1374) 

A ™™ TGATCCGGCGATTAA ^^ 

^ GA r CTOCraCTAGC TATACCGGATTTTTAACCGAAACTCAGS^ 

tcagattcgtgtaccggcgatgatgatgatgaagagatgggtgattccggS^ 

GAAG ^ GGTGATGATG ^^ 

aaagatagtgagtgtcaggaagagtcattgaggaatga^ 

A S^ GGTATAACTCA ^ 

CCGCGATGTAACAGCATGGAAACCAAGTTCTGTTACTACAACAACTATAATG^^ 

cctcgccatttctccaagaaatgtcagagatato^acagctggSg^ 

^™r GGTGCTGGGAGACGTAAGAAT ^ 

aatggtgcaaatcttctcacttttggctctgattctgtgctttgtgaatctatcgct^ 

CCG ™ CAAAAGTOCCATGCT ^ ccaggacg ^caccaacttggccttaS 
g^ttocgtggacgatitoaccgwttaccctccaccggcitactggagS 

S™ A ^ CGGGGGCATGGAACAGCCTCACATGGA TGCC^^ 
VPVGAGRRKNKSPASHYNRHVSITSAEAMQKVARTDLQHPNGANLLTFGSDSVLCESMAS

GLNLVEKSLLKTQTVLQEPNEGLKITVPLNQTNEEAGTVSPLPKVPCFPGPPPTWPYAWN 

GVSWTILPFYPPPAYWSCPGVSPGAWNSFTWMPQPNSPSGSNPNSPTLGKHSRDENAAEP 

GTAFDETESLGREKSKPERCLWVPKTLRIDDPEEAAKSSIWETLGIKKDENADTFGAFRS 

STKEKSSLSEGRLPGRRPELQANPAALSRSANFHESS* 

>G670 (28..1152)

CACAGCATTGCAGCTGTGAATAACTAAATGGGGAGACATTCTTGCTGTTACAAACAAAAG 

CTGAGGAAAGGGCTTTGGTCTCCTGAAGAAGACGAGAAGCTTCTTACTCACATCACCAAT 

CACGGCCATGGCTGCTGGAGCTCTGTCCCTAAACTCGCTGGTTTGCAGAGATGTGGGAAG 

AGTTGTCGACTCGAGCAGATCTGGTACCGCCGACTAAGATGGATCAATTACTTGAGACCT 

GATTTAAAGAGAGGAGCTTTTTCTCCTGAAGAAGAGAATCTCATCGTCGAACTTCATGCC 

GTCCTTGGAAACAGATGGTCACAGATTGCGTCAAGGCTTCCGGGTAGAACCGACAACGAG 

ATCAAGAATCTATGGAACTCAAGCATCAAGAAGAAACTGAAACAAAGAGGCATTGACCCA 

AACACACACAAGCCCATCTCTGAAGTGGAGAGTTTTAGCGACAAAGACAAACCAACAACA 

AGCAACAACAAAAGAAGCGGTAACGATCACAAGTCTCCTAGTTCCTCTTCTGCGACTAAC 

CAAGACTTCTTCCTCGAAAGGCCATCTGATTTATCCGACTACTTCGGATTTCAGAAGCTT 

AACTTCAACTCCAATCTAGGACTCTCTGTTACAACTGATTCTTCACTCTGCTCGATGATT 

CCGCCGCAGTTTAGCCCCGGGAACATGGTTGGTTCTGTCCTTCAGACACCAGTATGCGTA 

AAGCCCTCGATTAGTCTTCCTCCCGACAACAACAGTTCGAGTCCTATCTCCGGAGGAGAT 

CATGTGAAATTGGCTGCACCAAACTGGGAATTTCAGACAAACAACAATAATACCTCAAAT 

TTCTTCGACAATGGCGGATTCTCATGGTCTATCCCAAATTCTTCTACTTCTTCTTCACAA 

GTCAAACCAAATCATAACTTCGAAGAAATAAAATGGTCAGAGTATTTGAACACACCGTTC 

TTCATAGGGAGTACTGTACAGAGTCAAACCTCTCAACCAATCTACATCAAATCAGAAACA 

GATTACTTAGCCAATGTTTCAAACATGACAGATCCTTGGAGCCAAAACGAGAACTTGGGC 

ACAACTGAAACTAGTGACGTGTTCTCCAAGGATCTTCAGAGAATGGCCGTCTCTTTTGGT 

CAGTCCCTTTAGCTTTTTTCTTTCTTTCTTTCTTATTTCTAACAGATGTAGAGAACATAA 

AGATATACAAATACATACAATGTCAATACGTACAGTGGATTTAAGTGTTCTGTATATTTC 

ATGGGCGAGCTGTCTTTATTTTTATGTTTAAAAAAAAAAAAAAAAAA

>G670 Amino Acid Sequence (domain in AA coordinates: 14-122) 

MGRHSCCYKQKLRKGLWSPEEDEKLLTHITNHGHGCWSSVPKLAGLQRCGKSCRLEQIWY

MILRWINYLRPDLKRGAFSPEEENLIVELHAVLGNRWSQIASRLPGRTDNEIKNLWNSSI 

KKKLKQRGIDPNTHKPISEVESFSDKDKPTTSNNKRSGNDHKSPSSSSATNQDFFLERPS

DLSDYFGFQKLNFNSNLGLSVTTDSSLCSMIPPQFSPGNMVGSVLQTPVCVKPSISLPPD 

NNSSSPISGGDHVKLAAPNWEFQTNNNNTSNFFDNGGFSWSIPNSSTSSSQVKPNHNFEE

IKWSEYLNTPFFIGSTVQSQTSQPIYIKSETDYLANVSNMTDPWSQNENLGTTETSDVFS 
KDLQRMAVSFGQSL*

>G760 (175..1878)

TGCTTAATTCCAATGCCATCGTGATCGATTCATCTCTCTCTCTCTCTTCCAATTTTCCCA 

ATTCrTTTTTAAAACCCTAATTTTTCAGATATCTGATTATCTCTTGTATTTCTTCTACTC 

GATTTGCTCCC^TAAAAACCCTTACTTTCTTCAAGTTCTGGTTTTCACCGATTGATGGGT 

CGTGGCTC^GTGACGTCGCTTGCTCCTGGGTTCCGITTTC^CCCGACGGATGAGGAACTT 

GTTCGCTACTACCTTAAGCGTAAGGTCTGCAACAAACCCTTTAAGTTCGATGCTATTTCC 

GTCACCGACATATACAAGTCTGAGCCTTGGGATCTACCAGATAAGTCGAAGCTGAAAAGT 

AGAGACTTGGAATGGTACTTCTTTAGTATGCTGGATAAGAAGTACAGTAATGGTTCCAAG 

ACGAATCGTGCTACGGAGAAAGGGTATTGGAAGACGACTGGGAAAGATCGGGAGATTCGT 

AATGGTTCAAGAGTCGTTGGGATGAAGAAGACACTTGTTTATCACAAGGGTCGAGCTCCT 

CGTGGTGAAAGGACCAATTGGGTTATGCATGAGTATCGGCTTTCTGATGAGGACTTGAAG 

AAAGCTGGTGTGCCACAAGAAGCATATGTGTTATGTAGGATATTCCAGAAAAGTGGTACG

GGTCCTAAGAATGGGGAGCAGTATGGTGCTCCTTATCTTGAGGAGGAGTGGGAAGAAGAT 

GGAATGACTTATGTACCTGCTCAAGATGCTTTCAGTGAAGGATTGGCTTTGAATGATGAT

GTTTATGTCGATATTGATGACATTGACGAGAAGCCCGAAAATCTGGTGGTCTATGATGCC 

GTTCCTATTCTACCTAACTATTGTCATGGGGAATCAAGTAACAATGTT 

TACTCAGACTCTGGAAATTACATTCAACCAGGAA 

■ITTGAACAACCAATTGAAACTTTTC 

ATTCAGCCTTGTTCTCTGTTTCC71GAGGAACAAATTGGCTGTGGTGTGCAAGACGAAAAT 
GTGGTGAATCTGGAATCTTCCAACAATAATGTGTTTGTAGCTGATACATGCTACAGTGAC 
ATTCCTATTGATCATAACTATTTACCCGATGAGCCATTCATGGATCCTAATAACAATCTT 
CCACTCAACGATGGTCTGTACCTGGAAACGAATGATCTCAGCTGTGCTCAACAAGATGAT

TTTAACTTCGAAGATTATCTCAGCTTCTTTGATGATGAGGGTTTGACTTTTGACGATTCT 

CTATTAATGGGACCTGAAGATTTTCTTCCCAACCAAGAAGCCCTTGACCAGAAACCTGCC 

CCTAAAGAATTGGAGAAGGAGGTCGCAGGAGGCAAAGAGGCAGTGGAGGAAAAGGAAAGT

GGCGAAGGATCTTCTTCAAAACAAGATACAGATTTCAAGGACTTTGATTCAGCTCCGAAG 

TACCCATTTCTCAAAAAGACGAGCCACATGCTTGGAGCCATTCCTACTCCATCTTCATTT 

GCTTCACAGTTCCAAACAAAGGACGCAATGCGTCTACACGCAGCACAATCTTCTGGTTCA 

GTTCACGTGACTGCAGGTATGATGAGAATATCAAACATGACTCTAGCAGCGGACAGCGGT 

ATGGGCTGGTCATATGACAAGAACGGTAACCTCAACGTAGTCCTTTCATTCGGGGTAGTC 

CAACAGGATGATGCGATGACTGCCTCGGGAAGCAAGACAGGAATTACGGCGACAAGAGCT 

ATGTTAGTCTTCATGTGTTTATGGGTTCTCCTACTCTCTGTTAGCTTCAAAATAGTAACC 

ATGGTGTCTGCTCGGTAATAGGATCAAAGTTGAATCGTCTCAAAGACTTTTTTTGGTGTT 

TGTACCTCTCCAATCATATAGCCTTTAACTTTGGCAGTGCTTTGCTGCTCAATATTTAAA 

TTTTAAAAAAAAAAAAAAAAA 

>G760 Amino Acid Sequence (domain in AA coordinates: 12-156) 
MGRGSVTSLAPGFRFHPTDEELVRYYLKRKVCNKPFKFDAISVTDIYKSEPWDLPDKSKL
KSRDLEWYFFSMLDKKYSNGSKTNRATEKGYWKTTGKDREIRNGSRVVGMKKTLVYHKGR
APRGERTNWVMHEYRLSDEDLKKAGVPQEAYVLCRIFQKSGTGPKNGEQYGAPYLEEEWE 

EDGMTYVPAQDAFSEGLALNDDVYVDIDDIDEKPENLVVYDAVPILPNYCHGESSHNVES
GNYSDSGNYIQPGNWWDSGGYFEQPIETFEEDRKPIIREGSIQPCSLFPEEQIGCGVQD 
ENVVNLESSNNNVFVADTCYSDIPIDHNYLPDEPFMDPNNNLPLNDGLYLETNDLSCAQQ
DDFNFEDYLSFFDDEGLTFDDSLLMGPEDFLPNQEALDQKPAPKELEKEVAGGKEAVEEK 
ESGEGSSSKQDTDFKDFDSAPKYPFLKKTSHMLGAIPTPSSFASQFQTKDAMRLHAAQSS 
GSVHVTAGMMRISNMTLAADSGMGWSYDKNGNLNVVLSFGVVQQDDAMTASGSKTGITAT

RAMLVFMCLWVLLLSVSFKIVTMVSAR* 

' ^CTTOCATCGTTCTGTCTATTATAAATATATGTCAATTTGGTTTCTAAAAAATTCTACC 
ATTGATTGATTGATTTTTTTTTCTTTAAGAGATGAATTTATTTACAAGAATCTCATCTCG 
GACTAAGAAGGCCAATCTTTACTACGTAACCCTAGTTGCTCTTCTCTGCATCGCTAGCTA 
CCTTCTCGGTATTTGGCAAAACACGGCGGTTAATCCACGCGCCGCCTTCGATGATTCAGA 
CGGTACACCGTGCGAGGGATTCACCAGACCTAATTCTACGAAAGATCTCGACTTCGACGC 
GCATCACAACATTCAAGATCCACCTCCGGTGACGGAAACCGCCGTTAGTTTCCCGTCGTG 
TGCCGCCGCGTTGAGCGAGCACACGCCATGCGAAGACGCGAAGCGATCGTTGAAATTCTC 
GAGGGAGAGATTGGAGTATAGGCAAAGGCATTGTCCCGAGAGAGAAGAAATCTTGAAGTG 
CAGAATTCCGGCGCCGTACGGTTACAAAACGCCGTTCCGATGGCCGGCGAGTCGTGACGT 
GGCGTGGTTCGCTAATGTGCCTCACACGGAGCTTACGGTTGAGAAAAAGAATCAGAATTG 
GGTCCGGTACGAGAATGATCGGTTTTGGTTCCCTGGTGGAGGTACGATGTTTCCACGTGG 
CGCTGATGCTTACATTGATGATATCGGACGGTTGATTGATCTCAGCGACGGCTCTATCCG 
TACAGCCATCGATACCGGTTGCGGGGTGGCTAGCTTCGGTGCATATCTTTTATCAAGAAA 
CATTACAACGATGTCATTTGCACCAAGAGACACACACGAAGCTCAAGTCCAGTTCGCACT 
CGAGCGTGGTGTGCCGGCGATGATCGGAATCATGGCTACAATCCGCCTACCGTACCCTTC 
TAGAGCCTTTGATTTAGCACATTGCTCTCGTTGCCTTATTCCGTGGGGCCAAAACGATGG
GGCTTACTTGATGGAGGTGGATAGGGTTTTAAGACCAGGAGGGTACTGGATACTTTCTGG 
ACCGCCGATTAATTGGCAGAAACGGTGGAAAGGGTGGGAACGGACCATGGATGATTTGAA 
TGCAGAGCAGACTCAGATCGAGCAGGTCGCGAGAAGCTTGTGTTGGAAGAAAGTTGTTCA 
AAGAGATGATCTTGCTATTTGGCAAAAACCCTTTAACCACATTGACTGTAAGAAAACCAG 
AGAGGTTTTGAAAAATCCGGAGTTTTGTCGTCATGATCAAGATCCCGACATGGCCTGGTA 
TACGAAGATGGATTGTTGTTTGACACCATTACCTGAAGTTGATGACGCTGAGGATCTAAA 
GACGGTGGCCGGAGGGAAGGTAGAAAAGTGGCCGGCTAGATTAAACGCGATTCCTCCGAG 
AGTAAACAAAGGCGCTCTCGAGGAAATCACACCTGAAGCTTTCTTGGAGAACACGAAACT 
GTGGAAACAGAGAGTTTCTTATTACAAGAAGTTAGATTACCAGTTGGGTGAAACCGGGAG 
ATACAGAAACTTAGTCGACATGAACGCTTACCTCGGTGGATTCGCGGCGGCTCTAGCGGA 
TGATCCGGTCTGGGTCATGAACGTTGTCCCGGTCGAGGCTAAGCTCAATACGCTCGGTGT 
CATCTACGAGCGTGGTCTAATCGGAACGTATCAAAACTGGTGTGAAGCCATGTCGACGTA 
TCCAAGAACGTATGATTTTATCCATGCTGACTCGGTTTTCACATTGTACCAAGGTCAATG 
TGAACCGGAGGAGATATTGTTGGAGATGGACCGAATTCTTAGACCGGGTGGTGGTGTGAT 
TATAAGAGATGACGTGGACGTTTTGATCAAGGTTAAGGAATTAACCAAAGGATTAGAATG 
GGAAGGTAGAATTGCTGACCACGAGAAGGGTCCTCATGAAAGAGAGAAGATTTACTATGC
GGTGAAACAGTATTGGACCGTTCCTGCGCCTGATGAAGATAAAAACAACACTAGTGCTCT 
CTCCTGATTTTTGAGTTTTTTTTTTCTTACAATGTTTTTTTTTTTTTTTTTTCAATTTTT 
TATACAACAATAAATTCTCAATAATTGTTGTCGCGGCCG 

>G831 Amino Acid Sequence (domain in AA coordinates: 470-591) 

MNLFTRISSRTKKANLYYVTLVALLCIASYLLGIWQNTAVNPRAAFDDSDGTPCEGFTRP 

NSTKDLDFDAHHNIQDPPPVTETAVSFPSCAAALSEHTPCEDAKRSLKFSRERLEYRQRH 

CPEREEILKCRIPAPYGYKTPFRWPASRDVAWFANVPHTELTVEKKNQNWVRYENDRFWF

PGGGTMFPRGADAYIDDIGRLIDLSDGSIRTAIDTGCGVASFGAYLLSRNITTMSFAPRD 

THEAQVQFALERGVPAMIGIMATIRLPYPSRAFDLAHCSRCLIPWGQNDGAYLMEVDRVL 

RPGGYWILSGPPINWQKRWKGWERTMDDLNAEQTQIEQVARSLCWKKVVQRDDLAIWQKP 

FNHIDCKKTREVLKNPEFCRHDQDPDMAWYTKMDCCLTPLPEVDDAEDLKTVAGGKVEKW

PARLNAIPPRVNKGALEEITPEAFLENTKLWKQRVSYYKKLDYQLGETGRYRNLVDMNAY

LGGFAAALADDPVWVMNVVPVEAKLNTLGVIYERGLIGTYQNWCEAMSTYPRTYDFIHAD

SVFTLYQGQCEPEEILLEMDRILRPGGGVIIRDDVDVLIKVKELTKGLEWEGRIADHEKG
PHEREKIYYAVKQYWTVPAPDEDKNNTSALS*
>G864 (503..1534)

TGCAAAAACATTTTCTTGTCTCTCCTCTGCCCAAATTTTTTTTCTTTCCAGGAATATTTC 
CTAGAAAAACCCAAGCAAAGCTTTAACCCCTTCCTCCTCCAAAAGTAGCATCTTCCTCTT 
TTTCTATTTCTCCTTTCCTCTTCTTATCTCTCTCTCGTTTGTGAACGATTCCTTAAGAAT 
ATAACCAAAAGCCCTTTTCTCCTTTCTTCAACTTTCCGGGAAAAATCTTCACGCAGCAAG 
GTTTCTCTCTCGGCTCTCGCAGTGTTTTTCGGGCCTTTTGTTCTTTCTATAAAAAAAAAA 
TTCGCGTCCTTTAAGAAAACTTTTTCCACCTAGAGAAGAAGAAGAGTATCACTCTTGTTG 
TTCAAGTTTCTCTCTTTAATAAAAAATCCATCTTTATTCTTTGTCTTCTTTCCTTTTTGC 
TTTCCCTAATCTCTATGTTATAAACACACAGAGAGAAACAAAGTCACAGTCTCGAGTCAA 
AAACAGAGAATACGAAAGAAAAATGGAAGCGGAGAAGAAAATGGTTCTACCGAGAATCAA 
ATTCACAGAGCACAAAACCAACACGACAACAATCGTATCGGAGTTAACCAACACTCACCA 
AACCAGGATTCTTCGTATCTCAGTCACTGACCCAGACGCTACTGATTCCTCCAGTGACGA
CGAAGAAGAAGAACATCAACGCTTTGTCTCTAAACGCCGTCGTGTTAAGAAGTTTGTCAA 
CGAAGTCTATCTCGATTCCGGTGCTGTTGTTACTGGTAGTTGTGGTCAAATGGAGTCGAA 
GAAGAGACAAAAGAGAGCGGTTAAATCGGAGTCTACTGTTTCTCCGGTTGTTTCAGCGAC 
GACGACTACGACGGGAGAGAAGAAGTTCCGAGGAGTGAGACAGCGTCCATGGGGAAAATG 
GGCGGCGGAGATAAGAGATCCGTTGAAACGTGTACGGCTCTGGTTAGGTACTTACAACAC 
GGCGGAAGAAGCTGCTATGGTTTACGATAACGCCGCTATTCAGCTTCGTGGTCCCGACGC 
TCTGACTAATTTCTCAGTCACTCCGACAACAGCGACGGAGAAGAAAGCCCCACCACCGTC 
TCCGGTGAAGAAGAAGAAGAAGAAAAACAACAAAAGCAAAAAATCCGTTACTGCTTCTTC 
CTCCATCAGCAGAAGCAGCAGCAACGATTGTCTCTGCTCTCCGGTGTCTGTTCTCCGATC 
TCCTTTCGCCGTCGACGAATTCTCCGGCATTTCTTCATCACCAGTCGCGGCCGTTGTAGT 
CAAGGAAGAGCCATCCATGACAACGGTATCTGAAACTTTCTCTGATTTCTCGGCGCCCTT 
GTTCTCAGATGATGACGTGTTCGATTTCCGGAGCTCAGTGGTTCCCGACTATCTCGGCGG 
CGATTTATTTGGGGAAGATCTATTCACGGCGGATATGTGTACGGATATGAACTTCGGATT 
CGATTTCGGATCCGGATTATCCAGCTGGCACATGGAGGACCATTTTCAAGATATCGGGGA
TCTATTCGGGTCGGATCCTCTTTTAGCTGTTTAATAATATTTTAAATAAATAAATAGTTA 
TACCGGCCGTTACTAAACGGAACCGGAGAAAGTTTTGTATACCGGTGACATAAAATCTCG 
GTTATGTTCGTAATCTTTTTTTCTTTGTTATATATAAAAATATGAATGAAACTGAATTAA 
TGTAAGTTAATGGTGATAATTATTAACGTTTTAAGTTTTGAAAAAAAAAAAAAAAAAAAA 

AAAAAAA 

>G864 Amino Acid Sequence (domain in AA coordinates: 119-186)
MEAEKKMVLPRIKFTEHKTNTTTIVSELTNTHQTRILRISVTDPDATDSSSDDEEEEHQR

FVSKRRRVKKFVNEVYLDSGAVVTGSCGQMESKKRQKRAVKSESTVSPVVSATTTTTGEK
KFRGVRQRPWGKWAAEIRDPLKRVRLWLGTYNTAEEAAMVYDNAAIQLRGPDALTNFSVT
PTTATEKKAPPPSPVKKKKKKNNKSKKSVTASSSISRSSSNDCLCSPVSVLRSPFAVDEF

SGISSSPVAAVVVKEEPSMTTVSETFSDFSAPLFSDDDVFDFRSSVVPDYLGGDLFGEDL

FTADMCTDMNFGFDFGSGLSSWHMEDHFQDIGDLFGSDPLLAV* 

>G884 (31..1575)
TTTTTTTTTGTTTGTTAA 

TCGAAGTCCACCGGAGCTCCGTCGCGTCCGACTTTATCTCTTCCTCCACGGCCGTTTAGT 
GAGATGTTCTTTAACGGTGGCGTTGGATTCAGTCCTGGTCCGATGACTCTGGTCTCTAAT

ATGTTCCCTGATTCCGATGAGTTTAGGTCTTTCTCTCAGCTTCTCGCTGGAGCCATGTCT 

TCTCCAGCGACTGCAGCTGCTGCTGCTGCTGCTGCGACGGCTAGTGATTACCAGAGACTT 

GGTGAAGGGACTAATAGCTCTAGTGGTGATGTTGACCCGAGATTCAAGCAAAACAGACCA 

ACCGGTTTGATGATTTCTCAATCTCAATCGCCGTCGATGTTCACCGTACCGCCTGGTTTA 

AGTCCAGCTATGTTGCTCGATTCACCAAGCTTTTTGGGTCTTTTCTCTCCCGTTCAGGGA 

TCATATGGAATGACACATCAGCAAGCTCTAGCTCAAGTCACTGCTCAAGCAGTTCAAGCC 

AATGCCAATATGCAACCACAAACAGAGTACCCTCCTCCCTCTCAAGTTCAATCATTTTCA 

TCGGGTCAAGCGCAGATCCCGACCTCGGCTCCACTACCAGCTCAAAGAGAAACCTCAGAT 

GTAACCATCATAGAGCACAGGTCACAACAGCCTCTAAATGTTGACAAACCAGCTGATGAT 

GGCTATAACTGGCGAAAATATGGGCAAAAGCAAGTTAAAGGTAGCGAGTTTCCACGAAGC 

TATTACAAGTGTACTAATCCAGGATGTCCTGTCAAGAAGAAGGTTGAGAGATCTCTTGAT 

GGACAAGTAACGGAGATTATCTACAAAGGTCAGCACAATCATGAACCTCCTCAAAACACT 

AAGCGAGGTAACAAAGATAACACCGCGAATATAAATGGGAGTTCGATAAATAACAATCGC 

GGGAGTTCTGAATTGGGGGCATCACAGTTTCAAACTAATAGCTCCAACAAGACTAAGAGA 

GAGCAACATGAAGCAGTAAGTCAAGCTACGACAACAGAGCACTTGTCTGAGGCAAGTGAC 

GGTGAAGAAGTTGGTAATGGAGAAACTGATGTGAGAGAGAAAGATGAGAATGAGCCTGAT 

CCCAAGAGAAGAAGTACAGAAGTTCGGATTTCAGAACCAGCTCCTGCTGCTTCACATAGA 

ACTGTGACAGAGCCTAGAATTATTGTCCAAACGACGAGTGAAGTTGATCTTCTAGATGAT 

GGATATAGGTGGCGTAAATATGGACAGAAAGTTGTCAAAGGGAATCCTTATCCGAGGAGC 

TACTACAAGTGCACAACACCAGGATGTGGTGTGAGGAAACATGTAGAGAGAGCAGCAACA 

GATCCAAAAGCTGTAGTAACAACATATGAAGGAAAACATAACCATGACCTTCCCGCTGCT 

AAATCAAGCAGCCATGCCGCTGCAGCGGCACAGTTAAGGCCAGATAATCGACCTGGCGGT 

TTGGCTAACTTAAATCAACAGCAGCAGCAACAGCCCGTTGCGCGGCTAAGGCTTAAAGAA 

GAGCAAACAACTTGAGAGAAGAAAACTCTTGACCGTTTTTCATTACAAAAGCTTTCAAAT 

TCCACTCACACACTTGTCTGAAAAATCTAGCAGTTTGCAGGAAAGAAACAGCTTCAAGAG 

GTTGTAGTTCTTCTATGTTCTGGTGTAAAACTTAAAAGCTTTTTAGGGTTTTCAGATTTC 

TGTTTACTAATACTGTATGTGAATTCTTTTGTACATGAGGAAGAAAATTACAGGGGGATA 

TTTTGTGTTGTATCTTTTGTGTTATTGTTTCAGTAAAAGATAGGTCTTACATTTTGTGTA 

AAAAAAAAAAAAAAAAAAA

>G884 Amino Acid Sequence (conserved domain in AA coordinates: 227-285, 407-465)

MSEKEEAPSTSKSTGAPSRPTLSLPPRPFSEMFFNGGVGFSPGPMTLVSNMFPDSDEFRS 

FSQLLAGAMSSPATAAAAAAAATASDYQRLGEGTNSSSGDVDPRFKQNRPTGLMISQSQS 

PSMFTVPPGLSPAMLLDSPSFLGLFSPVQGSYGMTHQQALAQVTAQAVQANANMQPQTEY 

PPPSQVQSFSSGQAQIPTSAPLPAQRETSDVTIIEHRSQQPLNVDKPADDGYNWRKYGQK 

QVKGSEFPRSYYKCTNPGCPVKKKVERSLDGQVTEIIYKGQHNHEPPQKTKRGNKDNTAN 

INGSSINNNRGSSELGASQFQTNSSNKTKREQHEAVSQATTTEHLSEASDGEEVGNGETD 

VREKDENEPDPKRRSTEVRISEPAPAASHRTVTEPRIIVQTTSEVDLLDDGYRWRKYGQK 

VVKGNPYPRSYYKCTTPGCGVRKHVERAATDPKAVVTTYEGKHNHDLPAAKSSSHAAAAA

QLRPDNRPGGLANLNQQQQQQPVARLRLKEEQTT* 
>G898 (161..772)

GAAAAAAAGATTCAAAAACCCTAGATTTCACAAAATCGATTGGCTGTCTAATTTCTCTCC

GGCGATTTTCCTCGAGTGAAATTCGGCTCAAGGTGATTATAGCGATCATCGAATCAAATT 

GATTGAAGAGGTACAAAGGTTAGTTACTTTGAGCTGAAAGATGAACACGTCAGAGGTGAG 

AGTACCTCGAGGAAATCGACGGAGGAAAGCTGTGATTGATCTGAATGCGGTACCTGTTGA 

TCAAGAAGGGACCTCTGCTTCTGTTAGAACTCTTACGGTGCCTATTACACCGTCTCAGCC 

TGCTCCTACGATGATTGATGTCGATGCTATTGAGGATGATGTTATTGAATCATCCGCTAG 

TGCTTTTGCTGAAGCTAAAAGCAAATCAAGAAATGCACGTCGGAGACCTTTGATGGTTGA

TGTAGAGTCAGGAGGTACGACTAGATTCCCTGCCAACATAAGCAACAAACGCAGAAGGAT 

TCCTTCTAGTGAATCTGTCATCGACTGTGAGCATGCCTCTGTAAATGATGAAGTCAACAT 

GTCTTCGAGAGTGTCTAGATCAAAGGCTCCAGCTCCTCCACCAGAAGAGCCAAAGTTTAC 

ATGTCCAATCTGCATGTGTCCCTTTACGGAGGAGATGTCAACCAAGTGCGGTCACATCTT 

CTGCAAGGGATGTATAAAGATGGCAATATCTCGCCAGGGCAAATGCCCTACTTGTAGGAA 

AAAGGTTACTGCAAAAGAGCTGATTCGAGTTTTCCTTCCAACCACTAGATGAGTGGTCCG 

GCAACATCACCAGCCACCCTGTCTAATGGTTTATCAGACTATCCTCCTATTCACTTTGGA 

AC^TTGAAGGGACTTCGTTGACTTGGTATTTTTGAATATTTTGCTTTGTrGGAAGAGAAA 

TATTCAGTGATCAAGAAGCCAGAAGGCCCTATCATTCGATGGATATCATTGGTAATAACT 
CTTTGTTTTTAGTTGTTGTTCTATGTAATTTAGGTCTCTGCAAACCTCTCAGTCGATACT

CTTCTCTCTTGATAGATGATAAGATATATGGAAAAAAAAATTAATATTGAATCTTTACTA 
AAA 

>G898 Amino Acid Sequence (domain in AA coordinates: 148-185) 

MNTSEVRVPRGNRRRKAVIDLNAVPVDQEGTSASVRTLTVPITPSQPAPTMIDVDAIEDD

VIESSASAFAEAKSKSRNARRRPLMVDVESGGTTRFPANISNKRRRIPSSESVIDCEHAS 

VNDEVNMSSRVSRSKAPAPPPEEPKFTCPICMCPFTEEMSTKCGHIFCKGCIKMAISRQG 
KCPTCRKKVTAKELIRVFLPTTR* 

>G900 (1..648) 

ATGGGGAAGAAGAAGTGCGAGTTATGTTGTGGTGTAGCGAGAATGTATTGTGAGTCAGAT 
CAAGCGAGTTTATGTTGGGATTGTGACGGTAAAGTTCACGGAGCTAATTTTCTGGTGGCG 
AAACACATGCGTTGTCTTCTATGTAGCGCGTGTCAGTCACACACGCCTTGGAAAGCTTCT 
GGGCTGAATCTTGGCCCAACTGTTTCTATCTGTGAGTCTTGTTTAGCTCGTAAGAAGAAT 
AACAACAGCTCCCTCGCCGGGAGGGATCAGAATCTTAACCAAGAAGAAGAGATCATTGGT 
TGTAACGACGGAGCTGAGTCTTATGATGAGGAAAGCGATGAGGATGAAGAAGAAGAAGAA 
GTGGAGAATCAGGTTGTTCCGGCTGCGGTGGAGCAAGAACTTCCGGTGGTGAGTTCGTCG 
TCTTCGGTTAGTAGTGGTGAAGGAGATCAGGTGGTGAAAAGGACGAGACTTGATTTGGAT 
CTTAACCTCTCCGATGAGGAGAACCAATCTAGACCATTGAAAAGATTATCGAGAGACGAA 
GGTTTGTCAAGATCAACTGTTGTGATGAATAGCTCAATCGTGAAATTACACGGAGGGAGG 
AGAAAAGCAGAGGGATGTGATACATCATCGTCGTCTTCGTTTTATTGA 

>G900 Amino Acid Sequence (domain in AA coordinates: 6-28, 48-74) 

MGKKKCELCCGVARMYCESDQASLCWDCDGKVHGANFLVAKHMRCLLCSACQSHTPWKAS 

GLNLGPTVSICESCLARKKNNNSSLAGRDQNLNQEEEIIGCNDGAESYDEESDEDEEEEE 

VENQVVPAAVEQELPVVSSSSSVSSGEGDQVVKRTRLDLDLNLSDEENQSRPLKRLSRDE

GLSRSTVVMNSSIVKLHGGRRKAEGCDTSSSSSFY*

>G913 (108..806)

CATTCAAAAACATCATATATATACACAAACACACTTTGATACAACAAAAAAAAACAGAAC 

ACAAACAAAAACACATTGTAACATTAGTTTAAGCATTAAGCTTCTTTATGTCGAATAATA 

ATAATTCTCCGACGACCGTGAATCAAGAAACGACGACGTCTCGTGAAGTCTCAATCACAT 

TGCCTACTGATCAATCTCCTCAAACCTCACCAGGATCATCTTCTTCTCCTTCACCGAGAC 

CTTCCGGTGGATCACCGGCGAGAAGAACGGCGACTGGATTATCCGGCAAGCACTCTATTT 

TCAGGGGGATTCGACTACGTAACGGAAAATGGGTATCGGAGATTAGAGAGCCACGTAAAA 

CGACAAGAATTTGGCTCGGGACTTATCCGGTACCGGAGATGGCTGCCGCCGCTTACGACG 

TGGCTGCGTTAGCTTTAAAAGGACCCGACGCCGTTTTGAATTTTCCTGGTTTAGCTTTGA 

CTTACGTGGCTCCGGTTTCAAACTCTGCTGCGGATATAAGAGCGGCTGCTAGTAGAGCAG 

CGGAGATGAAGCAACCGGATCAGGGTGGGGATGAGAAGGTATTGGAACCGGTTCAACCCG 

GCAAAGAGGAAGAATTAGAAGAAGTGTCGTGTAACTCGTGTTCGTTGGAGTTTATGGATG 

AGGAAGCGATGTTGAATATGCCGACTTTGTTGACGGAGATGGCTGAAGGGATGTTGATGA 

GTCCACCGAGAATGATGATACATCCGACGATGGAAGATGATTCGCCGGAGAATCATGAAG 

GAGATAATCTTTGGAGTTATAAATGAATCCATTGAAGCTGCTCTCTTTTTTATTGTTTTC 

CGGTCGAATGAGATTTTCCCCCTTTTTTTTTTTCTTTTTGGGTCGCTGTT 

>G913 Amino Acid Sequence (domain in AA coordinates: 62-128) 

MSNNNNSPTTVNQETTTSREVSITLPTDQSPQTSPGSSSSPSPRPSGGSPARRTATGLSG

KHSIFRGIRLRNGKWVSEIREPRKTTRIWLGTYPVPEMAAAAYDVAALALKGPDAVLNFP

GLALTYVAPVSNSAADIRAAASRAAEMKQPDQGGDEKVLEPVQPGKEEELEEVSCNSCSL

EFMDEEAMLNMPTLLTEMAEGMLMSPPRMMIHPTMEDDSPENHEGDNLWSYK*
>G937 (45..1046)

TGGAAAAAGTTTGAGTTTTTAATTCGAATCGAGAAAAAATAAAAATGGGTTCTTTAGGTG
ATGAGCTTAGTTTGGGATCGATCTTTGGGAGAGGAGTTTCGATGAATGTTGTGGCGGTTG 
AGAAAGTTGATGAACATGTTAAGAAGCTTGAAGAAGAGAAGAGAAAGCTCGAAAGTTGTC
AACTTGAGCTTCCTCTGTCTTTGCAGATTTTAAACGATGCGATTTTGTATCTGAAGGATA
AGAGATGTTCAGAGATGGAGACTCAACCATTGTTGAAAGATTTCATTTCTGTTAATAAAC
CTATTCAAGGAGAAAGAGGAATAGAATTGCTGAAAAGAGAGGAGCTAATGAGGGAGAAGA 
AGTTTCAGCAATGGAAAGCTAATGATGATCACACTAGTAAGATCAAGAGCAAGCTTGAGA 
TTAAGAGAAATGAGGAGAAATCTCCTATGTTGTTGATTCCAAAGGTGGAAACTGGTTTAG 
GCCTCGGTTTAAGTTCGAGTTCGATAAGAAGAAAAGGGATTGTTGCCTCATGTGGCTTTA 
CTTCTAACTCTATGCCACAACCACGAACACCAGCAGTACC^ 
AGCAGCAAGCTTTACGGAAGCAAAGAAGGTGTTGGAATCCAGAGTTGCATCGCCGATTTG

TCGATGCATTGCAACAGCTAGGTGGACCGGGAGTGGCAACTCCTAAACAAATTAGAGAAC 

ATATGCAAGAAGAAGGCTTAACCAATGATGAAGTCAAGAGTCATTTACAGAAATACAGGT 

TACACATCAGGAAGCCAAATTCGAATGCGGAGAAACAATCAGCAGTTGTTTTAGGGTTTA 

ACTTGTGGAATTCTTCAGCACAAGATGAAGAAGAGACATGTGAAGGAGGAGAATCATTGA 

AGAGAAGCAATGCGCAATCAGATTCTCCTCAAGGTCCTTTGCAGTTACCGTCTACAACAA 

CAACAACTGGTGGAGATAGTAGCATGGAAGATGTTGAAGATGCTAAGTCTGAGAGCTTTC 

AACTGGAGAGATTGAGATCACCATAAATCTCAAGAAACCAAACTCTTGATCACGGTTTTG 

TTATTTTGGATTCATTACTATATCTATTAGTAGTGAATGAGAACAATAATTATAGAAAGG 

TTTATAGATATATATATAGAGAAAAAGAGAGAGTGAGGATGGTTCAAATTATTTGCAGA 

>G937 Amino Acid Sequence (conserved domain in AA coordinates: 197-246)

MGSLGDELSLGSIFGRGVSMNVVAVEKVDEHVKKLEEEKRKLESCQLELPLSLQILNDAI

LYLKDKRCSEMETQPLLKDFISVNKPIQGERGIELLKREELMREKKFQQWKANDDHTSKI

KSKLEIKRNEEKSPMLLIPKVETGLGLGLSSSSIRRKGIVASCGFTSNSMPQPPTPAVPQ

QPAFLKQQALRKQRRCWNPELHRRFVDALQQLGGPGVATPKQIREHMQEEGLTNDEVKSH 

LQKYRLHIRKPNSNAEKQSAVVLGFNLWNSSAQDEEETCEGGESLKRSNAQSDSPQGPLQ

LPSTTTTTGGDSSMEDVEDAKSESFQLERLRSP*

TACCGTCGACCCACGCGTCCGAGTGTATTCAAAGTCGGAAAGAAACCCTAAAGAAGAGGA 

TTATGGGTGCTGTATCGATGGAGTCGCTTCCTTTAGGTTTCAGATTCAGACCTACCGATG 

AAGAGCTCGTCAATCACTACCTCCGTCTCAAGATCAACGGACGTCACTCCGATGTCCGTG 

TCATCCCTGATATCGATGTCTGCAAATGGGAACCTTGGGATCTTCCTGCTCTCTCGGTGA 

TTAAGACGGATGATCCAGAGTGGTTCTTTTTCTGCCCTCGTGATCGGAAATACCCTAATG 

GTCATCGCTCTAACAGAGCAACTGACTCTGGCTATTGGAAAGCTACTGGTAAAGATCGTA 

GCATCAAGTCTAAGAAGACTTTAATCGGTATGAAGAAGACTCTTGTCTTCTATCGTGGAC 

GAGCTCCTAAAGGTGAGCGGACTAATTGGATTATGCACGAGTATCGTCCCACTCTTAAGG 

ATCTTGATGGCACTTCCCCTGGCCAAAGCCCTTACGTTCTTTGTCGCCTCTTCCACAAGC 

CTGATGATCGGGTTAATGGTGTCAAGTCCGATGAAGCAGCTTTTACGGCCAGCAACAAAT 

ACTCACCTGATGATACATCATCTGATCTTGTTCAAGAAACACCTTCCTCTGATGCTGCTG 

TTGAGAAACCATCAGATTATTCAGGTGGATGCGGTTATGCTCATAGTAATAGTACCGCAG 

ATGGGACAATGATTGAGGCACCTGAAGAGAATCTTTGGTTATCTTGTGACCTTGAAGATC 

AAAAGGCACCACTACCGTGTATGGATTCTATATATGCTGGTGATTTCAGTTACGATGAGA 

TTGGATTCCAATTTCAAGATGGTACCAGCGAACCAGATGTATCACTAACAGAATTGTTGG 

AGGAGGTGTTCAATAACCCTGATGACTTCTCTTGCGAGGAATCGATCAGTCGAGAGAATC 

CAGCAGTCTCACCAAATGGGATATTTTCATCTGCTAAAATGCTGCAGTCTGCAGCACCAG 

AGGATGCTTTCTTCAACGACTTCATGGCTTTCACTGATACAGATGCTGAGATGGCGCAAT 

TGCAGTATGGTTCAGAAGGTGGAGCTTCTGGTTGGCCAAGTGACACTAATTCATACTATA 

GTGATTTGGTTCAGCAAGAGCAAATGATCAATCATAACACAGAGAACAACCTCACAGAAG 

GGAGAGGGATAAAGATCCGGGCTCGACAGCCTCAGAACCGGCAGAGTACAGGATTGATAA 

ACCAGGGTATTGCTCCAAGGAGAATCCGTCTGCAGCTGCAGTCTAACTCTGAAGTAAAAG 

AACGAGAGGAGGTGAATGAAGGACACACTGTTATTCCCGAGGCCAAAGAAGCTGCAGCTA 

AATACTCAGAGAAGAGTGGTTCTTTGGTTAAACCTCAAATAAAGCTCAGGGCGCGGGGAA 

CTATAGGCCAAGTAAAAGGAGAGAGATTTGCAGACGACGAGGTACAGGTGCAGAGCACAA 

AGAGAGAGAGAGAGAGAATCAAATGTAGTTTAATGTAATTAGGGATGATGCAATGTTAGC 

ATGTTTGTGTGTTGTAACTTAAAAACTTATTTAGGAATCTGATAAAAGTTACTGTTGAAA 

AAAGAAAAAAAAAAAAAAAAAAAAAAAAAAAA 

>G960 Amino Acid Sequence (domain in AA coordinates : 13-156) 

MGAVSMESLPLGFRFRPTDEELVNHYLRLKINGRHSDVRVIPDIDVCKWEPWDLPALSVI

KTDDPEWFFFCPRDRKYPNGHRSNRATDSGYWKATGKDRSIKSKKTLIGMKKTLVFYRGR 

APKGERTNWIMHEYRPTLKDLDGTSPGQSPYVLCRLFHKPDDRVNGVKSDEAAFTASNKY 

SPDDTSSDLVQETPSSDAAVEKPSDYSGGCGYAHSNSTADGTMIEAPEENLWLSCDLEDQ 

KAPLPCMDSIYAGDFSYDEIGFQFQDGTSEPDVSLTELLEEVFNNPDDFSCEESISRENP 

AVSPNGIFSSAKMLQSAAPEDAFFNDFMAFTDTDAEMAQLQYGSEGGASGWPSDTNSYYS 

DLVQQEQMINHNTENNLTEGRGIKIRARQPQNRQSTGLINQGIAPRRIRLQLQSNSEVKE

REEVNEGHTVIPEAKEAAAKYSEKSGSLVKPQIKLRARGTIGQVKGERFADDEVQVQSTK 

RERERIKCSLM* 
>G991 (6..533)
GAAAAATGGAAGAAGAAAAGAGATTGGAGCTAAGGCTAGCTCCTCCTTGTCACCAATTCA
CTTCCAACAACAACATCAATGGATCTAAACAAAAAAGCTCGACCAAAGAAACATCATTCC 
TTTCCAATAACAGGGTTGAGGTAGCTCCAGTGGTGGGATGGCCGCCGGTGAGATCATCCC 
GGAGAAACCTAACGGCACAACTAAAGGAGGAGATGAAGAAGAAGGAGAGTGATGAAGAGA 
AGGAATTGTACGTTAAGATCAACATGGAAGGAGTTCCTATAGGAAGAAAAGTCAACCTTT
CAGCTTATAACAACTACCAACAGCTTTCACATGCCGTTGACCAACTCTTCTCTAAGAAAG 
ATTCGTGGGATCTAAACAGACAATACACTTTGGTCTACGAAGACACTGAAGGAGATAAAG 
TTCTGGTCGGGGATGTTCCTTGGGAGATGTTTGTATCTACTGTAAAGAGGTTGCATGTTT 
TAAAGACCTCCCACGCCTTCTCACTCTCACCTAGAAAACATGGCAAGGAATAGAGAGAGG 
TTGGCCAAAATCATCAGTTCGATGGTTTGTTTTTAATGTAATTTTTGTGGAAACTAATGG 
GGTTTGGCTTTGATTTACTGGTTTTCTTTTTCACTTATGTACTAGGTTTTTGCTTGCTAT 
GTTATTTCTTGTTTTGGTTGTAAATATGCTGTTCGTTTAAGAAATCGGGGGTTAGTATGT 
TATCGTGTGTATAAAAATAGTGTAAGCACGTAAGTTGATTACAAAAAAATU^AAAAAAAAA 
AAAAAAAAA 

>G991 Amino Acid Sequence (domain in AA coordinates: 7-14,48-59,82-115,128-164) 

MKEEKRLELRLAPPCHQFTSNNNINGSKQKSSTKETSFLSNNRVEVAPVVGWPPVRSSRR 

NLTAQLKEEMKKKESDEEKELYVKINMEGVPIGRKVNLSAYNNYQQLSHAVDQLFSKKDS

WDLNRQYTLVYEDTEGDKVLVGDVPWEMFVSTVKKLHVLKTSHAFSLSPRKHGKE*
>G748 (98.. 1444) 

CCACGCGTCCGCACTCTCCCAAATCTCTCTTCTTTAACAACAAAAAAAAAATCACAGAGA 
CATAGAGAGAAGAAGACGGAACAGAGGCTCCAAAAAAATGATGATGGAGACTAGAGATCC 
AGCTATTAAGCTTTTCGGTATGAAAATCCCTTTTCCGTCGGTTTTTGAATCGGCAGTTAC 
GGTGGAGGATGACGAAGAAGATGACTGGAGCGGCGGAGATGACAAATCACCAGAGAAGGT 
AACTCCAGAGTTATCAGATAAGAACAACAACAACTGTAACGACAACAGTTTTAACAATTC 
GAAACCCGAAACCTTGGACAAAGAGGAAGCGACATCAACTGATCAGATAGAGAGTAGTGA 
CACGCCTGAGGATAATCAGCAGACGACACCTGATGGTAAAACCCTAAAGAAACCGACTAA 
GATTCTACCGTGTCCGAGATGCAAAAGCATGGAGACCAAGTTCTGTTATTACAACAACTA 
CAACATAAACCAGCCTCGTCATTTCTGCAAGGCTTGTCAGAGATATTGGACTGCTGGAGG 
GACTATGAGGAATGTTCCTGTGGGGGCAGGACGTCGTAAGAACAAAAGCTCATCTTCTCA 
TTACCGTCACATCACTATTTCCGAGGCTCTTGAGGCTGCGAGGCTTGACCCGGGCTTACA 
GGCAAACACAAGGGTCTTGAGTTTTGGTCTCGAAGCTCAGCAGCAGCACGTTGCTGCTCC 
CATGACACCTGTTATGAAGCTACAAGAAGATCAAAAGGTCTCAAACGGTGCTAGGAACAG 
GTTTCACGGGTTAGCGGATCAACGGCTTGTAGCTCGGGTAGAGAATGGAGATGATTGCTC 
AAGCGGATCCTCTGTGACCACCTCTAACAATCACTCAGTGGATGAATCAAGAGCACAAAG 
CGGCAGTGTTGTTGAAGCACAAATGAACAACAACAACAACAATAACATGAATGGTTATGC 
TTGCATCCCAGGTGTTCCATGGCCTTACACGTGGAATCCAGCGATGCCTCCACCAGGTTT 
TTACCCGCCTCCAGGGTATCCAATGCCGTTTTACCCTTACTGGACCATCCCAATGCTACC 
ACCGCATCAATCCTCATCGCCTATAAGCCAAAAGTGTTCAAATACAAACTCTCCGACTCT 
CGGAAAGCATCCGAGAGATGAAGGATCATCGAAAAAGGACAATGAGACAGAGCGAAAACA 
GAAGGCCGGGTGCGTTCTGGTCCCGAAAACGTTGAGAATAGATGATCCTAACGAAGCAGC 
AAAGAGCTCGATATGGACAACATTGGGAATCAAGAACGAGGCGATGTGCAAAGCCGGTGG 
TATGTTCAAAGGGTTTGATCATAAGACAAAGATGTATAACAACGACAAAGCTGAGAACTC 
CCCTGTTCTTTCTGCTAACCCTGCTGCTCTATCAAGATCACACAATTTCCATGAACAGAT 
TTAGAGTTACATATGTATATGTATATATGTATGATTGATTGTATGTATAGATGATACTGG 
AGAATGATGAGTTTTTGAGAATCAAACTCTTTTCTTCTTTCTAGTGATTGCCTTTATTCC 
TTTACATGTTTTGGTTCTCTGTACACTATTTGATTTACCTTTTTTACTTTCTTTCTTCAT 
TTGTCAGGAAATGTTGGAAGATAACATTAATGGTAAAAAGTTGGTGTGGACCGTTGTTGC 
GTTGGCATTTCAAAAAAAAAAAAAAAA 

>G748 Amino Acid Sequence (domain in AA coordinates: 112-140) 
MMMETRDPAIKLFGMKIPFPSVFESAVTVEDDEEDDWSGGDDKSPEKVTPELSDKNNNNC
NDNSFNNSKPETLDKEEATSTDQIESSDTPEDNQQTTPDGKTLKKPTKILPCPRCKSMET
KFCYYNNYNINQPRHFCKACQRYWTAGGTMRNVPVGAGRRKNKSSSSHYRHITISEALEA
ARLDPGLQANTRVLSFGLEAQQQHVAAPMTPVMKLQEDQKVSNGARNRFHGLADQRLVAR
VENGDDCSSGSSVTTSNNHSVDESRAQSGSVVEAQMNNNNNNNMNGYACIPGVPWPYTWN
PAMPPPGFYPPPGYPMPFYPYWTIPMLPPHQSSSPISQKCSNTNSPTLGKHPRDEGSSKK
DNETERKQKAGCVLVPKTLRIDDPNEAAKSSIWTTLGIKNEAMCKAGGMFKGFDHKTKMY
NNDKAENSPVLSANPAALSRSHNFHEQI*
aSagaatcacaagagatggaaaag^

gaagSatcctcatggattatgtccgaactcatggccagggccactgg^ 
SSacSotcctcggc^cagatggtcgttgatagcgaaaagagitccgggaagaaca 

ScScC^GTAAAGAATTACTGGAACACACATCTCAGCAAG^ 

cStcSctcccgtcaaagccgcatgcggtgtagagtctccaccgtctatggcccttata 
aSScScctcctctcatcaagagatctccggtc^ 

™S^aSgacgaatcca^ 
aotSSSagIISScagctacggttcca^^ 

>G247 Amino Acid Sequence (domain in AA coordinates: 15-116)

mrmtodg^heykkglwtveedkilmdyvrthgqghwnriakktglkrcgkscr^ 
lsSSftdqeedliirm^^ 

TDVEVAATVPfTLFDTFWVLEDDFEIiSSLTMMDFTNGYCL* 

Sctcaaacatttc^ 

raCAAAA^CCTTAACATCTAGTTTGTATCCTCTCTGATACTTCAAAAAAAATGGATGAAG 

^^SgScgaaacattcaatggagttatggtatcttttggtctgtctctgcttctc 
cgSc^gcttcggagatcaaagctgatcagcttcgtctacggaggagcgagcagc^a 

rrPAGOTTTAdGAGTCTCTCTCCGTCGCTGAATCTTCTTCTTCAGGCGTTGCTGCCGGAT 

ScaISSgacgIgc^ 

cStccgattatcacatcgacaacgttcttgatccgcaacagattctaggcga^ 
^acgSctatgttcagtacggagcc^^ 

SStcactggaggagcttctcaggtgcaaagctggcagctcatggacgacgagcttagta 

cggctcgac^ggttgcttacggtgcaagaaagagt^gagttcaaagactagggcaaattc 
aagSSa^ 

caggaa^cgccacggtcacggcaccatcacaaggaatgttaaagaaaattattttcga 

Sgggaaccatgcggttttagagaagaagcgccgcgagaaattgaa^^ 
cStoaSaaa^tcattccgtcaatcaacaagatcgataaagtatcg^ 
cgatagacta^cttcaagaactcgagagacgggttcaagaactagaatcttgcagagaat 

caaSaScS^ 

SaStcSSaagccgagccagcagataccgc^actggtttaaccgataa^ 

taaggatcggotcgtttggtaatgaggtggtta 
gagtattgcttgagataatggatgtgattagtgatctccaittg^ 

AATCCTCGACCGGAGACGGTTTGCTCTGOT^^ 

AAATAGCGAC^CCAGGAATGATCAAAGAAGCACTTCAAAGGGTTGCATGGA 
GACTACTTAGTTAAAATTCACAGCAA^^ 

GGTTTTCTTCTAACCGGGTTTTAGGAATTAATGTTATGTTTATCATTTGTTTTT^ 
TT^GTGTOTrTTTTCCGTTGCTTAACGTAGGTGAAGAGGAACATACACTATGCGTA 

TTTGTGTTCTTTTGTTGTT 
>G585 Amino Acid Sequence (domain in AA coordinates: 436-501)
MDEETMATGQNRTTVPENLKKHLAVSVRNIQWSYGIFWSVSASQSGVLEWGDGYYNGDIK 
TRKTIQASEIKADQLGLRRSEQLSELYESLSVAESSSSGVAAGSQVTRRASAAALSPBDL 
ADTEWYYLVCMSFVFNIGEGMPGRTFANGEPIWLCNAHTADSKVFSRSLLAKSAAVKTW 
CFPFLGGWEIGTTEHITEDMNVIQCVKTSFLEAPDPYATILPARSDYHIDNVLDPQQIL
GDEIYAPMFSTEPFPTASPSRTTNGFDQEHEQVADDHDSFMTERITGGASQVQSWQLMDD
ELSNCVHQSLNSSDCVSQTFVEGAAGRVAYGARKSRVQRLGQIQEQQRNVKTLSFDPRND 
DVHYQSVISTIFKTNHQLILGPQFRNCDKQSSFTRWKKSSSSSSGTATVTAPSQGMLKKI
IFDVPRVHQKEKLMLDSPEARDETGNHAVLEKKRREKLNERFMTLRKIIPSINKIDKVSI
LDDTIEYLQELERRVQELESCRESTDTETRGTMTMKRKKPCDAGERTSANCANNETGNGK 
KVSVNNVGEAEPADTGFTGLTDNLRIGSFGNEVVIELRCAWREGVLLEIMDVISDLHLDS
HSVQSSTGDGLLCLTVNCKHKGSKIATPGMIKEALQRVAWIC* 
>G634 (1..798) 



ATGGAGCAAGGAGGAGGTGGTGGTGGTAATGAAGTTGTGGAGGAAGCTTCACCTATTAGT 

TCAAGACCTCCTGCTAACAACTTAGAAGAGCTTATGAGATTCTCAGCCGCCGCGGATGAC 

GGTGGATTAGGAGGTGGAGGTGGAGGAGGAGGAGGAGGAAGTGCTTCTTCTTCATCGGGA 

AATCGATGGCCGAGAGAAGAAACTTTAGCTCTTCTTCGGATCCGATCCGATATGGATTCT 

ACTTTTCGTGATGCTACTCTCAAAGCTCCTCTTTGGGAACATGTTTCCAGGAAGCTATTG 

GAGTTAGGTTACAAACGAAGTTCAAAGAAATGCAAAGAGAAATTCGAAAACGTTCAGAAA 

TATTACAAACGTACTAAAGAAACTCGCGGTGGTCGTCATGATGGTAAAGCTTACAAGTTC 

TTCTCTCAGCTTGAAGCTCTCAACACTACTCCTCCTCCTCCTCCTTCTCATCCTCACGCT 

CATCAACCAGAACAGAAACAACAACAACAACCACAACAAGAGATGGTCATGAGCTCGGAA 

CAATCATCATTACCATCATCATCAAGATGGCCAAAGGCAGAGATTCTAGCGCTTATAAAC 

CTGAGAAGTGGAATGGAACCAAGGTACCAAGATAATGTACCTAAAGGACTTCTATGGGAA 

GAGATCTCAACTTCAATGAAGAGAATGGGATACAACAGAAACGCTAAGAGATGTAAAGAG 

AAATGGGAAAACATAAACAAATACTACAAGAAAGTTAAAGAAAGCAACAACAGCAACTAC 
AACAACAAGAATCAATGA 

>G634 Amino Acid Sequence (domain in AA coordinates: 62-147, 189
MEQGGGGGGNEVVEEASPISSRPPANNLEELMRFSAAADDGGLGGGGGGGGGGSASSSSG 
NRWPREETLALLRIRSDMDSTFRDATLKAPLWEHVSRKLLELGYKRSSKKCKEKFENVQK
YYKRTKETRGGRHDGKAYKFFSQLEALNTTPPPPPSHPHAHQPEQKQQQQPQQEMVMSSE 

QSSLPSSSRWPKAEILALINLRSGMEPRYQDNVPKGLLWEEISTSMKRMGYNRNAKRCKE 
KWENINKYYKKVKESNNSNYNNKNQ*

>G676 (1..612) 

atgagaaagaaagtaagtagtagtggtgacgaaggaaacaatgagtacaagaaaggtttg 
tggacagtagaagaagacaaaatcctcatggattatgtcaaagctcatggcaaaggtcac 
Sl^=Sf^ 9 ^ Caaaaa9aCt99tttaaa9agat3t 9Saaaga3ttgtagattgagg 
S tC ^ Ca9CCCtaat 9 t 9 aaaa 9 a 9gcaatttcaccgagcaagaagaggat 
ZllttZ ttaggctccacaa 3 tt g c ttggtaataggtggtctttaattgctaaaagagtg 



ccgggtcgaacggataatcaagtgaagaactattggaacacgcatcttagtaagaaactc 
ggaatcaaagatcagaaaaccaaacagagcaatggtgatattgtttatcaaatcaatctc 
ccgaatcctaccgaaacatcagaagaaacgaaaatctcgaatattgtcgataacaataat 




cactgtttttga 

>G676 Amino Acid Sequence (domain in AA coordinates: 17-119)
S^o^^ GmEY ^ GLW ^ EDKIL ™ Y ™ GKG ^ IAi ^GLKRCGKSCRLR 
n™!! ^ EQE ™ LIIRM ^ LGTOWSLI ^ V PGRTDNQVKNYWNTHLSKKL 

>G682 (1..228) 

ATGGATAACCATCGCAGGACTAAGCAACCCAAGACCAACTCCATCGTTACTTCTTCTTCT 

>G682 Amino Acid Sequence (domain in AA coordinates: 27-63)
MDNHRRTKQPKTNSIVTSSSEEVSSLEWEVVNMSQEEEDLVSRMHKLVGDRWELIAGRIP
GRTAGEIERFWVMKN*
>G635 (1..993) 

ATGGAGATCATGCGTCCAGGGGTCTCAGAAAACACTTTGAAAGGAAAAATAAGAATCACA 

ACGCGGTGCATGTGGCTTGACAAAGGAAGACTTTTAGATGCACTTCACAAAGCAGCTCAT 

GCTGCTCTATCAAGTTGTCCTGTGACATGTCCCTTGTCTCACATGGAAAGAACAGTCTCC 

GAAGTCCTGAGGAAGATTGTAAGGAAGTACAGTGGTAAAAGGCCTGAAGTCATCGCTATA 

GCCACTGAGAATCCAATGGCTGTCCGAGCTGATGAGGTCAGTGCGAGACTGTCTGGTGAT 

CCAAGTGTTGGTTCTGGAGTTGCAGCTTTAAGGAAAGTTGTTGAAGGAAATGACAAAAGA 

AGTCGGGCGAAGAAAGCACCTTCACAAGAAGCTTCCCCCAAAGAAGTAGATCGCACTTTG 

GAAGATGATATCATTGATAGTGCAAGACTACTGGCTGAAGAAGAAACTGCGGCATCAACA 

TACACGGAAGAAGTTGATACGCCCGTTGGGAGTTCTTCAGAAGAGTCAGACGATTTTTGG 

AAATCATTCATCAATCCATCATCGTCACCTTCACCGAGTGAAACAGAAAATATGAATAAG 

GTAGCTGATACGGAGCCTAAAGCAGAGGGTAAGGAAAACAGCAGAGACGACGATGAATTA 

GCTGATGCTTCAGATTCTGAAACCAAGTCATCACCAAAACGTGTGAGGAAGAACAAATGG 

AAACCGGAGGAGATAAAGAAGGTAATCAGAATGCGAGGAGAGCTGCACAGTAGATTTCAA 

GTGGTGAAAGGTAGAATGGCATTGTGGGAAGAGATCTCTTCAAATCTATCAGCTGAAGGA 

ATCAATCGAAGCCCGGGACAATGCAAATCTCTCTGGGCATCACTTATTCAGAAATACGAG 

GAGAGCAAGGCTGATGAGAGAAGCAAGACGAGTTGGCCACATTTTGAGGATATGAACAAC 

ATTTTGTCAGAGCTAGGCACACCTGCGTCTTAA 

>G635 Amino Acid Sequence (domain in AA coordinates: 239-323) 

MEIMRPGVSENTLKGKIRITTRCMWLDKGRLLDALHKAAHAALSSCPVTCPLSHMERTVS 

EVLRKIVRKYSGKRPEVIAIATENPMAVRADEVSARLSGDPSVGSGVAALRKVVEGNDKR

SRAKKAPSQEASPKEVDRTLEDDIIDSARLLAEEETAASTYTEEVDTPVGSSSEESDDFW 

KSFINPSSSPSPSETENMNKVADTEPKAEGKENSRDDDELADASDSETKSSPKRVRKNKW 

KPEEIKKVIRMRGELHSRFQVVKGRMALWEEISSNLSAEGINRSPGQCKSLWASLIQKYE

ESKADERSKTSWPHFEDMNNILSELGTPAS*
>G1068 (150.. 1310) 

GAGAGTTGTTAGCTAGCTCACACGCTTTCGCTTAAAACTCAAAAACCTGCACTTTCTCGT 
CTATTTTOTCGGCATTCGTAAAACAGAAAAGTGGGTCTCCAAGAAAATTACCCTAAATTC 
ACAAAGATTCATACTTTTCTCCACCTCCAATGGATTCCAGAGAGATCCACCACCAACAAC 
AGCAACAACAACAACAACAACAGCAGCAGCAGCAACAACAGCAACATCTACAACAACAGC 
AACAACCACCGCCAGGGATGTTAATGAGTCACCACAATTCCTACAATCGAAACCCTAACG 
CCGCCGCCGCTGTTTTAATGGGTCACAACACCTCCACATCTCAAGCTATGCATCAAAGAT 
TACCTTTTGGTGGTTCTATGTCACCGCATCAGCCTCAACAACATCAGTATCATCATCCTC 
AGCCTCAGCAACAGATAGATCAGAAGACTCTTGAATCTCTTGGATTTCCTACTTCGCCTC 
TTCCTTCTGCTTCTAATTCTTACGGTGGTGGAAATGAAGGAGGTGGTGGTGGTGATAGCG 
CCGGAGCTAATGCTAACTCTTCCGATCCACCTGCTAAACGGAACAGAGGACGTCCTCCTG 
GCTCCGGTAAGAAGCAGCTCGATGCTTTAGGAGGAACAGGAGGAGTTGGGTTCACGCCTC 
ATGTCATTGAGGTTAAAACAGGAGAGGACATAGCTACGAAGATATTGGCGTTTACGAACC 
AAGGGCCACGCGCAATCTGTATTCTCTCAGCTACAGGAGCTGTAACTAATGTGATGCTTC 
GTCAAGCTAACAATAGCAATCCTACTGGAACTGTTAAGTATGAGGGCCGATTTGAAATCA 

TTTCTCTGTCAGGTTCTTTCrrGAATO 

ACTTGAGTGTGTCGCTGGCTGGACACGAAGGCCGGATTGTGGGTGGATGTGTTGATGGAA 
TGCTAGTAGCTGGATCACAAGTCCAGGTCATTGTGGGAAGCTTTGTACCAGATGGAAGGA 
AGCAGAAACAAAGTGCGGGGCGTGCTCAGAATACTCCGGAGCCAGCTTCAGCACCAGCCA 
ATATGTTGAGCTTTGGTGGTGTTGGTGGACCGGGAAGCCCTCGATCTCAAGGACAACAAC 
ACTCGAGCGAGTCATCAGAGGAAAACGAAAGTAATTCTCCGTTGCACCGTAGAAGCAACA 
ACAAGAACAGCAACAATCATGGGATATT^ 

TTCCTATGCAGATGTACCAGAATCTCTGGCCTGGCAACAGTCCTCAATAAACAGATGGTT 
CATGGGTCAAGATTTGACCGGGTTTGCTTCTCTGTTCCTTTTGACACATCTCTCCATCAG 

ATTTATCTCTATAAAGTAGATTGAGCT^ 

TTCTCTTAAATTTAGCTTTGGTTTTAGATAAATAGAGAGAGAGAGACATGTTAAGTAGGT 
TTCAAATTCAATCTTGTTTAGTTTGT 

AAGACTTGTTCTTTTTCTCCTATATTCAACGAATTATCCACTTTAA 

>G1068 Amino Acid Sequence (domain in AA coordinates: 143-150) 

MDSREIHHQQQQQQQQQQQQQQQQQHLQQQQQPPPGMLMSHHNSYNRNPNAAAAVLMGHN

TSTSQAMHQRLPFGGSMSPHQPQQHQYHHPQPQQQIDQKTLESLGFPTSPLPSASNSYGG 
GNEGGGGGDSAGANANSSDPPAKRNRGRPPGSGKKQLDALGGTGGVGFTPHVIEVKTGED
IATKILAFTNQGPRAICILSATGAVTNVMLRQANNSNPTGTVKYEGRFEIISLSGSFLNS
ESNGTVTKTGNLSVSLAGHEGRIVGGCVDGMLVAGSQVQVIVGSFVPDGRKQKQSAGRAQ
NTPEPASAPANMLSFGGVGGPGSPRSQGQQHSSESSEENESNSPLHRRSNNNNSNNHGIF 

GNSTPQPLHQIPMQMYQNLWPGNSPQ*
>G1225 (1..984) 

ATGACTCTAGAAGCTTTATCATCAAACGGTCTTTTAAACTTTTTGCTCTCTGAAACTCTT 
TCACCAACTCCATTCAAGTCTCTCGTCGATCTCGAGCCATTGCCGGAAAATGATGTCATC 
ATATCGAAGAACAGAATTTCGGAGATATCTAATCAAGAACCGCCACCACAGCGACAACCA 
CCAGCTACGAATCGAGGGAAGAAGCGGCGGAGGAGGAAGCCTAGGGTTTGCAAAAACGAG 
GAAGAAGCTGAGAATCAACGAATGACTCACATTGCCGTCGAAAGAAATCGAAGAAGACAA
ATGAATCAACATCTCTCTGTCTTGCGATCTCTCATGCCTCAACCTTTTGCTCACAAGGGT 
GATCAAGCTTCAATAGTTGGTGGAGCCATAGATTTCATCAAAGAACTTGAACACAAATTA 
CTATCTCTTGAAGCTCAAAAACATCATAATGCTAAATTAAACCAGTCGGTTACTTCTTCA 
ACAAGTCAAGACTCAAATGGTGAACAAGAGAATCCTCATCAACCATCTTCACTATCTCTA 
TCGCAGTTCTTTCTTCATTCATACGATCCGAGCCAAGAGAATAGGAACGGCTCAACAAGC 
TCGGTGAAAACCCCTATGGAAGATCTTGAGGTGACTCTAATCGAAACTCATGCTAACATC 
AGAATCTTGTCGAGAAGAAGAGGTTTCCGGTGGAGCACGTTGGCCACCACCAAACCGCCG 
CAGCTTTCGAAGCTGGTGGCTTCTCTACAATCGCTGTCCCTCTCCATTCTTCACCTTAGT 
GTCACAACATTGGACAATTATGCTATTTACTCCATCAGCGCTAAGGTGGAAGAGAGTTGC 
CAGCTAAGTTCAGTAGATGACATTGCAGGAGCAGTTCACCACATGCTAAGTATCATTGAA 
GAGGAGCCTTTTTGTTGCTCATCAATGTCAGAATTACCATTTGACTTCTCTTTGAATCAC 
TCAAATGTCACTCATTCTCTCTGAGAAATCTCTTTTTTGTTGTTGTTATTCCTTCTTTTA 
ATTTTATCACATAGCACATCTTTAGTTTTTTTTTTT 

>G1225 Amino Acid Sequence (domain in AA coordinates: 78-147) 
MTLEALSSNGLLNFLLSETLSPTPFKSLVDLEPLPENDVIISKNTISEISNQEPPPQRQP 

PATNRGKKRRRRKPRVCKNEEEAENQRMTHIAVERNRRRQMNQHLSVLRSLMPQPFAHKG

DQASIVGGAIDFIKELEHKLLSLEAQKHHNAKLNQSVTSSTSQDSNGEQENPHQPSSLSL
SQFFLHSYDPSQENRNGSTSSVKTPMEDLEVTLIETHANIRILSRRRGFRWSTLATTKPP
QLSKLVASLQSLSLSILHLSVTTLDNYAIYSISAKVEESCQLSSVDDIAGAVHHMLSIIE
EEPFCCSSMSELPFDFSLNHSNVTHSL* 
>G1337 (97.. 1398) 

AATGGATTTGTCATCATTCTTCTCACCGTCCTTAGTCTCTGAAAATAAATTCTGATTTTG 

ATTTCGAATTTTAGGGATTTTGAGAGAGAGTCAGTTATGAGTAGTTCGGAGAGAGTACCG 

TGCGATTTCTGCGGCGAGCGTACGGCGGTTTTGTTTTGTAGAGCCGATACGGCGAAGCTG 

TGTTTGCCTTGTGATCAGCAAGTTCACACGGCGAATCTGTTGTCGAGGAAGCACGTGCGA 

TCTCAGATCTGCGATAATTGCGGTAACGAGCCAGTCTCTGTTCGGTGTTTCACCGATAAT 

CTGATTTTGTGTCAGGAGTGTGATTGGGATGTTCACGGAAGTTGTTCAGTTTCCGATGCT 

CATGTTCGATCCGCCGTGGAAGGTTTTTCCGGTTGTCCATCGGCGTTGGAGCTTGCTGCT 

TTATGGGGACTTGATTTGGAGCAAGGGAGGAAAGATGAAGAGAATCAAGTTCCGATGATG 

GCGATGATGATGGATAATTTCGGGATGCAGTTGGATTCTTGGGTTTTGGGATCTAATGAA 

TTGATTGTTCCCAGCGATACGACGTTTAAGAAGCGTGGATCTTGTGGATCTAGTTGTGGG 

AGGTATAAGCAGGTATTGTGTAAGCAGCTTGAGGAGTTGCTTAAGAGTGGTGTTGTCGGT 

GGTGATGGCGATGATGGTGATCGTGACCGTGATTGTGACCGTGAGGGTGCTTGTGATGGA 

GATGGAGATGGAGAAGCAGGAGAGGGGCTTATGGTTCCGGAGATGTCAGAGAGATTGAAA 

TGGTCAAGAGATGTTGAGGAGATCAATGGTGGCGGAGGAGGAGGAGTTAACCAGCAGTGG 

AATGCTACTACTACTAATCCTAGTGGTGGCCAGAGTTCTCAGATATGGGATTTTAACTTG 

GGACAGTCACGGGGACCTGAGGATACGAGTCGAGTGGAAGCTGCATATGTAGGGAAAGGT 

GCTGCTTCTTC^TTCACAATCAACAATTTTGTTGACCATATGAATGAAACTTGTTCCACT 

AATGTGAAAGGTGTCAAAGAGATTAAAAAGGATGACTACAAGCGATCAACTTCAGGCCAG 

GTACAACCAACAAAATCTGAGAGCAACAATCGTCCAATTACCTTTGGCTCTGAGAAAGGT 

TCGAACTCCTCCAGTGACTTGCATTTCACAGAGCATATTGCTGGAACTAGTTGTAAGACC 

ACAAGACTAGTTGCAACTAAGGCTGATCTGGAGCGGCTGGCTCAGAACAGAGGAGATGCA 

ATGCAGCGTTACAAGGAAAAGAGGAAGACACGGAGATATGATAAGACCATAAGGTATGAA 

TCGAGGAAGGCAAGAGCTGACACTAGGTTGCGTGTCAGAGGCAGATTTGTGAAAGCTAGT 

GAAGCTCCTTACCCTTAACCTTAAGTTTTTTCACATAGGCTTCCTTTTAGCTACAAACTT 

AGTTACTTTTTTTACTCCACTGCCTCATAAATGTACAGACCGGTCTCGTTTCATCTGGCC 
GCCCTTCTTGTTTTATTGCCTTATCTGGCCCTTTTATGTACCTTGGAATCTTATCTAGTT
TAAAAAAGATTGTAACCTTCTAGAAAACCATATTCTGTTGACAGTATATACATGTCTATC 

CAAGCAAAAA

>G1337 Amino Acid Sequence (domain in AA coordinates: 9-75) 

MSSSERVPCDFCGERTAVLFCRADTAKLCLPCDQQVHTANLLSRKHVRSQICDNCGNEPV

SVRCFTDNLILCQECDWDVHGSCSVSDAHVRSAVEGFSGCPSALELAALWGLDLEQGRKD 

EENQVPMMAMMMDNFGMQLDSWVLGSNELIVPSDTTFKKRGSCGSSCGRYKQVLCKQLEE 

LLKSGVVGGDGDDGDRDRDCDREGACDGDGDGEAGEGLMVPEMSERLKWSRDVEEINGGG

GGGVNQQWNATTTNPSGGQSSQIWDFNLGQSRGPEDTSRVEAAYVGKGAASSFTINNFVD 

HMNETCSTNVKGVKEIKKDDYKRSTSGQVQPTKSESNNRPITFGSEKGSNSSSDLHFTEH 

IAGTSCKTTRLVATKADLERLAQNRGDAMQRYKEKRKTRRYDKTIRYESRKARADTRLRV 

RGRFVKASEAPYP* 
>G1759 (110.. 700) 

CGAGAAAAGGAAAAAAAAAAATAGAAAGAGAAAACGCTTAGTATCTCCGGCGACTTGAAC 

CCAAACCTGAGGATCAAATTAGGGCACAAAGCCCTCTCGGAGAGAAGCCATGGGAAGAAA 

AAAACTAGAAATCAAGCGAATTGAGAACAAAAGTAGCCGACAAGTCACCTTCTCCAAACG 

TCGCAACGGTCTCATCGAGAAAGCTCGTCAGCTTTCTGTTCTCTGTGACGCATCCGTCGC 

TCTTCTCGTCGTCTCCGCCTCCGGCAAGCTCTACAGCTTCTCCTCCGGCGATAACCTGGT 

CAAGATCCTTGATCGATATGGGAAACAGCATGCTGATGATCTTAAAGCCTTGGATCATCA 

GTCAAAAGCTCTGAACTATGGTTCACACTATGAGCTACTTGAACTTGTGGATAGCAAGCT 

TGTGGGATCAAATGTCAAAAATGTGAGTATCGATGCTCTTGTTCAACTGGAGGAACACCT 

TGAGACTGCCCTCTCCGTGACTAGAGCCAAGAAGACCGAACTCATGTTGAAGCTTGTTGA 

GAATCTTAAAGAAAAGGAGAAAATGCTGAAAGAAGAGAACCAGGTTTTGGCTAGCCAGAT 

GGAGAATAATCATCATGTGGGAGCAGAAGCTGAGATGGAGATGTCACCTGCTGGACAAAT 

CTCCGACAATCTTCCGGTGACTCTCCCACTACTTAATTAGCCACCTTAAATCGGCGGTTG 

AAATCAAAATCCAAAACATATATAATTATGAAGAAAAAAAAAATAAGATATGTAATTATT 

CCGCTGATAAGGGCGAGCGTTTGTATATCTTAATACTCTCTCTTTGGCCAAGAGACTTTG 

TGTGTGATACTTAAGTAGACGGAACTAAGTCAATACTATCTGTTTTAAGACAAAAGGTTG 

ATGAACTTTGTACCTTATTCGTGTGAGAAAAAAAAAAAAAAAA 

>G1759 Amino Acid Sequence (conserved domain in AA coordinates: 2-57) 
MGRKKLEIKRIENKSSRQVTFSKRRNGLIEKARQLSVLCDASVALLVVSASGKLYSFSSG
DNLVKILDRYGKQHADDLKALDHQSKALNYGSHYELLELVDSKLVGSNVKNVSIDALVQL
EEHLETALSVTRAKKTELMLKLVENLKEKEKMLKEENQVLASQMENNHHVGAEAEMEMSP

AGQISDNLPVTLPLLN* 
>G1804 (169.. 1497) 

TATCTCTCTCTTTCTCAAAACCTTTCAGTCAAAATTCTCCGGCGGCTTTTAAACTATGTG 

AAGGAGGAGAACCTCCATAACAAGAAGCGGATTCTCTCAGTTTTCCGGCGGCGGAGGAAC 

ACAAAGCCACCGGTTTTTAGAC^CAC^GATrrCATTTTCAGTTGTTAAATGGTAACTAGA 

GAAACGAAGTTGACGTCAGAGCGAGAAGTAGAGTCGTCCATGGCGCAAGCGAGACATAAT 

GGAGGAGGTGGTGGTGAGAATCATCCGTTTACTTCTTTGGGAAGACAATCCTCTATCTAC 

TCATTGACCCTTGACGAGTTCCAACATGCTTTATGTGAGAACGGCAAGAACTTTGGGTCC 

ATGAACATGGACGAGTTTCTTGTCTCTATTTGGAACGCAGAGGAGAATAATAACAATCAA 

CAACAAGCAGCAGCAGCTGCAGGTTCACATTCTGTTCCGGCTAATCACAATGGTTTCAAC 

AACAACAATAACAATGGAGGCGAGGGTGGTGTTGGTGTCTTTAGTGGTGGTTCTAGAGGC 

AACGAAGATGCTAACAATAAGAGAGGGATAGCGAACGAGTCTAGTCTTCCTCGACAAGGC 

TOTTGACACTTCCAGCTCCGCTTTGTAGGAAGACTGTTGATGAGGTTTGGTCTGAGATA 

CATAGAGGTGGTGGTAGCGGTAATGGAGGAGACAGCAATGGACGTAGTAGTAGTAGTAAT 

GGACAGAACAATGCTCAGAACGGCGGTGAGACTGCGGCTAGACAACCGACTTTTGGAGAG 

ATGACACTTGAGGATTTCTTGGTGAAGGCTGGTGTGGTTAGAGAACATCCCACTAATCCT 

AAACCTAATCCAAACCCGAACCAAAACCAAAACCCGTCTAGTGTAATACCCGCAGCTGCA 

CAGCAACAGCTTTATGGTGTGTTTCAAGGAACCGGTGATCCTTCATTCCCGGGTCAAGCT 

ATGGGTGTGGGTGACCCATCAGGTTATGCTAAAAGGACAGGAGGAGGAGGGTATCAGCAG 

GCGCCACCAGTTCAGGCAGGTGTTTGCTATGGAGGTGGCGTTGGGTTTGGAGCGGGTGGA 

CAGCAAATGGGAATGGTTGGACCGTTAAGCCCGGTGTCTTCAGATGGATTAGGACATGGA 

CAAGTGGATAACATAGGAGGTCAGTATGGAGTAGATATGGGAGGGCTAAGGGGAAGGAAA 

AGAGTAGTGGATGGTCCAGTGGAGAAAGTAGTGGAGAGAAGACAGAGGAGGATGATCAAG 

AACCGCGAGTCTGCTGCTAGATCTAGAGCAAGAAAACAAGCATATACAGTGGAATTGGAA 
GCTGAACTTAACCAGTTGAAAGAAGAGAATGCGCAGCTAAAACATGCATTGGCGGAGTTG
GAGAGGAAGAGGAAGCAACAGTATTTTGAGAGTTTGAAGTCAAGGGCACAACCGAAATTG 
CCGAAATCGAACGGGAGATTGCGGACATTGATGAGGAACCCGAGTTGTCCACTCTAAACA 
AACAATAGGAAGATGGAGAAGAAGTCGGAGACAGAACGAGGGAAAAACTGATGATTTTCT 
ACGTTGTTGTTTTGTCTTTGAGGAATGAGGTTATAGAATCTTTATACTTTGATGTTTTCT 
GTGTTGGTAGGAGGAACACCATCTGATCTGCTTTACTAGTGTTCCCTGTGAACAAAGAAA 
GTGATTCTGTGTTTCAACATCATCAATCTTTGGAAA 

>G1804 Amino Acid Sequence (domain in AA coordinates: 357-407) 

MVTRETKLTSEREVESSMAQARHNGGGGGENHPFTSLGRQSSIYSLTLDEFQHALCENGK 
NFGSMNMDEFLVSIWNAEENNNNQQQAAAAAGSHSVPANHNGFNNNNNNGGEGGVGVFSG

GSRGNEDANNKRGIANESSLPRQGSLTLPAPLCRKTVDEVWSEIHRGGGSGNGGDSNGRS
SSSNGQNNAQNGGETAARQPTFGEMTLEDFLVKAGVVREHPTNPKPNPNPNQNQNPSSVI
PAAAQQQLYGVFQGTGDPSFPGQAMGVGDPSGYAKRTGGGGYQQAPPVQAGVCYGGGVGF 
GAGGQQMGMVGPLSPVSSDGLGHGQVDNIGGQYGVDMGGLRGRKRVVDGPVEKVVERRQR 

RMIKNRESAARSRARKQAYTVELEAELNQLKEENAQLKHALAELERKRKQQYFESLKSRA 

QPKLPKSNGRLRTLMRNPSCPL* 

>G207 (16..930)

aaaagatctgtttcaatggcggatcgtgttaaaggtccatggagtcaagaagaagatgag 

cagctacgaaggatggttgagaaatacggaccgaggaattggtctgcgattagcaaatcg 

attccaggtcgatctggtaaatcgtgtagattacgttggtgtaatcagttatctccggag 

gttgagcatcgtcctttctcgccggaggaagatgagactattgtaaccgcccgtgctcag 

tttggtaacaagtgggcgacgattgctcgtcttcttaacggtcgtacggataacgccgtt 

aaaaatcactggaactctacgcttaagaggaaatgcagcggaggtgtggcggttacgacg 

gtgacggagacggaggaagatcaggatcggccgaagaagaggagatctgttagctttgat 

cctgcttttgctccggtggatactggattgtacatgagtcctgagagtcctaacggaatc 

gatgttagtgattctagcacgattccgtcaccgtcgtctcctgttgctcagctgtttaaa 

ccaatgccgatttccggcggttttacggtggttccgcagccgttaccggttgaaatgtct 

tcgtcttcggaggatccacctacttcgttgagtttgtcactacctggagctgagaacacg 

agttcgagccataacaataacaacaacgcgttgatgtttccgagatttgagagtcagatg 

aagattaatgtagaggagagaggaggaggaggagaaggacgtagaggtgagtttatgacg 

gtggtgcaggagatgataaaagctgaagtgaggagttacatggcggaaatgcagaaaaca 

agtggtggattcgtcgtcggaggtttatacgaatccggcggcaatggtggttttagggat 

tgtggagtaataacacctaaggttgagtagttttggtttagggttaaaacttgaatcgat 

tggggattttcaagagcattcatttttggggtttatggtaaaattaaaaacaaaaacaaa 

atgtacagaggaattaaaatttctatggaataatcttaaatctcaaatatttgttacttg 
ttttggtgattcataaccaaaatcaaa 

>G207 Amino Acid Sequence (domain in AA coordinates: 6-106) 

MADRVKGPWSQEEDEQLRRMVEKYGPRNWSAISKSIPGRSGKSCRLRWCNQLSPEVEHRP 
FSPEEDETIVTARAQFGNKWATIARLLNGRTDNAVKNHWNSTLKRKCSGGVAVTTVTETE

EDQDRPKKRRSVSFDPAFAPVDTGLYMSPESPNGIDVSDSSTIPSPSSPVAQLFKPMPIS 

GGFTVVPQPLPVEMSSSSEDPPTSLSLSLPGAENTSSSHNNNNNALMFPRFESQMKINVE

ERGGGGEGRRGEFMTVVQEMIKAEVRSYMAEMQKTSGGFVVGGLYESGGNGGFRDCGVIT
PKVE* 

>G218 (1..1182) 

ATGGAGGCAGAGATCGTGAGACGATCGGAGGTAACGGGATTAAGAAGGGAGGTGGAAGAA 
TCGTCAATTGGTAGAGGAGATTGCGATGGTGATGGCGGCGATGTGGGAGAAGATGCGGCA 
GGGTTCGTTGGGACGAGCGGGAGAGGAAGAAGAGATCGAGTTAAAGGGCCGTGGTCGAAG 
GAGGAGGATGATGTGTTGAGTGAGCTCGTTAAGAGGTTGGGAGCGAGGAATTGGAGTTTT 
ATCGCTCGGAGTATTCCTGGTCGTTCAGGCAAGTCTTGTCGTCTTCGTTGGTGTAATCAG 
CTCAATCCAAATCTTATACGCAATTCATTTACTGAGGTAGAGGATCAGGCTATCATCGCA 
GCACATGCCATCCACGGAAACAAATGGGCTGTTATCGCGAAGCTCCTCCCCGGAAGAACA 
GATAATGCTATCAAGAACCACTGGAACTCTGCTTTAAGACGTCGATTCATAGACTTTGAA 
AAGGCCAAGAATATAGGAACTGGAAGCTTGGTCGTGGATGATTCTGGATTTGACAGAACG 
ACAACAGTAGCCTCATCAGAAGAAACTTTATCTTCAGGCGGTGGTTGCCATGTAACTACT 
CCAATTGTATCTCCAGAAGGCAAAGAAGCTACCACCTCCATGGAAATGTCTGAAGAACAA 
TGCGTAGAGAAAACAAACGGAGAAGGTATTTCTAGGCAAGATGATAAGGATCCTCCAACG 
CTTTTCCGCCCAGTGCCTCGGCTCAGTTCTTTTAATGCTTGCAATCACATGGAAGGATCA 
CCCTCTCCACATATACAAGACCAAAATCAGCTCCAATCATCTAAACAAGACGCAGCAATG

CTAAGATTGCTTGAAGGAGCTTACAGCGAACGGTTTGTGCCTCAAACATGTGGAGGTGGT 

TGTTGCAGCAACAATCCCGATGGCAGTTTTCAGCAAGAATCATTGTTGGGTCCAGAGTTT 

GTGGATTACTTAGACTCACCAACGTTTCCGAGTTCCGAACTAGCTGCTATAGCAACGGAA 

ATAGGCAGCCTCGCTTGGCTGAGAAGCGGTTTAGAGAGTAGCAGCGTGAGGGTGATGGAA 

GACGCAGTTGGTCGGTTAAGGCCTCAAGGCTCCAGGGGTCATCGAGATCATTATCTTGTA 

TCTGAACAGGGGACGAACATAACCAATGTCCTGTCCACATAA 

>G218 Amino Acid Sequence (domain in AA coordinates: TBD) 

MEAEIVRRSEVTGLRREVEESSIGRGDCDGDGGDVGEDAAGFVGTSGRGRRDRVKGPWSK 

EEDDVLSELVKRLGARNWSFIARSIPGRSGKSCRLRWCNQLNPNLIRNSFTEVEDQAIIA

AHAIHGNKWAVIAKLLPGRTDNAIKNHWNSALRRRFIDFEKAKNIGTGSLVVDDSGFDRT 

TTVASSEETLSSGGGCHVTTPIVSPEGKEATTSMEMSEEQCVEKTNGEGISRQDDKDPPT 

LFRPVPRLSSFNACNHMEGSPSPHIQDQNQLQSSKQDAAMLRLLEGAYSERFVPQTCGGG 

CCSNNPDGSFQQESLLGPEFVDYLDSPTFPSSELAAIATEIGSLAWLRSGLESSSVRVME 

DAVGRLRPQGSRGHRDHYLVSEQGTNITNVLST* 

>G241 (46.. 867) 

GAAAAACATTTCAACTTCTTTTATCAGCAATCACAAATCAAAGAGATGGGAAGAGCTCCA 
TGCTGTGAGAAGATGGGGTTGAAGAGAGGACCATGGACACCTGAAGAAGATCAAATCTTG 
GTCTCTTTTATCCTCAACCATGGACATAGTAACTGGCGAGCCCTCCCTAAGCAAGCTGGT 
CTTTTGAGATGTGGAAAAAGCTGTAGACTTAGGTGGATGAACTATTTAAAGCCTGATATT 
AAACGTGGCAATTTCACCAAAGAAGAGGAAGATGCTATCATCAGCTTACACCAAATACTT 
GGCAATAGATGGTCAGCGATTGCAGCAAAACTGCCTGGAAGAACCGATAACGAGATCAAG 
AACGTATGGCACACTCACTTGAAGAAGAGACTCGAAGATTATCAACCAGCTAAACCTAAG 
ACCAGCAACAAAAAGAAGGGTACTAAACCAAAATCTGAATCCGTAATAACGAGCTCGAAC 
AGTACTAGAAGCGAATCGGAGCTAGCAGATTCATCAAACCCTTCTGGAGAAAGCTTATTT 
TCGACATCGCCTTCGACAAGTGAGGTTTCTTCGATGACACTCATAAGCCACGACGGCTAT 
AGCAACGAGATTAATATGGATAACAAACCGGGAGATATCAGTACTATCGATCAAGAATGT 
GTTTCTTTCGAAACTTTTGGTGCGGATATCGATGAAAGCTTCTGGAAAGAGACACTGTAT 
AGCCAAGATGAACACAACTACGTATCGAATGACCTAGAAGTCGCTGGTTTAGTTGAGATA 
CAACAAGAGTTTCAAAACTTGGGCTCCGCTAATAATGAGATGATTTTTGACAGTGAGATG 
GAACTTCTGGTTCGATGTATTGGCTAGAACCGGCGGGGAACAAGATCTCTTAGCCGGGCT 
CTAGTTAACATGTTTGAGGAGTAAAGTGAAATGGTGCAAATTAGTTAAGGCTAAGAAATT 
CAAAAGCTTTTGTTTACCGAGAAAAAAACACACTCTAACTCTTGATGTGATGTAGTTAGT 
GTATTAATTAGAGGCTGCGTTTTCAA 

>G241 Amino Acid Sequence (domain in AA coordinates: 14-114) 
MGRAPCCEKMGLKRGPWTPEEDQILVSFILNHGHSNWRALPKQAGLLRCGKSCRLRWMNY 
LKPDIKRGNFTKEEEDAIISLHQILGNRWSAIAAKLPGRTDNEIKNVWHTHLKKRLEDYQ

PAKPKTSNKKKGTKPKSESVITSSNSTRSESELADSSNPSGESLFSTSPSTSEVSSMTLI 
SHDGYSNEINMDNKPGDISTIDQECVSFETFGADIDESFWKETLYSQDEHNYVSNDLEVA 
GLVEIQQEFQNLGSANNEMIFDSEMELLVRCIG*
>G254 (15.. 923) 

CGATTTCGAGCTCTATGGTGTCCGTAAACCCTAGACCTAAGGGTTTTCCAGTTTTCGATT 
CCTCGAATATGAGTTTACCAAGCTCCGATGGATTTGGTTCGATTCCGGCCACGGGACGGA 
CCAGTACGGTGTCGTTTTCTGAGGATCCGACGACGAAGATTCGGAAGCCGTACACAATCA 
AGAAGTCGAGAGAGAATTGGACAGATCAAGAGCACGATAAATTTCTAGAAGCTCTTCACT 
TATTCGATAGGGATTGGAAGAAAATAGAAGCCTTTGTTGGATCAAAAACAGTAGTTCAGA 
TACGAAGCCACGCTCAGAAATACTTTCTCAAAGTTCAGAAGAGTGGTGCTAACGAACATC 
TTCCACTTCCTCGACCTAAGAGGAAAGCGAGTCATCCTTATCCTATAAAGGCTCCTAAAA 
ATGTTGCTTATACCTCTCTCCCGTCTTCGAGTACATTACCGTTGCTTGAGCCTGGTTATT 
TGTATAGCTCTGATTCGAAGTCATTGATGGGAAACCAGGCTGTTTGTGCATCTACCTCTT 
CTTCGTGGAATCATGAATCGACAAATCTGCCAAAACCGGTGATTGAAGAGGAACCGGGAG 
TCTCGGCCACGGCTCCTCTCCCAAATAATCGCTGCAGACAGGAAGATACAGAGAGGGTAC 
GAGCAGTGACAAAGCCAAATAACGAAGAAAGTTGTGAAAAGCCACATAGAGTGATGCCGA
ATTTTGCTGAAGTTTACAGCTTCATTGGAAGTGTCTTCGATCCCAACACATCAGGCCACC
TCCAGAGATTAAAGCAGATGGATCGAATAAATATGG 

ACCTGTCTGTAAATCTGAC^VAGTCCCGAGTTTGCAGAGCAAAGGAGGTTGATATCATCAT 
ACAGCGCTAAAGCTTTGAAATAGAGATAGAATAAAACAATAATGTACCTTATGTGAGATC 






AAGAGACAATCATCCAAGGTCTGTATGCATTGCTTGGATTTAGGCCTCGTGTTCTCACTA

CAGGAGCAGAACCAATCGCAAAGACTCTTAGATGGCTACTGAGTTGTGGTTTTTATGTCT 

CTGTAAGTCGCGGTGGAGCACACGTGTTTGTCCTGTCTTGTGTATGTGTGTATAGATAAT 

ACAAGGTTTTGCAGAGTAAGGTCACAGTTAGCTGCAAGTGAGTTTGGATCAATCTTAAGA 

TTAAAACCCTGAGAGTGAGTGTCCAAAGAGACTGTGTAATATTGGTTTGGCGGTCAGCAG 

AAGAGTTTTGAAGTGCACATCCAGTTAGTGATAACACGGTTGAAGAAAAGGTAAGGTTAC 

AAGTTTAGTTTTGAATAATTGTATACTCAAAAAATATGAATGTATAAAGAATAATCACTT 

GAGTCGCCTTA 

>G254 Amino Acid Sequence (domain in AA coordinates: 62-106) 

MVSVNPRPKGFPVFDSSNMSLPSSDGFGSIPATGRTSTVSFSEDPTTKIRKPYTIKKSRE 

NWTDQEHDKFLEALHLFDRDWKKIEAFVGSKTVVQIRSHAQKYFLKVQKSGANEHLPLPR

PKRKASHPYPIKAPKNVAYTSLPSSSTLPLLEPGYLYSSDSKSLMGNQAVCASTSSSWNH 

ESTNLPKPVIEEEPGVSATAPLPNNRCRQEDTERVRAVTKPNNEESCEKPHRVMPNFAEV

YSFIGSVFDPNTSGHLQRLKQMDPINMETVLLLMQNLSVNLTSPEFAEQRRLISSYSAKA

LK* 

>G26 (73.. 729) 

TTGGCTTGTACCCAAACCCATCTTTGACTTCAAAAATAAAATAAAAATAATCATAATTGA 
CATCATCGGATAATGCATAGCGGGAAGAGACCTCTATCACCAGAATCAATGGCCGGAAAT 
AGAGAAGAGAAAAAAGAGTTGTGTTGTTGCTCAACTTTGTCGGAATCTGATGTGTCTGAT 
TTTGTCTCTGAACTCACTGGTCAACCCATCCCATCATCCATTGATGATCAATCTTCGTCG 
CTTACTCTTCAAGAAAAAAGTAACTCGAGGCAACGAAACTACAGAGGCGTGAGGCAAAGA 
CCGTGGGGAAAATGGGCGGCTGAGATTCGTGACCCGAACAAGGCAGCTCGTGTGTGGCTT 
GGGACGTTCGACACTGCAGAAGAAGCCGCCTTAGCGTATGATAAAGCTGCATTTGAGTTT 
AGAGGTCACAAGGCCAAGCTTAACTTCCCCGAGCATATTCGTGTCAACCCTACTCAACTC 
TATCCATCGCCCGCTACTTCCCATGATCGCATTATCGTGACACCACCTAGTCCACCTCCA 
CCAATTGCTCCTGACATACTTCTTGATCAATATGGCCACTTTCAATCTCGAAGTAGTGAT 
TCCAGTGCCAACTTGTCCATGAATATGCTGTCTTCTTCGTCTTCATCTTTGAATCATC^ 
GGGCTAAGACCAAATTTGGAGGATGGTGAAAACGTGAAGAACATTAGTATCCACAAACGA 
CGAAAATAACATGTTAATGGCATAAATATCTCTTCGTCCAAGTTATCAAACGCATTGACC 
TCCGGCTTTGATCATTTTAGGCGCTTAATCTCTTTACGACTTCATTTTGGTAGTCTTTAA 
AGAGTCTATGGAGTGGATTTAGCTAGGAATCAGGCCTTATGGATGAAAAATATATAAATT 
TTGAACATGACTATGCAAGAATGGGATGAAGACTACTTAGCTTGGAAAACGTCCTGATAG 
GTCATGACGACTATATCCACAGAAGATGACCGACGGAGACAACAACATGCCTCACCTGAT 
CGACCGATCAAATGAGATAATGTGTTGACCGGACCGGTCGGATCAGGTTGGGTCGAGTAT 
ATCA 

>G26 Amino Acid Sequence (domain in AA coordinates: 67-134) 

MHSGKRPLSPESMAGNREEKKELCCCSTLSESDVSDFVSELTGQPIPSSIDDQSSSLTLQ 

EKSNSRQRNYRGVRQRPWGKWAAEIRDPNKAARVWLGTFDTAEEAALAYDKAAFEFRGHK

AKLNFPEHIRVNPTQLYPSPATSHDRIIVTPPSPPPPIAPDILLDQYGHFQSRSSDSSAN 
LSMNMLSSSSSSLNHQGLRPNLEDGENVKNISIHKRRK*
>G263 (48.. 902) 

TTTTTAGTTTTATTTTTCTGTGGTAAAATAAAAAAAGTTCGCCGGAGATGACGGCTGTGA 
CGGCGGCGCAAAGATCAGTTCCGGCGCCGTTTTTAAGCAAAACGTATCAGCTAGTTGATG 
ATCATAGCACAGACGACGTCGTTTCATGGAACGAAGAAGGAACAGCTTTTGTCGTGTGGA 
AAACAGCAGAGTTTGCTAAAGATCTTCTTCCTCAATACTTCAAGCATAATAATTTCTCAA
GCTTCATTCGTCAGCTCAACACTTACGGATTTCGTAAAACTGTACCGGATAAATGGGAAT 
TTGCAAACGATTATTTCCGGAGAGGCGGGGAGGATCTGTTGACGGACATACGACGGCGTA 
AATCGGTGATTGCTTCAACGGCGGGGAAATGTGTTGTTGTTGGTTCGCCTTCTGAGTCTA 
ATTCTGGTGGTGGTGATGATCACGGTTCAAGCTCCACGTCATCACCCGGTTCGTCGAAGA 
ATCCTGGTTCGGTGGAGAACATGGTTGCTGATTTATCAGGAGAGAACGAGAAGCTTAAAC 
GTGAAAACAATAACTTGAGCTCGGAGCTCGCGGCGGCGAAGAAGCAGCGCGATGAGCTAG 
TGACGTTCTTGACGGGTCATCTGAAAGTAAGACCGGAACAAATCGATAAAATGATCAAAG 
GAGGGAAATTTAAACCGGTGGAGTCTGACGAAGAGAGTGAGTGCGAAGGTTGCGACGGCG 
GCGGAGGAGCAGAGGAGGGGGTAGGTGAAGGATTGAAATTGTTTGGGGTGTGGTTGAAAG 
GAGAGAGAAAAAAGAGGGACCGGGATGAAAAGAATTATGTGGTGAGTGGGTCCCGTATGA 
CGGAAATAAAGAACGTGGACTTTCACGCGCCGTTGTGGAAAAGCAGCAAAGTCTGCAACT 
AAAAAAAGAGTAGAAGACTGTTCAAACCAGCGTGTGACACGTCATCGACGACGACGAAAA 
AAATGATTTAAAAAACTATTTTTTTCCGTAAGGAAGAAAAGTTATTTTTATGTTTTAAAA
AGGTGAAGAAGGTCCAGAAGGATCAACGCAAATATATAAATGGATTTTCATGTATTATAT 
AATTTAATTAGTGTATTAAGAAAA 

>G263 Amino Acid Sequence (domain in AA coordinates: TBD) 

MTAVTAAQRSVPAPFLSKTYQLVDDHSTDDVVSWNEEGTAFVVWKTAEFAKDLLPQYFKH

NNFSSFIRQLNTYGFRKTVPDKWEFANDYFRRGGEDLLTDIRRRKSVIASTAGKCVVVGS

PSESNSGGGDDHGSSSTSSPGSSKNPGSVENMVADLSGENEKLKRENNNLSSELAAAKKQ 

RDELVTFLTGHLKVRPEQIDKMIKGGKFKPVESDEESECEGCDGGGGAEEGVGEGLKLFG 

VWLKGERKKRDRDEKNYVVSGSRMTEIKNVDFHAPLWKSSKVCN*

>G308 (196.. 1794) 

AGTAATTTAGTTTTTTTTTTTTTTTTTTACAATTTATTTTGTTATTAGAAGTGGTAGTGG 
AGTGAAAAAACAAATCCTAAGCAGTCCTAACCGATCCCCGAAGCTAAAGATTCTTCACCT 
TCCCAAATAAAGCAAAACCTAGATCCGACATTGAAGGAAAAACCTTTTAGATCCATCTCT 
GAAAAAAACCCAACCATGAAGAGAGATCATCATCATCATCATCAAGATAAGAAGACTATG 
ATGATGAATGAAGAAGACGACGGTAACGGCATGGATGAGCTTCTAGCTGTTCTTGGTTAC 
AAGGTTAGGTCATCGGAAATGGCTGATGTTGCTCAGAAACTCGAGCAGCTTGAAGTTATG 
ATGTCTAATGTTCAAGAAGACGATCTTTCTCAACTCGCTACTGAGACTGTTCACTATAAT 
CCGGCGGAGCTTTACACGTGGCTTGATTCTATGCTCACCGACCTTAATCCTCCGTCGTCT 
AACGCCGAGTACGATCTTAAAGCTATTCCCGGTGACGCGATTCTCAATCAGTTCGCTATC 
GATTCGGCTTCTTCGTCTAACCAAGGCGGCGGAGGAGATACGTATACTACAAACAAGCGG 
TTGAAATGCTCAAACGGCGTCGTGGAAACCACCACAGCGACGGCTGAGTCAACTCGGCAT
GTTGTCCTGGTTGACTCGCAGGAGAACGGTGTGCGTCTCGTTCACGCGCTTTTGGCTTGC
GCTGAAGCTGTTCAGAAGGAGAATCTGACTGTGGCGGAAGCTCTGGTGAAGCAAATCGGA
TTCTTAGCTGTTTCTCAAATCGGAGCTATGAGACAAGTCGCTACTTACTTCGCCGAAGCT 
CTCGCGCGGCGGATTTACCGTCTCTCTCCGTCGCAGAGTCCAATCGACCACTCTCTCTCC 
GATACTCTTCAGATGCACTTCTACGAGACTTGTCCTTATCTCAAGTTCGCTCACTTCACG 
GCGAATCAAGCGATTCTCGAAGCTTTTCAAGGGAAGAAAAGAGTTCATGTCATTGATTTC 
TCTATGAGTCAAGGTCTTCAATGGCCGGCGCTTATGCAGGCTCTTGCGCTTCGACCTGGT 
GGTCCTCCTGTTTTCCGGTTAACCGGAATTGGTCCACCGGCACCGGATAATTTCGATTAT 
CTTCATGAAGTTGGGTGTAAGCTGGCTCATTTAGCTGAGGCGATTCACGTTGAGTTTGAG 
TACAGAGGATTTGTGGCTAACACTTTAGCTGATCTTGATCCTTCGATGCTTGAGCTTAGA 
CCAAGTGAGATTGAATCTGTTGCGGTTAACTCTGTTTTCGAGCTTCACAAGCTCTTGGGA 
CGACCTGGTGCGATCGATAAGGTTCTTGGTGTGGTGAATCAGATTAAACCGGAGATTTTC 
ACTGTGGTTGAGCAGGAATCGAACCATAATAGTCCGATTTTCTTAGATCGGTTTACTGAG 
TCGTTGCATTATTACTCGACGTTGTTTGACTCGTTGGAAGGTGTACCGAGTGGTCAAGAC 
AAGGTCATGTCGGAGGTTTACTTGGGTAAACAGATCTGCAACGTTGTGGCTTGTGATGGA 
CCTGACCGAGTTGAGCGTCATGAAACGTTGAGTCAGTGGAGGAACCGGTTCGGGTCTGCT 
GGGTTTGCGGCTGCACATATTGGTTCGAATGCGTTTAAGCAAGCGAGTATGCTTTTGGCT 
CTGTTCAACGGCGGTGAGGGTTATCGGGTGGAGGAGAGTGACGGCTGTCTCATGTTGGGT 
TGGCACACACGACCGCTCATAGCCACCTCGGCTTGGAAACTCTCCACCAATTAGATGGTG 
GCTCAATGAATTGATCTGTTGAACCGGTTATGATGATAGATTTCCGACCGAAGCCAAACT 
AAATCCTACTGTTTTTCCCTTTGTCACTTGTTAAGATCTTATCTTTCATTATATTAGGTA 
ATTGAAAAATTTTAATCTCGCCTAAATTACT 

>G308 Amino Acid Sequence (domain in AA coordinates: 270-274) 
MKRDHHHHHQDKKTMMMNEEDDGNGMDELLAVLGYKVRSSEMADVAQKLEQLEVMMSNVQ

EDDLSQLATETVHYNPAELYTWLDSMLTDLNPPSSNAEYDLKAIPGDAILNQFAIDSASS

SNQGGGGDTYTTNKKLKCSNGVVETTTATAESTRHVVLVDSQENGVRLVHALLACAEAVQ

KENLTVAEALVKQIGFLAVSQIGAMRQVATYFAEALARRIYRLSPSQSPIDHSLSDTLQM

HFYETCPYLKFAHFTANQAILEAFQGKKRVHVIDFSMSQGLQWPALMQALALRPGGPPVF

RLTGIGPPAPDNFDYLHEVGCKLAHLAEAIHVEFEYRGFVANTLADLDPSMLELRPSEIE

SVAVNSVFELHKLLGRPGAIDKVLGVVNQIKPEIFTVVEQESNHNSPIFLDRFTESLHYY
STLFDSLEGVPSGQDKVMSEVYLGKQICNVVACDGPDRVERHETLSQWRNRFGSAGFAAA
HIGSNAFKQASMLLALFNGGEGYRVEESDGCLMLGWHTRPLIATSAWKLSTN*
>G38 (149.. 1156) 

GAGGAAT^CTCGAAAT^GCTACACACAAGAAGAAGAAGAAAAGATACGAGCAAGAAGACT 

AAACACGAAAGCGATTTATCAACTCGAAGGAAGAGACTTTGATTTTCAAATTTC^ 

TATAGATTGTGTTGTTTCTGGGAAGGAGATGGCAGTTTATGATCAGAGTGGAGATAGAAA 
CAGAACACAAATTGATACATCGAGGAAAAGGAAATCTAGAAGTAGAGGTGACGGTACTAC

TGTGGCTGAGAGATTAAAGAGATGGAAAGAGTATAACGAGACCGTAGAAGAAGTTTCTAC 

CAAGAAGAGGAAAGTACCTGCGAAAGGGTCGAAGAAGGGTTGTATGAAAGGTAAAGGAGG 

ACCAGAGAATAGCCGATGTAGTTTCAGAGGAGTTAGGCAAAGGATTTGGGGTAAATGGGT 

TGCTGAGATCAGAGAGCCTAATCGAGGTAGCAGGCTTTGGCTTGGTACTTTCCCTACTGC 

TCAAGAAGCTGCTTCTGCTTATGATGAGGCTGCTAAAGCTATGTATGGTCCTTTGGCTCG 

TCTTAATTTCCCTCGGTCTGATGCGTCTGAGGTTACGAGTACCTCAAGTCAGTCTGAGGT 

GTGTACTGTTGAGACTCCTGGTTGTGTTCATGTGAAAACAGAGGATCCAGATTGTGAATC 

TAAACCCTTCTCCGGTGGAGTGGAGCCGATGTATTGTCTGGAGAATGGTGCGGAAGAGAT 

GAAGAGAGGTGTTAAAGCGGATAAGCATTGGCTGAGCGAGTTTGAACATAACTATTGGAG 

TGATATTCTGAAAGAGAAAGAGAAACAGAAGGAGCAAGGGATTGTAGAAACCTGTCAGCA 

ACAACAGCAGGATTCGCTATCTGTTGCAGACTATGGTTGGCCCAATGATGTGGATCAGAG 

TCACTTGGATTCTTCAGACATGTTTGATGTCGATGAGCTTCTACGTGACCTAAATGGCGA 

CGATGTGTTTGCAGGCTTAAATCAGGACCGGTACCCGGGGAACAGTGTTGCCAACGGTTC 

ATACAGGCCCGAGAGTCAACAAAGTGGTTTTGATCCGCTACAAAGCCTCAACTACGGAAT 

ACCTCCGTTTCAGCTCGAGGGAAAGGATGGTAATGGATTCTTCGACGACTTGAGTTACTT 

GGATCTGGAGAACTAAACAAAACAATATGAAGCTTTTTGGATTTGATATTTGCCTTAATC 

CCACAACGACTGTTGATTCTCTATCCGAGTTTTAGTGATATAGAGAACTACAGAACACGT 

TTTTTCTTGTTATAAAGGTGAACTGTATATATCGAAACAGTGATATGACAATAGAGAAGA 

CAACTATAGTTTGTTAGTCTGCTTCTCTTAAGTTGTTCTTTAGATATGTTTTATGTTTTG 

TAACAACAGGAATGAATAATACACACTTGTGAAGCTTTTAAAAAAAAAAAAAAAAAAAAA 

>G38 Amino Acid Sequence (domain in AA coordinates: 76-143) 

MAVYDQSGDRNRTQIDTSRKRKSRSRGDGTTVAERLKRWKEYNETVEEVSTKKRKVPAKG

SKKGCMKGKGGPENSRCSFRGVRQRIWGKWVAEIREPNRGSRLWLGTFPTAQEAASAYDE 

AAKAMYGPLARLNFPRSDASEVTSTSSQSEVCTVETPGCVHVKTEDPDCESKPFSGGVEP

MYCLENGAEEMKRGVKADKHWLSEFEHNYWSDILKEKEKQKEQGIVETCQQQQQDSLSVA

DYGWPNDVDQSHLDSSDMFDVDELLRDLNGDDVFAGLNQDRYPGNSVANGSYRPESQQSG 

FDPLQSLNYGIPPFQLEGKDGNGFFDDLSYLDLEN* 

>G43 (38..643)

CTCCTGTCOTGTCTAAAGAAAAAAGAGAGAGGAAGAAATGGAGACTTTTGAGGAAAGCTC 

TGATTTGGATGTTATACAGAAACATCTATTTGAAGACTTGATGATCCCTGATGGTTTCAT 

TGAAGATTTTGTCTTTGATGATACTGCTTTTGTCTCCGGACTCTGGTCTCTAGAACCCTT 

TAACCCAGTTCCGAAACTGGAACCTAGTTCACCTGTTCTTGATCCAGATTCCTATGTCCA 

AGAGATTCTGCAAATGGAAGCAGAATCATCATCATCATCATCAACAACAACGTCACCTGA 

GGTTGAGACTGTCTCAAACCGGAAAAAAACAAAGAGGTTTGAAGAAACGAGACATTACAG 

AGGCGTGAGAAGGAGGCCATGGGGGAAATTTGCAGCAGAGATTCGAGATCCGGCAAAGAA 

AGGATCCAGGATTTGGTTAGGCACTTTTGAGAGTGATATTGATGCTGCAAGGGCTTACGA 

CTATGCAGCTTTTAAGCTCAGGGGAAGAAAAGCTGTTCTCAACTTTCCTTTGGATGCCGG 

AAAGTATGATGCTCCGGTCAATTCATGCCGAAAAAGGAGGAGAACCGATGTACCACAGCC 

TCAAGGAACAACAACAAGTACTTCATCATCGTCATCAAACTAATGGGGGAATAGTGATGT 

TTAATTAGTATATATAGGTTAATATCTTAAGTATGTGAAGCATCATGTATAGAGCCAAGA 

ACCTGTTAGACTAGTGTACTGAAAAGAACTCTTGCAAAATATGTACTAAAGAGTTCCTGT 

AACAATGGAACTTCTGCGTTTTCTCTTGTCTTAAAGAGCTTAAGGTTCTAGAAACAAAGT 

TCTTGTCCTTTCGGTTTAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 

AAAAAAAAA

>G43 Amino Acid Sequence (domain in AA coordinates: 104-172) 

METFEESSDLDVIQKHLFEDLMIPDGFIEDFVFDDTAFVSGLWSLEPFNPVPKLEPSSPV

LDPDSYVQEILQMEAESSSSSSTTTSPEVETVSNRKKTKRFEETRHYRGVRRRPWGKFAA

EIRDPAKKGSRIWLGTFESDIDAARAYDYAAFKLRGRKAVLNFPLDAGKYDAPVNSCRKR

RRTDVPQPQGTTTSTSSSSSN* 

>G536 (1..768) 

ATGTCGACAAGGGAAGAGAATGTTTACATGGCGAAATTAGCCGAACAAGCTGAACGTTAC 
GAAGAAATGGTTGAATTCATGGAGAAAGTTGCGAAAACTGTTGATGTTGAGGAACTTTCA 
GTTGAAGAGAGGAATCTTCTCTCTGTTGCTTACAAGAACGTGATTGGAGCGAGAAGAGCT 
TCGTGGAGAATCATTTCTTCGATTGAGCAGAAAGAAGAGAGCAAAGGGAACGAAGATCAT 
GTTGCTATTATCAAGGATTACAGAGGAGAGATTGAATCCGAGCTTAGCAAAATCTGTGAT 
GGGATTTTGAATGTTCTTGAAGCTCATCTTATTCCTTCTGCTTCACCAGCTGAATCTAAA 






GTGTTTTATCTTAAGATGAAGGGTGATTATCATAGGTATCTTGCTGAGTTTAAGGCTGGT

GCTGAAAGGAAAGAAGCTGCTGAAAGCACTTTGGTTGCTTACAAGTCTGCTTCCGACATT 

GCCACTGCTGAGTTAGCTCCTACTCACCCGATAAGGCTTGGTCTTGCACTCAACTTCTCT 

GTGTTTTACTATGAAATCCTCAACTCGCCTGATCGTGCTTGCAGCCTCGCAAAGCAGGCG 

TTTGATGATGCAATCGCTGAGTTAGATACATTGGGTGAGGAATCATACAAGGACAGTACA 

CTGATTATGCAGCTTCTTAGAGACAATCTCACTCTCTGGACTTCAGATATGACTGACGAA 

GCAGGAGATGAGATTAAGGAGGCATCAAAGCCCGATGGTGCCGAGTAA 

>G536 Amino Acid Sequence (domain in AA coordinates: 226-233)

MSTREENVYMAKLAEQAERYEEMVEFMEKVAKTVDVEELSVEERNLLSVAYKNVIGARRA

SWRIISSIEQKEESKGNEDHVAIIKDYRGEIESELSKICDGILNVLEAHLIPSASPAESK 

VFYLKMKGDYHRYLAEFKAGAERKEAAESTLVAYKSASDIATAELAPTHPIRLGLALNFS 

VFYYEILNSPDRACSLAKQAFDDAIAELDTLGEESYKDSTLIMQLLRDNLTLWTSDMTDE 

AGDEIKEASKPDGAE*

>G567 (38.. 1273) 

AAAAAGAAGAATCAGAAAGTGAAAAAGAGAGCGAGCGATGAACAGTATCTTCTCCATTGA 
CGATTTCTCCGATCCTTTCTGGGAAACTCCTCCGATTCCTCTCAATCCCGACTCTTCTAA 
GCCTGTTACGGCGGATGAAGTTAGCCAGAGTCAACCGGAATGGACTTTCGAGATGTTTCT 
CGAAGAGATTTCTTCGTCGGCGGTGAGCTCTGAGCCACTTGGTAACAACAACAACGCGAT 
CGTCGGTGTTTCTTCGGCGCAATCTCTTCCTTCTGTTTCCGGACAGAATGATTTCGAGGA 
TGATAGTCGATTTCGTGATCGCGATTCGGGAJ\ATTTGGATTGTGCTGCTCCCATGACGAC 
GAAGACGGTGAATGTTGATTCCGATGATTATCGTCGTGTTCTTAAGAACAAGCTTGAGGC 
TGAGTGCGCGACTGGTGTTTCTCTTCGGGTTGGGTCTGTGAAGCCTGAAGATTCGACTAG 
TTCTCCAGAAACTCAACTTCAACCAGTTCAATCCAGTCCTCTTACTCAAGGAGAACTTGG 
TGTTACTTCTTCCTTACCAGCTGAGGTGAAAAAAACTGGTGTATCAATGAAGCAGGTTAC
TAGTGGATCGTCGAGAGAATATTCTGATGACGAGGACCTTGATGAAGAGAATGAAACCAC 
CGGTTCCTTGAAGCCAGAGGACGTTAAAAAATCTAGAAGGATGCTGTCAAATCGTGAGTC 
AGCTAGGCGATCTAGAAGGAGAAAGCAGGAGCAAACAAGTGACCTCGAAACACAGGTTAA 
TGATCTAAAAGGTGAGCATTCATCACTTCTTAAACAACTGAGCAACATGAATCACAAGTA 
TGACGAGGCTGCTGTTGGCAATAGAATACTAAAGGCTGACATTGAGACATTAAGAGCTAA 
GGTGAAAATGGCGGAAGAAACCGTGAAGAGAGTAACAGGAATGAATCCGATGCTTCTCGG 
AAGATCAAGTGGACATAACAACAACAACAGAATGCCAATAACTGGTAACAACAGGATGGA 
TTCTTCTAGCATTATTCCAGCTTATCAACCACACTCAAACCTAAACCATATGTCAAACCA 
AAACATCGGGATCCCAACCATTCTACCTCCAAGACTCGGAAACAATTTCGCTGCTCCTCC 
ATCCCAAACCAGCTCTCCCTTGCAGAGAATTAGAAATGGGCAAAATCACCATGTTACTCC 
AAGCGCCAACCCGTATGGCTGGAATACCGAACCTCAGAACGATTCAGCATGGCCGAAAAA 
ATGCGTGGACTGATCAAACAAGAAGCGGGTTTCGCACTATATTAATGTCTATGCATCTGT 
AATTTGTAAGTGTTATTAAGTTACGAATCATGAGAAAACATCTTGTGAAAATACAGTCTC 
ATGGCTTATATATATATATAAGCTCTGTCTTATAACATTACAAGATTCTTATTTGAGAAT 
CGTCTTTCTATTTATAGCTAATAAAAAAAAAAAAAAAAA 

>G567 Amino Acid Sequence (domain in AA coordinates: 210-270)

MNSIFSIDDFSDPFWETPPIPLNPDSSKPVTADEVSQSQPEWTFEMFLEEISSSAVSSEP

LGNNNNAIVGVSSAQSLPSVSGQNDFEDDSRFRDRDSGNLDCAAPMTTKTVNVDSDDYRR

VLKNKLEAECATGVSLRVGSVKPEDSTSSPETQLQPVQSSPLTQGELGVTSSLPAEVKKT
GVSMKQVTSGSSREYSDDEDLDEENETTGSLKPEDVKKSRRMLSNRESARRSRRRKQEQT
SDLETQVNDLKGEHSSLLKQLSNMNHKYDEAAVGNRILKADIETLRAKVKMAEETVKRVT
GMNPMLLGRSSGHNNNNRMPITGNNRMDSSSIIPAYQPHSNLNHMSNQNIGIPTILPPRL

GNNFAAPPSQTSSPLQRIRNGQNHHVTPSANPYGWNTEPQNDSAWPKKCVD* 
>G680 (338..2275)

CAGrTATCTTCTTCCTTCTTCTCTCTGTTTTTTAAATTTATTTTTAGAGAAT^ 
TTTTGCTTCCGATTTGATTATTTCC 

ATGATAAGTCAGATTGCATACTTGTCTCCTCCATGGCTACTCTCAAGGGTTTTGGCTGCG 
GTGGATTCGTTTGGTTTCTCTAGAATCTAAAGAGGTT^^ 

AAACTTTCATGTTTGGGGAGATCAAAGATGGTTTCTTTTTTATACTTTACTTG 

GGATTTGAAGCAGCGAATAGCTGCAACCGGTCCTGTTATGGATACTAATACATCTGGAGA 

AGAATTATTAGCTAAGGCAAGAAAGCCATATACAATAACAAAGCAGCGAGAGCGATGGAC 

TGAGGATGAGCATGAGAGGTTTCTAGAAGCCTTGAGGCTTTATGGAAGAGCTTGGCAACG 

AATTGAAGAACATATTGGGACAAAGACTGCTGTTCAGATCAGAAGTCATGCACAAAAGTT
CTTCACAAAGTTGGAGAAAGAGGCTGAAGTTAAAGGCATCCCTGTTTGCCAAGCTTTGGA

CATAGAAATTCCGCCTCCTCGTCCTAAACGAAAACCCAATACTCCTTATCCTCGAAAACC 

TGGGAACAACGGTACATCTTCCTCTCAAGTATCATCAGCAAAAGATGCAAAACTTGTTTC 

ATCGGCCTCTTCTTCACAGTTGAATCAGGCGTTCTTGGATTTGGAAAAAATGCCGTTCTC 

TGAGAAAACATCAACTGGAAAAGAAAATCAAGATGAGAATTGCTCGGGTGTTTCTACTGT 

GAACAAGTATCCCTTACCAACGAAACAGGTAAGTGGCGACATTGAAACAAGTAAGACCTC 

AACTGTGGACAACGCGGTTCAAGATGTTCCCAAGAAGAACAAAGACAAAGATGGTAACGA 

TGGTACTACTGTGCACAGCATGCAAAACTACCCTTGGCATTTCCACGCAGATATTGTGAA 

CGGGAATATAGCAAAATGCCCTCAAAATCATCCCTCAGGTATGGTATCTCAAGACTTCAT 

GTTTCATCCTATGAGAGAAGAAACTCACGGGCACGCAAATCTTCAAGCTACAACAGCATC 

TGCTACTACTACAGCTTCTCATCAAGCGTTTCCAGCTTGTCATTCACAGGATGATTACCG 

TTCGTTTCTCCAGATATCATCTACTTTCTCCAATCTTATTATGTCAACTCTCCTACAGAA 

TCCTGCAGCTCATGCTGCAGCTACATTCGCTGCTTCGGTCTGGCCTTATGCGAGTGTCGG 

GAATTCTGGTGATTCATCAACCCCAATGAGCTCTTCTCCTCCAAGTATAACTGCCATTGC

CGCTGCTACAGTAGCTGCTGCAACTGCTTGGTGGGCTTCTCATGGACTTCTTCCTGTATG 

CGCTCCAGCTCCAATAACATGTGTTCCATTCTCAACTGTTGCAGTTCCAACTCCAGCAAT 

GACTGAAATGGATACCGTTGAAAATACTCAACCGTTTGAGAAACAAAACACAGCTCTGCA 

AGATCAAACCTTGGCTTCGAAATCTCCAGCTTCATCATCTGATGATTCAGATGAGACTGG 

AGTAACCAAGCTAAATGCCGACTCAAAAACCAATGATGATAAAATTGAGGAGGTTGTTGT 

TACTGCCGCTGTGCATGACTCAAACACTGCCCAGAAGAAAAATCTTGTGGACCGCTCATC 

GTGTGGCTCAAATACACCTTCAGGGAGTGACGCAGAAACTGATGCATTAGATAAAATGGA 

GAAAGATAAAGAGGATGTGAAGGAGACAGATGAGAATCAGCCAGATGTTATTGAGTTAAA 

TAACCGTAAGATTAAAATGAGAGACAACAACAGCAACAACAATGCAACTACTGATTCGTG 

GAAGGAAGTCTCCGAAGAGGGTCGTATAGCGTTTCAGGCTCTCTTTGCAAGAGAAAGATT 

GCCTCAAAGCTTTTCGCCTCCTCAAGTGGCAGAGAATGTGAATAGAAAACAAAGTGACAC 

GTCAATGCCATTGGCTCCTAATTTCAAAAGCCAGGATTCTTGTGCTGCAGACCAAGAAGG 

AGTAGTAATGATCGGTGTTGGAACATGCAAGAGTCTTAAAACGAGACAGACAGGATTTAA 

GCCATACAAGAGATGTTCAATGGAAGTGAAAGAGAGCCAAGTTGGGAACATAAACAATCA 

AAGTGATGAAAAAGTCTGCAAAAGGCTTCGATTGGAAGGAGAAGCTTCTACATGACAGAC 

TTGGAGGTAAAAAAAAAACATCCACATTTTTATCAATATCTTTAAATCTAGTGTTAGTAG 

TTTGCTTCTCCAATCTTTATGAAAGAGACTTTTAATTTTCCTTCCGAACATTTCTTTGGT

CATGTCAGGTTCTGTACCATATTACCCCATGTCTTGTCTCTTGTCTCTGTTTGTGTATGC 

^^^ TGGTCTATA TGTC^TCTGCTACTACTGTTAATTAACCATTAAGCAATGGATTTG 

>G680 Amino Acid Sequence (domain in AA coordinates: 24-70) 

MDTNTSGEELLAKARKPYTITKQRERWTEDEHERFLEALRLYGRAWQRIEEHIGTKTAVQ 

IRSHAQKFFTKLEKEAEVKGIPVCQALDIEIPPPRPKRKPNTPYPRKPGNNGTSSSQVSS 

AKDAKLVSSASSSQLNQAFLDLEKMPFSEKTSTGKENQDENCSGVSTVNKYPLPTKQVSG 

DIETSKTSTVDNAVQDVPKKNKDKDGNDGTTVHSMQNYPWHFHADIVNGNIAKCPQNHPS

GMVSQDFMFHPMREETHGHANLQATTASATTTASHQAFPACHSQDDYRSFLQISSTFSNL

IMSTLLQNPAAHAAATFAASVWPYASVGNSGDSSTPMSSSPPSITAIAAATVAAATAWWA 

SHGLLPVCAPAPITCVPFSTVAVPTPAMTEMDTVENTQPFEKQNTALQDQTLASKSPASS 

SDDSDETGVTKLNADSKTNDDKIEEVVVTAAVHDSNTAQKKNLVDRSSCGSNTPSGSDAE

TDALDKMEKDKEDVKETDENQPDVIELNNRKIKMRDNNSNNNATTDSWKEVSEEGRIAFQ 

ALFARERLPQSFSPPQVAENVNRKQSDTSMPLAPNFKSQDSCAADQEGVVMIGVGTCKSL

KTRQTGFKPYKRCSMEVKESQVGNINNQSDEKVCKRLRLEGEAST* 

>G867 (64.. 1098) 

CACAACACAAACACATTTCTGTTTTCTCCATTGTTTCAAACCmTAAAAAAAAACACAGAT 
TAAATGGAATCGAGTAGCGTTGATGAGAGTACTACAAGTACAGGTTCCATCTGTGAAACC 
CCGGCGATAACTCCGGCGAAAAAGTCGTCGGTAGGTAACTTATACAGGATGGGAAGCGGA 
TCAAGCGTTGTGTTAGATTCAGAGAACGGCGTAGAAGCTGAATCTAGGAAGCTTCCGTCG 
TCAAAATACAAAGGTGTGGTGCCACAACCAAACGGAAGATGGGGAGCTCAGATTTACGAG 
AAACACCAGCGCGTGTGGCTCGGGACATTCAACGAAGAAGACGAAGCCGCTCGTGCCTAC 
GACGTCGCGGTTCACAGGTTCCGTCGCCGTGACGCCGTCACAAATTTCAAAGACGTGAAG 
ATGGACGAAGACGAGGTCGATTTCTTGAATTCTCATTCGAAATCTGAGATCGTTGATATG 
TTGAGGAAACATACTTATAACGAAGAGTTAGAGCAGAGTAAACGGCGTCGTAATGGTAAC 
GGAAACATGACTAGGACGTTGTTAACGTCGGGGTTGAGTAATGATGGTGTTTCTACGACG 
GGGTTTAGATCGGCGGAGGCACTGTTTGAGAAAGCGGTAACGCCAAGCGACGTTGGGAAG
CTAAACCGTTTGGTTATACCGAAACATCACGCAGAGAAACATTTTCCGTTACCGTCAAGT 
AACGTTTCCGTGAAAGGAGTGTTGTTGAACTTTGAGGACGTTAACGGGAAAGTGTGGAGG 
TTCCGTTACTCGTATTGGAACAGTAGTCAGAGTTATGTTTTGACTAAAGGTTGGAGCAGG 
TTCGTTAAGGAGAAGAATCTACGTGCTGGTGACGTGGTTAGTTTCAGTAGATCTAACGGT 
CAGGATCAACAGTTGTACATTGGGTGGAAGTCGAGATCCGGGTCAGATTTAGATGCGGGT 
CGGGTTTTGAGATTGTTCGGAGTTAACATTTCACCGGAGAGTTCAAGAAACGACGTCGTA 
GGAAACAAAAGAGTGAACGATACTGAGATGTTATCGTTGGTGTGTAGCAAGAAGCAACGC 
ATCTTTCACGCCTCGTAACAACTCTTCTTCTTTTTTTTTCTTTTGTTGTTTTAATAATTT 
TTAT^AAACTCCATTTTCGTTTTCTTTATTTGCATCGGTTTCTTTCTTCTTGTTTACCAAA 
GGTTCATGAGTTGTTTTTGTTGTATTGATGAACTGTAAATTTTATTTATAGGATAAATTT 

TAAAAAAAAAAAAAAAAAAAA 

>G867 Amino Acid Sequence (domain in AA coordinates: 59-124) 

MESSSVDESTTSTGSICETPAITPAKKSSVGNLYRMGSGSSVVLDSENGVEAESRKLPSS 

KYKGVVPQPNGRWGAQIYEKHQRVWLGTFNEEDEAARAYDVAVHRFRRRDAVTNFKDVKM

DEDEVDFLNSHSKSEIVDMLRKHTYNEELEQSKRRRNGNGNMTRTLLTSGLSNDGVSTTG 

FRSAEALFEKAVTPSDVGKLNRLVIPKHHAEKHFPLPSSNVSVKGVLLNFEDVNGKVWRF

RYSYWNSSQSYVLTKGWSRFVKEKNLRAGDVVSFSRSNGQDQQLYIGWKSRSGSDLDAGR

VLRLFGVNISPESSRNDVVGNKRVNDTEMLSLVCSKKQRIFHAS*

>G956 (1..840) 

ATGGAGGAGACAGAAAAGAATAAGGGCAGCATAAGTATGGTTGAGGCTAATCTACCTCCT 

GGTTTTAGATTCCATCCTAGAGACGACGAGCTCGTCTGTGACTACTTAATGAGAAGAACC 

GTTCGCAGCCTCTATCAACCAGTTGTCTTGATCGACGTCGATCTTAACAAATGCGAGCCT 

TGGGACATTCCTCAAACGGCGAGAGTGGGAGGGAAAGAATGGTACTTTTACAGCCAAAAA 

GACCGTAAATACGCAACAGGCTACAGAACAAACCGGGCTACGGCCACCGGTTATTGGAAA 

GCCACCGGGAAAGATAGAGCAATCCAAAGAAACGGTGGTCTTGTGGGTATGAGAAAGACA 

CTTGTGTTTTACCGAGGTCGATCCCCTAAAGGTCGTAAAACTGATTGGGTCATGCATGAG 

TTTCGTCTCCAAGGAAAACTTCTTCACCACTCCCCTAATTCTCTCGAGGAAGAGTGGGTA 

TTGTGTAGAGTTTTCCACAAGAACAGCAACGGAGCTGATATAGACGACATCACAAGGAGC 

TGCTCTGATGCAACAGCTTCTGCATTCATGGACTCTTACATCAACTTCGACCATCATCAC 

ATCATCAATCAGCATGTACCCTGCTTCTCCAATAATTTGTCACATAACCAAACCAACCAA 

TCCGGTTTAATCTCCAAGAACTCCAGCCCATTGTTTAATGCTTCCCCTGATCAAATGATT 

CTCAGAACTTTGCTAAGTCAACTCACAAAAAAAGTCGAAGAATCACAGAGTCGTGGAGAC 

GGAAGCTCAGAGAGCCAATTGACCGACATTGGCATCCCAAGCCATGCATGGAATTACTGA 

>G956 Amino Acid Sequence (domain in AA coordinates: TBD) 

MEETEKNKGSISMVEANLPPGFRFHPRDDELVCDYLMRRTVRSLYQPVVLIDVDLNKCEP

WDIPQTARVGGKEWYFYSQKDRKYATGYRTNRATATGYWKATGKDRAIQRNGGLVGMRKT

LVFYRGRSPKGRKTDWVMHEFRLQGKLLHHSPNSLEEEWVLCRVFHKNSNGADIDDITRS

CSDATASAFMDSYINFDHHHIINQHVPCFSNNLSHNQTNQSGLISKNSSPLFNASPDQMI

LRTLLSQLTKKVEESQSRGDGSSESQLTDIGIPSHAWNY*

>G996 (53.. 1063) 

CGATCGATCTTGAATTGATTCTTTGTAGTATTTTATTTACATATATATATAGATGGGAAG 

ACATTCATGTTGTTACAAACAGAAACTGAGGAAAGGACTTTGGTCTCCTGAAGAA 

GAAGCTTCTTCGTTACATCACTAAGTATGGTCATGGTTGCTGGAGCTCTGTCCCTAAACA 

AGCTGGTTTACAGAGATGTGGAAAAAGTTGTAGATTAAGATGGATAAATTATTTAAGACC 

AGATTTGAAGAGAGGAGCATTTTCTCAAGATGAAGAAAATCTCATTATTGAACTTCATGC 

CGTTCTTGGCAATAGATGGTCTCAGATAGCTGCACAGCTTCCTGGAAGAACCGACAATGA 

AATCAAGAATCTTTGGAATTCTTGTTTGAAGAAGAAATTGAGGCTGAGAGGAATTGACCC 

GGTTACACACAAGCTCTTAACCGAAATCGAAACCGGTACAGATGACAAAACAAAACCGGT 

TGAGAAGAGTCAACAGACCTACCTCGTTGAGACTGATGGCTCCTCTAGTACCACTACTTG 

TAGTACTAACCAAAACAACAACACTGATCATCTTTATACCGGAAATTTCGGTTTTCAACG

GTTAAGTCTAGAAAACGGTTCAAGAATCGCAGCCGGTTCTGACCTCGGTATCTGGATTCC 

CCAAACCGGAAGAAACCATCATCATCATGTCGATGAAACCATCCCTAGTGCAGTGGTACT 

ACCCGGTTCAATGTTCTCATCCGGTTTAACCGGTTATAGATCCTCCAATCTCGGTTTAAT 

TGAATTGGAAAACTCATTCTCAACCGGGCCAATGATGACAGAGCATCAGCAAATTCAAGA 

GAGTAACTACAACAATTCAACATTCTTTGGAAATGGGAATCTGAATTGGGGATTAACAAT 

GGAGGAAAATCAAAATCCATTCACAATATCGAATCATTCAAATTCGTCCTTATACAGTGA
TATAAAATCAGAGACCAATTTTTTTGGCACAGAGGCTACAAATGTTGGTATGTGGCCATG
TAACCAGCTTCAGCCTCAGCAACATGCATATGGCCATATATAAATCTTCTTGTATATTAT 
AA 

>G996 Amino Acid Sequence (domain in AA coordinates: 14-114) 
MGRHSCCYKQKLRKGLWSPEEDEKLLRYITKYGHGCWSSVPKQAGLQRCGKSCRLRWINY
LRPDLKRGAFSQDEENLIIELHAVLGNRWSQIAAQLPGRTDNEIKNLWNSCLKKKLRLRG
IDPVTHKLLTEIETGTDDKTKPVEKSQQTYLVETDGSSSTTTCSTNQNNNTDHLYTGNFG
FQRLSLENGSRIAAGSDLGIWIPQTGRNHHHHVDETIPSAVVLPGSMFSSGLTGYRSSNL
GLIELENSFSTGPMMTEHQQIQESNYNNSTFFGNGNLNWGLTMEENQNPFTISNHSNSSL
YSDIKSETNFFGTEATNVGMWPCNQLQPQQHAYGHI*
>G1946 (90.. 1547) 

TCTCACCTATTGTAAAAATCACCAGTTTCGTATATAAAACCCTAATTTTCTCAAAATTCC 

CAAATATTGACTTGGAATCAAAAATCCGAATGGATGTGAGCAAAGTAACCACAAGCGACG 

GCGGAGGAGATTCAATGGAGACTAAGCCATCTCCTCAACCTCAGCCTGCGGCGATTCTAA 

GTTCAAACGCGCCTCCTCCGTTTCTGAGCAAGACCTATGATATGGTTGATGATCACAATA 

CAGATTCGATTGTCTCTTGGAGTGCTAATAACAACAGTTTTATCGTTTGGAAACCACCGG 

AGTTCGCTCGCGATCTTCTTCCTAAGAACTTTAAGCATAATAATTTCTCCAGCTTCGTTA 

GACAGCTTAATACCTATGGTTTCAGGAAGGTTGACCCAGATAGATGGGAATTTGCGAATG 

AAGGTTTTTTAAGAGGTCAGAAGCACTTGCTACAATCAATAACTAGGCGAAAACCTGCCC 

ATGGACAGGGACAGGGACATCAGCGATCTCAGCACTCGAATGGACAGAACTCATCTGTTA 

GCGCATGTGTTGAAGTTGGCAAATTTGGTCTCGAAGAAGAAGTTGAAAGGCTTAAAAGAG 

ATAAGAACGTCCTTATGCAAGAACTCGTCAGATTAAGACAGCAGCAACAGTCCACTGATA 

ACCAACTTCAAACGATGGTTCAGCGTCTCCAGGGCATGGAGAATCGGCAACAACAATTAA 

TGTCATTCCTTGCAAAGGCAGTACAAAGCCCTCATTTTCTATCTCAATTCTTACAGCAGC 

AGAATCAGCAAAACGAGAGTAATAGGCGCATCAGTGATACCAGTAAGAAGCGGAGATTCA 

AGCGAGACGGCATTGTCCGTAATAATGATTCTGCTACTCCTGATGGACAGATAGTGAAGT 

ATCAACCTCCAATGCACGAGCAAGCCAAAGCAATGTTTAAACAGCTTATGAAGATGGAAC 

CTTACAAAACCGGCGATGATGGTTTCCTTCTAGGTAATGGTACGTCTACTACCGAGGGAA 

CAGAGATGGAGACTTCATCAAACCAAGTATCGGGTATAACTCTTAAGGAAATGCCTACAG 

CTTCTGAGATACAGTCATCATCACCAATTGAAACAACTCCTGAAAATGTTTCGGCAGCAT 

CAGAAGCAACCGAGAACTGTATTCCTTCACCTGATGATCTAACTCTTCCCGACTTCACTC 

ATATGCTACCGGAAAATAATTCAGAGAAGCCTCCAGAGAGTTTCATGGAACCAAACCTGG 

GAGGTTCTAGTCCATTACTAGATCCAGATCTGTTGATCGATGATTCTTTGTCCTTCGACA 

TTGACGACTTTCCAATGGATTCTGATATAGACCCTGTTGATTACGGTTTACTCGAACGCT 

TACTCATGTCAAGCCCGGTTCCAGATAATATGGATTCAACACCAGTGGACAATGAAACAG 

AGCAGGAACAAAATGGATGGGACAAAACTAAGCATATGGATAATCTGACTCAACAGATGG 

GTCTCCTCTCTCCTGAAACCTTAGATCTCTCAAGGCAAAATCCTTGATTTTGGGAGTTTT 

TAAAGTCTTTTGAGGTAACACAGTCCCTGAGAGCAGCATATTCAT 

>G1946 Amino Acid Sequence (domain in AA coordinates: 32-130) 

MDVSKVTTSDGGGDSMETKPSPQPQPAAILSSNAPPPFLSKTYDMVDDHNTDSIVSWSAN 

NNSFIVWKPPEFARDLLPKNFKHNNFSSFVRQLNTYGFRKVDPDRWEFANEGFLRGQKHL

LQSITRRKPAHGQGQGHQRSQHSNGQNSSVSACVEVGKFGLEEEVERLKRDKNVLMQELV
RLRQQQQSTDNQLQTMVQRLQGMENRQQQLMSFLAKAVQSPHFLSQFLQQQNQQNESNRR
ISDTSKKRRFKRDGIVRNNDSATPDGQIVKYQPPMHEQAKAMFKQLMKMEPYKTGDDGFL
LGNGTSTTEGTEMETSSNQVSGITLKEMPTASEIQSSSPIETTPENVSAASEATENCIPS
PDDLTLPDFTHMLPENNSEKPPESFMEPNLGGSSPLLDPDLLIDDSLSFDIDDFPMDSDI
DPVDYGLLERLLMSSPVPDNMDSTPVDNETEQEQNGWDKTKHMDNLTQQMGLLSPETLDL
SRQNP*
>G217 (84.. 2618) 

cttcgttcttaccga^ttccacgagcattagcttcagagaccttgaattggagtgcggtt 
ggatcaaaaacagttgagcgaagatgaggattatgattaagggaggtgtttggaagaaca 
ccgaagatgagattctcaaagccgccgtgatgaagtatggtaagaaccaatgggctcgga 
tctcgtctcttctcgttcgtaagtctgctaaacagtgtaaagctcgctggtacgagtggc 
tcgatccatctatcaaaaagactgaatggaccagagaagaagatgagaagcttctacatc 
ttgctaaacttctgcctactcaatggagaactattgctcctattgtgggtcgtacaccat 
ctcaatgtcttgagaggtatgagaagctccttgatgcagcatgcactaaggatgaaaatt 
atgatgcagcggatgatccacgaaaattacgtcctggtgagattgatccgaacccagaag 
caaagcctgctcgtcctgatccggtagacatggacgaagatgagaaagaaatgctttctg

aagcaagagctagattggctaacacgaggggaaagaaggctaaaagaaaagctagagaaa 

aacaacttgaggaagctagaaggcttgcttctctgcaaaaaagaagagaactaaaagcag 

ctgggattgatggaaggcataggaaaagaaagagaaagggaatcgactataatgcagaaa 

ttccttttgaaaagagggcacctgcgggattttatgatactgcggatgaagatcgtcctg 

ctgatcaagtaaaatttccaactaccattgaagaacttgaaggaaaaagaagagctgatg 

tagaagcacatttacgcaaacaagatgttgcaaggaataaaattgctcagagacaggatg 

ctccagcagctatattgcaagcaaacaagctgaatgatccggaagttgttaggaagaggt 

caaagctgatgttaccaccaccgcagatttcagaccacgagctagaagaaattgctaaga 

tgggctatgccagtgaccttcttgccgagaatgaggagctaacagaaggcagtgctgcta 

ctcgtgcacttttggcaaattactcacaaacaccaaggcaaggaatgacacccatgagga 

cacctcaaagaactcctgctggtaaaggtgatgctattatgatggaagcagaaaacctgg 

ccagattaagagactctcagacacctttgctaggaggagaaaatcctgagttgcaccctt 

ctgacttcactggggtcactccgagaaagaaggagattcaaacgcctaatccaatgttga 

ccccttcaatgactcctggtggtgctggtcttactccaagaattggcttgacgccatcaa 

gggatgggtcttctttttctatgacacccaaagggactcccttcagggatgaacttcaca 

ttaacgaagacatggacatgcagcaaagtgcaaaacttgagaggcagagacgagaggaag 

ctagaaggagtttacgctctggtttgactgggcttcctcagccaaagaacgagtaccaaa 

tagttgcacaacctcctcctgaggaaagtgaagagccagaagagaaaattgaggaagaca 

tgtcagacaggatagcgagggaaaaggcggaggaagaagcaagacaacaggcattgctta 

agaagagatccaaggtcttgcagagagatcttcctagacccccagctgcttcattggcag 

taattaggaactcgttgctttcagctgatggagacaaaagttctgttgttcctcctactc 

cgattgaggttgcagataaaatggtaagagaggagcttctacagttgctggagcatgata 

atgcaaagtatccgcttgatgacaaagctgagaagaagaaaggagccaagaaccgtacca 

accgttctgcttctcaagttcttgcaattgacgattttgatgaaaatgagctccaagagg 

ctgacaaaatgataaaggaggaggggaagtttctgtgtgtgtcaatgggacatgagaaca 

agacacttgatgattttgtagaagctcacaacacatgcgtgaatgatctcatgtatttcc 

ccactcgaagcgcttacgagctctcaagtgttgctgggaacgcggacaaagttgcagctt 

ttcaggaggagatggagaatgtgagaaaaaagatggaggaggatgagaagaaggcagaac 

acatgaaggccaagtacaaaacttatacaaagggtcatgagaggagggcagagaccgtgt 

ggacccaaatagaggcgacattgaagcaggctgagattggtggaacagaagtagagtgct 

ttaaagcattgaagaggcaagaagagatggctgcatcttttaggaaaaagaatttgcaag 

aggaagtgataaagcaaaaggaaacagagagtaaactgcagactcgctatgggaatatgt 

tggcaatggttgaaaaagcagaggagataatggtcggtttccgagcacaggcattgaaga 

aacaagaggatgttgaagattctcacaaactgaaagaagctaagctagccactggagagg 

aagaggacatagccatagccatggaagcttctgcataaaaacttgagttttgtattgctt 

acaagttttaaggagacgtagcttgactttgtattggtaagtttttttaatatgagtcat 

gactttgtaaaaaggttatgatatattctctgtttgtatgctttgcaagagtcaagaaat 

ttgaatgcttcaggatcaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa 

>G217 Amino Acid Sequence (conserved domain in AA coordinates: 8-67) 

MRIMIKGGVWKNTEDEILKAAVMKYGKNQWARISSLLVRKSAKQCKARWYEWLDPSIKKT

EWTREEDEKLLHLAKLLPTQWRTIAPIVGRTPSQCLERYEKLLDAACTKDENYDAADDPR

KLRPGEIDPNPEAKPARPDPVDMDEDEKEMLSEARARLANTRGKKAKRKAREKQLEEARR

LASLQKRRELKAAGIDGRHRKRKRKGIDYNAEIPFEKRAPAGFYDTADEDRPADQVKFPT

TIEELEGKRRADVEAHLRKQDVARNKIAQRQDAPAAILQANKLNDPEVVRKRSKLMLPPP

QISDHELEEIAKMGYASDLLAENEELTEGSAATRALLANYSQTPRQGMTPMRTPQRTPAG

KGDAIMMEAENLARLRDSQTPLLGGENPELHPSDFTGVTPRKKEIQTPNPMLTPSMTPGG 

AGLTPRIGLTPSRDGSSFSMTPKGTPFRDELHINEDMDMQQSAKLERQRREEARRSLRSG 

LTGLPQPKNEYQIVAQPPPEESEEPEEKIEEDMSDRIAREKAEEEARQQALLKKRSKVLQ 

RDLPRPPAASLAVIRNSLLSADGDKSSVVPPTPIEVADKMVREELLQLLEHDNAKYPLDD 

KAEKKKGAKNRTNRSASQVLAIDDFDENELQEADKMIKEEGKFLCVSMGHENKTLDDFVE

AHNTCVNDLMYFPTRSAYELSSVAGNADKVAAFQEEMENVRKKMEEDEKKAEHMKAKYKT

YTKGHERRAETVWTQIEATLKQAEIGGTEVECFKALKRQEEMAASFRKKNLQEEVIKQKE
TESKLQTRYGNMLAMVEKAEEIMVGFRAQALKKQEDVEDSHKLKEAKLATGEEEDIAIAM

EASA* 

>G2192 (92.. 2971) 

CGGAAAGAGATCAACCAACGATAGAGGAGAAGAAGAACTTGCATACGCAAAAAAACTTTC 
CCGGGAAAATTCCAGAAACTGCTTTGGAAAAATGTGCGAGCCCGATGATAATTCCGCTAG
AAACGGCGTCACTACTCAACCTTCGAGGTCAAGGGAGCTTCTAATGGATGTTGACGACTT 
AGATCTTGACGGTTCATGGCCACTAGATCAAATCCCTTACTTATCCTCATCGAATCGCAT 
GATTTCTCCGATTTTTGTCTCCTCTTCCTCTGAGCAGCCTTGCTCGCCTCTCTGGGCTTT 
CTCCGACGGTGGAGGAAATGGTTTTCACCACGCAACCTCCGGTGGCGATGATGAGAAGAT 
CAGCTCTGTCTCCGGTGTTCCTTCTTTCCGTCTCGCCGAGTATCCTCTCTTCCTCCCTTA 
CTCTTCTCCATCAGCAGCTGAGAACACAACAGAGAAGCATAACAGTTTCCAGTTTCCGTC 
TCCATTGATGAGCCTAGTCCCACCAGAGAACACAGACAACTACTGTGTGATCAAAGAGAG 
GATGACTCAGGCGCTTCGATACTTCAAAGAATCAACCGAACAACACGTTTTGGCTCAGGT 
CTGGGCTCCTGTGAGAAAGAATGGTCGTGATTTGCTGACGACTTTGGGTCAACCTTTTGT 
TCTTAATCCTAATGGTAATGGGCTTAATCAATACAGGATGATCTCTCTCACATATATGTT 
TTCTGTGGATAGTGAAAGTGACGTAGAGCTCGGACTCCCGGGTCGAGTTTTCCGTCAGAA 
ATTGCCTGAATGGACTCCAAATGTTCAGTACTATTCCAGCAAAGAATTCTCGCGGCTTGA 
TCACGCCTTGCACTACAACGTGCGTGGTACACTGGCCTTGCCTGTCTTTAATCCCTCTGG 
TCAGTCCTGCATAGGTGTTGTGGAACTTATAATGACCTCAGAGAAGATTCACTATGCACC 
CGAAGTGGACAAAGTTTGCAAAGCCCTTGAGGCGGTAAATCTGAAAAGCTCGGAAATACT
TGATCACCAAACAACACAGATATGCAATGAGAGTCGCCAAAACGCGCTTGCTGAGATTCT 
CGAAGTGCTGACAGTTGTATGTGAGACCCATAACTTGCCTCTCGCTCAGACTTGGGTTCC 
ATGTCAGCATGGGAGCGTTCTTGCCAATGGTGGCGGTCTAAAGAAAAACTGCACCAGCTT 
TGACGGTAGCTGCATGGGTCAAATCTGCATGTCTACAACCGACATGGCCTGCTATGTCGT 
GGATGCTCATGTCTGGGGCTTTAGAGATGCCTGTCTTGAACACCATCTCCAGAAAGGCCA 
GGGAGTCGCTGGACGAGCTTTTCTCAATGGTGGCTCATGTTTCTGCAGAGACATCACCAA 
GTTCTGCAAAACGCAGTACCCACTAGTCCATTATGCGCTCATGTTCAAGTTGACCACTTG 
TTTTGCAATATCTCTCCAGAGCTCTTACACGGGCGACGACAGTTACATTCTTGAATTTTT 
TCTTCCTTCGAGTATAACAGACGACCAAGAGCAAGATTTGCTGTTGGGTTCTATTTTGGT 
GACAATGAAAGAACATTTTCAGAGTCTGAGGGTTGCATCTGGGGTTGACTTTGGTGAAGA 
TGACGACAAATTGTCTTTCGAGATCATCCAAGCATTACCGGACAAGAAGGTTCATTCAAA 
AATAGAATCCATTCGAGTTCCCTTTTCTGGTTTTAAGTCAAATGCAACAGAGACGATGTT 
GATTCCTCAGCCTGTGGTTCAGTCTTCTGATCCAGTAAATGAGAAAATCAACGTGGCCAC 
TGTTAACGGTGTGGTTAAGGAGAAGAAGAAAACAGAGAAAAAGCGTGGGAAGACTGAGAA 
AACAATCAGTCTAGATGTACTTCAGCAGTATTTCACTGGAAGTCTCAAAGACGCTGCAAA 
GAGCCTAGGAGTTTGCCCGACGACAATGAAGCGAATTTGCAGGCAACACGGAATCTCGCG 
GTGGCCATCGAGGAAGATCAAGAAAGTGAATCGTTCAATCACAAAGCTGAAACGAGTCAT 
CGAATCTGTTCAAGGTACTGATGGAGGCCTCGACCTGACTTCCATGGCCGTTAGTTCCAT 
CCCTTGGACACACGGTCAAACATCAGCACAGCCACTAAACTCACCCAATGGTTCCAAACC 
ACCTGAGCTACCAAACACCAATAATTCACCTAACCATTGGTCAAGTGATCACAGTCCGAA 
CGAGCCAAATGGTTCGCCTGAGTTACCACCAAGCAATGGTCACAAGCGATCACGAACGGT 
GGATGAGAGCGCTGGGACTCCAACCTCTCATGGCTCATGTGACGGTAACCAATTAGATGA 
ACCGAAAGTCCCAAATCAAGATCCGCTCTTCACGGTTGGTGGATCACCCGGGCTCCTTTT 
TCCACCTTATTCTAGAGATCATGATGTATCTGCAGCTTCCTTCGCAATGCCGAACAGGCT 
TCTTGGTTCTATAGACCATTTCCGAGGAATGCTCATTGAAGACGCTGGAAGTTCAAAAGA 
TCTGAGAAATCTCTGCCCCACTGCAGCATTTGACGATAAGTTTCAAGACACAAACTGGAT 
GAACAATGATAATAATAGCAACAACAACTTATACGCTCCCCCAAAGGAAGAGGCCATTGC 
AAATGTTGCATGCGAACCATCAGGCTCAGAAATGAGAACGGTAACAATCAAAGCAAGTTA 
CAAAGACGACATAATACGGTTCAGAATATCCTCGGGTTCAGGTATAATGGAATTGAAGGA 
TGAAGTGGCTAAGAGGCTGAAAGTTGATGCAGGAACGTTCGATATCAAGTATCTTGACGA 
TGATAACGAATGGGTTTTAATAGCTTGTGATGCTGATCTTCAAGAATGTCTCGAGATCCC 
TAGATCCTCCCGCACGAAAATCGTAAGGCTCTTAGTTCATGATGTAACGACAAATCTAGG
GAGCTCCTGCGAGAGCACTGGAGAATTGTGACCTGATAATTCATTCGAACTCTTTTGTAA 
ATAG 

>G2192 Amino Acid Sequence (conserved domain in AA coordinates: 500-700)
MCEPDDNSARNGVTTQPSRSRELLMDVDDLDLDGSWPLDQIPYLSSSNRMISPIFVSSSS
EQPCSPLWAFSDGGGNGFHHATSGGDDEKISSVSGVPSFRLAEYPLFLPYSSPSAAENTT
EKHNSFQFPSPLMSLVPPENTDNYCVIKERMTQALRYFKESTEQHVLAQVWAPVRKNGRD

LLTTLGQPFVLNPNGNGLNQYRMISLTYMFSVDSESDVELGLPGRVFRQKLPEWTPNVQY
YSSKEFSRLDHALHYNVRGTLALPVFNPSGQSCIGVVELIMTSEKIHYAPEVDKVCKALE
AVNLKSSEILDHQTTQICNESRQNALAEILEVLTVVCETHNLPLAQTWVPCQHGSVLANG
GGLKKNCTSFDGSCMGQICMSTTDMACYVVDAHVWGFRDACLEHHLQKGQGVAGRAFLNG
GSCFCRDITKFCKTQYPLVHYALMFKLTTCFAISLQSSYTGDDSYILEFFLPSSITDDQE 
QDLLLGSILVTMKEHFQSLRVASGVDFGEDDDKLSFEIIQALPDKKVHSKIESIRVPFSG 
FKSNATETMLIPQPVVQSSDPVNEKINVATVNGVVKEKKKTEKKRGKTEKTISLDVLQQY
FTGSLKDAAKSLGVCPTTMKRICRQHGISRWPSRKIKKVNRSITKLKRVIESVQGTDGGL
DLTSMAVSSIPWTHGQTSAQPLNSPNGSKPPELPNTNNSPNHWSSDHSPNEPNGSPELPP 
SNGHKRSRTVDESAGTPTSHGSCDGNQLDEPKVPNQDPLFTVGGSPGLLFPPYSRDHDVS 
AASFAMPNRLLGSIDHFRGMLIEDAGSSKDLRNLCPTAAFDDKFQDTNWMNNDNNSNNNL 
YAPPKEEAIANVACEPSGSEMRTVTIKASYKDDIIRFRISSGSGIMELKDEVAKRLKVDA
GTFDIKYLDDDNEWVLIACDADLQECLEIPRSSRTKIVRLLVHDVTTNLGSSCESTGEL* 
>G504 (69..1040)

CGTCGACCTCTTGACGATCATGAGACTGATTTCGTGAAAATATCGTCATTATATCAAATT 

AGAAGTTGATGGAAAACATGGGGGATTCGAGCATAGGGCCGGGCCATCCGCATCTCCCTC 

CCGGGTTTCGGTTTCACCCGACTGATGAGGAACTAGTAGTTCATTACCTCAAGAAGAAAG 

CAGATTCTGTTCCACTTCCAGTCTCAATCATCGCAGAGATTGATCTTTACAAGTTTGATC 

CTTGGGAGCTTCCAAGCAAGGCGAGTTTTGGAGAGCACGAGTGGTACTTCTTTAGTCCTC 

GGGATCGGAAGTATCCAAATGGGGTTAGGCCAAACCGGGCAGCAACTTCCGGTTATTGGA 

AAGCAACGGGAACCGATAAACCGATATTTACGTGCAATAGTCACAAGGTTGGTGTCAAGA 

AAGCGCTTGTTTTTTACGGTGGAAAGCCTCCTAAAGGGATAAAAACAGATTGGATCATGC 

ATGAATATCGCCTCACTGATGGTAACCTTAGCACTGCGGCTAAGCCGCCTGACTTAACCA 

CGACAAGGAAAAACTCACTACGGCTAGACGATTGGGTTCTATGTAGGATCTATAAGAAGA 

ATAGTTCACAAAGACCAACAATGGAGAGAGTATTACTTAGAGAGGATCTAATGGAAGGCA 

TGCTCTCAAAATCATCTGCTAATTCTTCTTCTACATCAGTACTAGACAACAACGACAACA 

ATAATAACAATAACGAAGAACACTTTTTCGACGGTATGGTCGTTTCTTCAGACAAACGTT 

CCTTGTGTGGTCAATACCGAATGGGCCACGAGGCCTCAGGATCATCTTCATTCGGATCTT 

TCTTATCGAGCAAGAGGTTTCATCATACAGGTGATCTCAACAATGATAACTACAATGTCT 

CTTTTGTTTCGATGCTTAGTGAGATTCCTCAGAGTTCGGGGTTTCATGCAAATGGTGTTA 

TGGATACGACGTCGTCTCTAGCTGATCATGGGGTTTTAAGACAGGCGTTTCAGCTTCCTA 

ACATGAACTGGCACTCATAATCTATATAGATATATATGTGTGTATCATATATGTATCTAT 

GCAGGCCTAATATAGTTTACACATAAATCATCTGGGGCGGCCGCT 

>G504 Amino Acid Sequence (domain in AA coordinates: TBD) 

MENMGDSSIGPGHPHLPPGFRFHPTDEELVVHYLKKKADSVPLPVSIIAEIDLYKFDPWE

LPSKASFGEHEWYFFSPRDRKYPNGVRPNRAATSGYWKATGTDKPIFTCNSHKVGVKKAL

VFYGGKPPKGIKTDWIMHEYRLTDGNLSTAAKPPDLTTTRKNSLRLDDWVLCRIYKKNSS

QRPTMERVLLREDLMEGMLSKSSANSSSTSVLDNNDNNNNNNEEHFFDGMVVSSDKRSLC

GQYRMGHEASGSSSFGSFLSSKRFHHTGDLNNDNYNVSFVSMLSEIPQSSGFHANGVMDT
TSSLADHGVLRQAFQLPNMNWHS*
>G622 (248.. 2620) 

TCTTTCTTTCTTCAATTCGCCGTCAAAATCTTCTCTTTCTCTTCCCCCGCCGGTCCTTCA 
CCAATCCTCTGATCTCTCTACACACGAACCTTTGATTTTGACCAACGTCGATGCATGTTC 
AXGACTAGTCTCTTCCTCAATCCTTCAATTTCATCAATTCACGTCGATTTCGTATCCGAT 
TCGTTGTTCTAGCTCTTTGTGTGGTGTTAGGGTTTTAAGATTTTGGAATTGGGGTTTGGA 
GTTTGTGATGTTTGAAGTCAAAATGGGGTCAAAGATGTGCATGAACGCTTCATGTGGTAC 
GACTTCTACTGTTGAATGGAAGAAAGGTTGGCCTCTTCGATCTGGTCTTCTCGCTGATCT 
CTGTTATCGTTGCGGATCTGCGTATGAGAGTTCTCTATTCTGTGAACAATTTCATAAGGA 
CCAATCTGGTTGGAGGGAATGCTATTTGTGTAGCAAGAGACTACATTGTGGATGCATTGC 
TTCTAAGGTAACGATTGAGTTAATGGACTATGGTGGTGTTGGTTGTAGTACATGTGCTTG 
CTGCCATCAACTCAATTTGAACACAAGGGGTGAGAATCCAGGTGTTTTTAGCAGATTGCC 
AATGAAAACGTTAGCTGATAGGCAACATGTAAATGGCGAAAGCGGAGGAAGAAACGAAGG 
CGATCTCTTTTCTCAGCCACTAGTCATGGGCGGAGATAAAAGGGAAGAGTTCATGCCTCA 
CCGTGGGTTTGGTAAGCTAATGAGTCCAGAAAGTACAACCACCGGGCATAGGCTGGATGC 
TGCTGGGGAAATGCATGAATCATCACCTTTACAGCCATCTTTAAATATGGGTTTGGCTGT 
GAATCCGTTTAGCCCATCTTTTGCAACCGAGGCTGTCGAGGGAATGAAACACATCAGTCC
TTCTCAGTCCAACATGGTCCATTGCTCTGCTTCTAATATACTGCAAAAGCCATCAAGACC 
TGCTATTTCAACTCCTCCTGTGGCTAGTAAATCCGCTCAGGCGCGGATTGGAAGGCCTCC 
TGTCGAAGGGCGAGGGAGAGGCCACTTGCTTCCGCGGTATTGGCCAAAATATACGGATAA 
AGAGGTTCAGCAGATCTCTGGAAATTTGAATTTGAACATTGTACCTCTCTTTGAGAAAAC 






TCTTAGTGCCAGTGATGCTGGTCGCATTGGTCGTCTAGTTCTTCCAAAAGCCTGTGCAGA 

GGCATATTTTCCTCCGATTAGTCAATCCGAAGGCATTCCTTTGAAAATCCAAGATGTGAG 

GGGTAGGGAGTGGACGTTCCAGTTCAGATATTGGCCCAATAACAATAGTAGAATGTATGT 

TTTAGAAGGTGTCACTCCATGCATACAGTCCATGATGCTACAGGCTGGTGATACAGTAAC 

TTTCAGTCGGGTTGATCCTGGCGGAAAACTAATCATGGGTTCCAGGAAGGCAGCTAATGC 

TGGAGACATGCAGGGTTGTGGGCTCACCAACGGAACATCAACTGAGGACACATCATCGTC 

TGGTGTAACAGAAAACCCACCCTCCATAAATGGTTCCTCGTGTATTTCACTAATACCGAA 

AGAGTTGAATGGTATGCCTGAGAATTTGAACAGTGAGACTAACGGGGGCAGGATAGGTGA 

TGATCCTACACGAGTTAAAGAGAAGAAGAGAACTCGAACCATTGGTGCAAAAAATAAGAG 

ACTTCTTTTGCATAGTGAAGAATCTATGGAGCTGAGACTCACTTGGGAAGAAGCTCAGGA 

CTTGCTTCGTCCCTCTCCTAGTGTAAAGCCTACCATCGTTGTCATTGAGGAGCAAGAAAT 

TGAAGAATATGACGAACCTCCTGTCTTTGGAAAGAGGACTATAGTCACTACAAAACCTTC 

AGGTGAACAGGAACGATGGGCAACTTGCGACGACTGCTCTAAATGGAGAAGGTTACCTGT 

AGATGCTCTTCTTTCCTTTAAATGGACATGTATAGACAATGTTTGGGATGTGAGTAGGTG 

TTCATGTTCTGCACCGGAGGAGAGTCTGAAGGAACTTGAGAATGTTCTTAAAGTAGGTAG 

AGAGCACAAGAAGAGAAGAACTGGGGAAAGACAGGCAGCACAAAGTCAGCAAGAACCGTG 

TGGTTTGGACGCACTGGCGAGTGCAGCAGTCTTAGGAGACACAATAGGCGAGCCAGAGGT 

AGCGACCACGACCAGACATCCAAGGCACAGGGCTGGATGCTCTTGCATCGTGTGCATTCA 

GCCACCAAGTGGGAAAGGTAGGCACAAGCCTACATGTGGCTGCACTGTGTGTAGCACCGT 

GAAGAGAAGGTTCAAGACGCTTATGATGAGGAGGAAGAAGAAGCAGTTGGAGCGCGATGT 

AACAGCAGCAGAAGATAAGAAGAAGAAGGACATGGAACTGGCTGAGTCTGATAAGAGTAA 

GGAGGAGAAGGAAGTGAACACAGCGAGAATAGACCTGAACAGTGATCCATACAATAAAGA 

AGATGTTGAAGCTGTTGCGGTGGAGAAAGAAGAGAGTCGAAAAAGAGCAATAGGACAGTG 

TTCGGGCGTGGTGGCTCAAGACGCCAGTGATGTTTTAGGAGTTACAGAGTTAGAAGGAGA 

GGGTAAGAATGTTCGTGAAGAGCCGAGAGTTTCAAGCTGATATGGAAA 

>G622 Amino Acid Sequence (domain in AA coordinates: TBD) 

MFEVKMGSKMCMNASCGTTSTVEWKKGWPLRSGLLADLCYRCGSAYESSLFCEQFHKDQS 

GWRECYLCSKRLHCGCIASKVTIELMDYGGVGCSTCACCHQLNLNTRGENPGVFSRLPMK 

TLADRQHVNGESGGRNEGDLFSQPLVMGGDKREEFMPHRGFGKLMSPESTTTGHRLDAAG

EMHESSPLQPSLNMGLAVNPFSPSFATEAVEGMKHISPSQSNMVHCSASNILQKPSRPAI

STPPVASKSAQARIGRPPVEGRGRGHLLPRYWPKYTDKEVQQISGNLNLNIVPLFEKTLS 

ASDAGRIGRLVLPKACAEAYFPPISQSEGIPLKIQDVRGREWTFQFRYWPNNNSRMYVLE

GVTPCIQSMMLQAGDTVTFSRVDPGGKLIMGSRKAANAGDMQGCGLTNGTSTEDTSSSGV 

TENPPSINGSSCISLIPKELNGMPENLNSETNGGRIGDDPTRVKEKKRTRTIGAKNKRLL

LHSEESMELRLTWEEAQDLLRPSPSVKPTIVVIEEQEIEEYDEPPVFGKRTIVTTKPSGE
QERWATCDDCSKWRRLPVDALLSFKWTCIDNVWDVSRCSCSAPEESLKELENVLKVGREH

KKRRTGERQAAQSQQEPCGLDALASAAVLGDTIGEPEVATTTRHPRHRAGCSCIVCIQPP 

SGKGRHKPTCGCTVCSTVKRRFKTLMMRRKKKQLERDVTAAEDKKKKDMELAESDKSKEE

KEVNTARIDLNSDPYNKEDVEAVAVEKEESRKRAIGQCSGVVAQDASDVLGVTELEGEGK

NVREEPRVSS* 

>G778 (50.. 1249) 

TCTCAATAACACAAAACCTTTTAAACTAGTAAAATACACAGATTTTAGGATGAGCCAATG 
TGTTCCAAACTGTCACATCGATGATACTCCGGCAGCAGCCACCACCACCGTCCGCTCCAC 
CACAGCCGCAGACATCCCCATATTAGACTACGAGGTAGCCGAGCTGACGTGGGAGAACGG 
GCAACTAGGCTTGCACGGCTTAGGTCCACCGCGAGTGACGGCTTCGTCGACCAAGTACTC 
CACAGGCGCCGGTGGAACGTTGGAGTCGATAGTGGACCAAGCTACTCGCCTCCCTAACCC 
TAAGCCCACGGATGAGCTCGTCCCGTGGTTCCATCATCGCTCCTCCAGGGCCGCGATGGC 
AATGGACGCGCTTGTCCCTTGCTCCAACCTAGTACACGAGCAGCAGAGCAAGCCTGGTGG
CGTTGGCTCCACCCGGGTGGGGTCATGTAGCGATGGTCGTACCATGGGCGGTGGAAAACG 
AGCAAGAGTGGCACCGGAGTGGAGCGGCGGCGGGAGTCAGCGGCTGACCATGGACACTTA 
CGACGTAGGTTTCACCTCAACATCAATGGGCTCGCACGATAACACAATCGACGATCATGA 
CTCCGTCTGCCACAGCCGCCCACAGATGGAGGACGAAGAAGAGAAGAAAGCCGGAGGAAA 
ATCATCAGTTTCAACCAAGAGAAGCAGAGCTGCTGCTATTCATAACCAATCCGAACGTAA 
GAGGAGAGATAAAATCAATCAAAGGATGAAGACTTTGCA/UU^CTGGTTCCCAATTCCAG 
CAAGACGGATAAAGCATCTATGTTGGATGAAGTGATAGAGTATTTGAAGCAACTTCAAGC
ACAAGTGAGCATGATGAGCAGAATGAATATGCCTTCTATGATGCTTCCTATGGCCATGCA 
GCAACAACAACAACTACAAATGTCTCTCATGTCCAATCCCATGGGTTTAGGGATGGGCAT 
GGGGATGCCCGGTCTCGGTCTCCTCGACCTTAATTCTATGAACCGAGCTGCTGCAAGCGC
TCCTAATATCCATGCCAACATGATGCCAAACCCATTTTTGCCCATGAATTGTCCATCGTG 
GGATGCTTCTTCCAATGACTCTCGATTTCAGTCTCCTCTCATCCCCGATCCTATGTCTGC 
CTTTCTTGCATGCTCTACTCAGCCAACGACGATGGAAGCGTATAGCAGGATGGCTACATT 
ATATCAGCAAATGCAACAACAACTTCCTCCTCCTTCGAATCCAAAATGATTATTACTCAA 
ACACCTCTATATAGTTTACGTCTATATATGTGTTAGTCACATACATACATATATATATTC 
CATCATAATTATTTATTTATATGTATAGGCTTCTCATGAATTATGATATTATACGTATTA 
CGTAAAAAA 

>G778 Amino Acid Sequence (domain in AA coordinates: 220-267) 

MSQCVPNCHIDDTPAAATTTVRSTTAADIPILDYEVAELTWENGQLGLHGLGPPRVTASS 

TKYSTGAGGTLESIVDQATRLPNPKPTDELVPWFHHRSSRAAMAMDALVPCSNLVHEQQS 

KPGGVGSTRVGSCSDGRTMGGGKRARVAPEWSGGGSQRLTMDTYDVGFTSTSMGSHDNTI 

DDHDSVCHSRPQMEDEEEKKAGGKSSVSTKRSRAAAIHNQSERKRRDKINQRMKTLQKLV 

PNSSKTDKASMLDEVIEYLKQLQAQVSMMSRMNMPSMMLPMAMQQQQQLQMSLMSNPMGL

GMGMGMPGLGLLDLNSMNRAAASAPNIHANMMPNPFLPMNCPSWDASSNDSRFQSPLIPD

PMSAFLACSTQPTTMEAYSRMATLYQQMQQQLPPPSNPK* 

>G791 (173..877)

TTTTCTTTGGGTGTTCCTTCCACCAACGGCAGAAATCGATTCGGCTTAAATCTCCCCCTC 
CTTTCGATCTCTCTGATCGCCGCCGGGAACATTCAATTTCCCGGGAGTTCAACAAAAAAA 
AAACTCTCCGTTTTTATTTTTCCCCCTTTTTCACCGGTGGAAGTTTCCGGAGATGGTGTC 
ACCCGAAAACGCTAATTGGATTTGTGACTTGATCGATGCTGATTACGGAAGTTTCACAAT 
CCAAGGTCCTGGTTTCTCTTGGCCTGTTCAGCAACCTATTGGTGTTTCTTCTAACTCCAG
TGCTGGAGTTGATGGCTCGGCTGGAAACTCAGAAGCTAGCAAAGAACCTGGATCCAAAAA 
GAGGGGGAGATGTGAATCATCCTCTGCCACTAGCTCGAAAGCATGTAGAGAGAAGCAGCG 
ACGGGACAGGTTGAATGACAAGTTTATGGAATTGGGTGCAATTTTGGAGCCTGGAAATCC 
TCCCAAAACAGACAAGGCTGCTATCTTGGTTGATGCTGTCCGCATGGTGACACAGCTACG 
GGGCGAGGCCCAGAAGCTGAAGGACTCCAATTCAAGTCTTCAGGACAAAATCAAAGAGTT 
AAAGACTGAGAAAAACGAGCTGCGAGATGAGAAACAGAGGCTGAAGACAGAGAAAGAAAA 
GCTGGAGCAGCAGCTGAAAGCCATGAATGCTCCTCAACCAAGTTTTTTCCCAGCCCCACC 
TATGATGCCTACTGCTTTTGCTTCAGCGCAAGGCCAAGCTCCTGGAAACAAGATGGTGCC 
AATCATCAGTTACCCAGGAGTTGCCATGTGGCAGTTCATGCCTCCTGCTTCAGTCGATAC 
TTCTCAGGATCATGTCCTTCGTCCTCCTGTTGCTTAATCAAGAAAAATCATCAACCGGTT 
TGCTTCTTGCTTCCGCTTAAAAGAAAAGTCTCCATTTGTTTTGCTCTCCTCTCTTTCTCG 
GCTTTCTTAGTCTTATCCTTTTGCTTTGTCGTGTTATCATCGTAACTGTTATCTGTTGAA 
CAATGATATGACATTGTAAACTCCAATTGCTTCGCGCAATGTTATCTATTCACATGTAAA 
TTTAAGTAGAGTTTGGCAAAAAAAAAA 

>G791 Amino Acid Sequence (domain in AA coordinates: 75-143) 
MVSPENANWICDLIDADYGSFTIQGPGFSWPVQQPIGVSSNSSAGVDGSAGNSEASKEPG 
SKKRGRCESSSATSSKACREKQRRDRLNDKFMELGAILEPGNPPKTDKAAILVDAVRMVT
QLRGEAQKLKDSNSSLQDKIKELKTEKNELRDEKQRLKTEKEKLEQQLKAMNAPQPSFFP
APPMMPTAFASAQGQAPGNKMVPIISYPGVAMWQFMPPASVDTSQDHVLRPPVA*
>G861 (158..880)

CTTCTTCCTCCTCCTCCATCTCTTCTCTTTACTCTCTCTTTAATCATCTCTCATTCTTGA 

ATCTTGATCCATCAAAATCAATCCCGTTCTCGAAAGATCCATTAAAATCAAAACCTAAGC

TCTCTCTCTTGCTTCTAGGGTTTTTTTGTTCGTTGTGATGGCGAGAGAAAAGATTCAGAT 

CAGGAAGATCGACAACGCAACGGCGAGACAAGTGACGTTTTCGAAACGAAGAAGAGGGCT 

TTTCAAGAAAGCTGAAGAACTCTCCGTTCTCTGCGACGCCGATGTCGCTCTCATCATCTT 

CTCTTCCACCGGAAAACTGTTCGAGTTCTGTAGCTCCAGCATGAAGGAAGTCCTAGAGAG 

GCATAACTTGCAGTCAAAGAACTTGGAGAAGCTTGATCAGCCATCTCTTGAGTTACAGCT 

GGTTGAGAACAGTGATCACGCCCGAATGAGTAAAGAAATTGCGGACAAGAGCCACCGACT 

AAGGCAAATGAGAGGAGAGGAACTTCAAGGACTTGACATTGAAGAGCTTCAGCAGCTAGA 

GAAGGCCCTTGAAACTGGTTTGACGCGTGTGATTGAAACAAAGAGTGACAAGATTATGAG 

TGAGATCAGCGAACTTCAGAAAAAGGGAATGCAATTGATGGATGAGAACAAGCGGTTGAG 

GCAGCAAGGAACGCAACTAACGGAAGAGAACGAGCGACTTGGCATGCAAATATGTAACAA 

TGTGCATGCACACGGTGGTGCTGAATCGGAGAACGCTGCTGTGTACGAGGAAGGACAGTC 

GTCGGAGTCTATTACTAACGCCGGAAACTCTACCGGAGCGCCTGTTGACTCCGAGAGCTC 

CGACACTTCCCTTAGGCTCGGCTTACCGTATGGTGGTTAGAGATGGAACAATTCAAAGAA 
GTTGATGGAGTGAGGAGAGTAATGTAAATCTTTTTAACTCGGTAGTAACAAGAGACAATG

TCTAAGTAGTGAATTCTCAAATGTTTGTGTAAGTTTCTGCCTATGGAAGAGGCTTTCATT 

TTTATGATTTTCACTATGTATGATCTCTCTTCACTGCATTTCTGGTTAGTAACGGCTTGT 

CACCGATAAACTTTCTCGTTATGGAAAGTTAGAATAAAAAAAAAAAAAAAAAAA 

>G861 Amino Acid Sequence (domain in AA coordinates: 2-57)

MAREKIQIRKIDNATARQVTFSKRRRGLFKKAEELSVLCDADVALIIFSSTGKLFEFCSS

SMKEVLERHNLQSKNLEKLDQPSLELQLVENSDHARMSKEIADKSHRLRQMRGEELQGLD 

IEELQQLEKALETGLTRVIETKSDKIMSEISELQKKGMQLMDENKRLRQQGTQLTEENER 

LGMQICNNVHAHGGAESENAAVYEEGQSSESITNAGNSTGAPVDSESSDTSLRLGLPYGG 

* 

>G938 (1..1755) 

ATGATGATGTTTAACGAGATGGGAATGTATGGAAACATGGATTTCTTCTCTTCCTCCACA 

TCTCTCGATGTGTGTCCATTACCACAAGCTGAACAAGAACCTGTAGTTGAAGATGTCGAC 

TACACCGATGATGAGATGGATGTGGATGAGCTTGAGAAGAGGATGTGGAGAGACAAAATG 

CGTTTGAAACGTCTCAAGGAGCAACAGAGTAAGTGTAAAGAAGGCGTCGATGGTTCGAAA 

CAGAGGCAGTCGCAAGAGCAAGCTAGGAGGAAGAAAATGTCTAGAGCCCAAGATGGGATC

TTGAAGTATATGTTGAAGATGATGGAAGTTTGTAAAGCTCAAGGCTTTGTTTATGGTATT 

ATTCCTGAGAAGGGTAAGCCTGTGACTGGTGCTTCGGATAATTTGAGGGAATGGTGGAAA 

GATAAGGTTAGGTTTGATCGTAATGGTCCAGCTGCTATTGCTAAGTATCAGTCAGAGAAT 

AATATTTCTGGAGGGAGTAATGATTGTAACAGCTTGGTTGGTCCAACACCGCATACGCTT 

CAGGAGCTTCAGGACACGACTCTTGGTTCGCTTTTATCGGCTTTGATGCAACATTGTGAT 

CCACCGCAGAGACGGTTTCCTTTGGAGAAAGGAGTTTCTCCACCTTGGTGGCCTAATGGG 

AATGAAGAGTGGTGGCCTCAGCTTGGTTTACCAAATGAGCAAGGTCCTCCTCCTTATAAG 

AAGCCTCATGATTTGAAGAAAGCTTGGAAAGTCGGTGTTTTAACTGCGGTGATCAAGCAT 

ATGTCGCCGGATATTGCGAAGATCCGTAAGCTTGTGAGGCAATCAAAATGCTTGCAGGAT 

AAGATGACGGCGAAAGAGAGTGCTACTTGGCTTGCCATTATTAACCAAGAAGAGGTTGTG 

GCTCGGGAGCTTTATCCCGAGTCATGCCCTCCTCTTTCTTCTTCTTCATCATTAGGAAGC 

GGGTCGCTTCTCATTAATGATTGTAGCGAGTATGACGTTGAAGGTTTCGAGAAGGAACAA 

CATGGTTTCGATGTGGAAGAGCGGAAACCAGAGATAGTGATGATGCATCCTCTAGCAAGC 

TTTGGGGTTGCTAAAATGCAACATTTTCCCATAAAGGAGGAGGTCGCCACCACGGTAAAC 

TTAGAGTTCACGAGAAAGAGGAAGCAGAACAATGATATGAATGTTATGGTAATGGACAGA 

TCAGCAGGTTACACTTGTGAGAATGGTCAGTGTCCTCACAGCAAAATGAATCTTGGATTT 

CAAGACAGGAGTTCAAGGGACAACCACCAGATGGTTTGTCCATATAGAGACAATCGTTTA 

GCGTATGGAGCATCCAAGTTTCATATGGGTGGAATGAAACTAGTAGTTCCTCAGCAACCA 

GTCCAACCGATCGACCTATCGGGCGTTGGAGTTCCGGAAAACGGGCAGAAGATGATCACC 

GAGCTTATGGCCATGTACGACAGAAATGTCCAAAGCAACCAAACGCCTCCTACTTTGATG 

GAAAACCAAAGCATGGTCATTGATGCAAAAGCAGCTCAGAATCAGCAGCTGAATTTCAAC 

AGTGGCAATCAAATGTTTATGCAACAAGGGACGAACAACGGGGTTAACAATCGGTTCCAG 

ATGGTGTTTGATTCGACACCATTCGATATGGCAGCATTCGATTACAGAGATGATTGGCAA 

ACCGGAGCAATGGAAGGAATGGGGAAGCAGCAGCAGCAGCAGCAGCAGCAGCAAGATGTA 

TCAATATGGTTCTGA 

>G938 Amino Acid Sequence (domain in AA coordinates: 96-104) 
MMMFNEMGMYGNMDFFSSSTSLDVCPLPQAEQEPVVEDVDYTDDEMDVDELEKRMWRDKM

RLKRLKEQQSKCKEGVDGSKQRQSQEQARRKKMSRAQDGILKYMLKMMEVCKAQGFVYGI

IPEKGKPVTGASDNLREWWKDKVRFDRNGPAAIAKYQSENWISGGSNDCNSLVGPTPHTL 

QELQDTTLGSLLSALMQHCDPPQRRFPLEKGVSPPWWPNGNEEWWPQLGLPNEQGPPPYK 

KPHDLKKAWKVGVLTAVIKHMSPDIAKIRKLVRQSKCLQDKMTAKESATWLAIINQEEVV

ARELYPESCPPLSSSSSLGSGSLLINDCSEYDVEGFEKEQHGFDVEERKPEIVMMHPLAS 

FGVAKMQHFPIKEEVATTVNLEFTRKRKQNNDMNVMVMDRSAGYTCENGQCPHSKMNLGF

QDRSSRDNHQMVCPYRDNRLAYGASKFHMGGMKLVVPQQPVQPIDLSGVGVPENGQKMIT
ELMAMYDRNVQSNQTPPTLMENQSMVIDAKAAQNQQLNFNSGNQMFMQQGTNNGVNNRFQ
MVFDSTPFDMAAFDYRDDWQTGAMEGMGKQQQQQQQQQDVSIWF* 
>G965 (73..1956)

GATTCTCTGTGTATGTCTGAATCCTTACAGGATCCAAGAGCTTTGGAAAAAAGATATAAT 
GAATAACAAGATATGGGTTTAGCTACTACAACTTCTTCTATGTCACAAGATTATCATCAT 
CACCAAGGAATCTTTTCCTTCTCTAATGGATTCCACCGATCATCATCAACCACTCATCAG 
GAGGAAGTAGATGAATCCGCCGTCGTCTCCGGTGCTCAAATTCCGGTTTATGAAACCGCC 
GGAATGTTGTCTGAAATGTTTGCTTACCCTGGCGGAGGTGGCGGCGGTTCCGGTGGAGAG
ATTCTTGATCAGTCTACTAAACAGTTGCTAGAGCAACAAAACCGTCACAACAACAACAAT 
AACTCAACTCTTCATATGTTATTACCAAATCATCATCAAGGTTTTGCTTTCACCGACGAA 
AACACTATGCAGCCGCAGCAACAACAACACTTTACATGGCCATCTTCCTCCTCCGATCAT 
CATCAAAACCGAGATATGATCGGAACCGTCCACGTGGAAGGAGGAAAGGGTTTGTCTTTA 
TCTCTCTCATCTTCATTAGCCGCAGCTAAAGCCGAGGAATATAGAAGCATTTATTGTGCA 
GCCGTTGATGGAACTTCTTCTTCTTCTAACGCATCCGCTCATCATCATCAATTCAATCAG 
TTCAAGAATCTTCTTCTTGAGAATTCTTCTTCTCAACATCATCACCATCAAGTTGTTGGA 
CATTTTGGTTCATCATCATCATCTCCCATGGCGGCTTCTTCATCCATTGGAGGGATCTAC 
ACGTTGAGGAATTCGAAATATACGAAACCGGCTCAAGAGTTGTTGGAAGAGTTTTGTAGT 
GTTGGAAGAGGACATTTCAAGAAGAACAAACTTAGTAGGAACAACTCAAACCCTAATACT 
ACCGGTGGAGGAGGAGGCGGAGGGTCCTCGTCATCGGCCGGAACAGCTAATGATAGTCCT 
CCTTTGTCTCCGGCTGATCGGATTGAACATCAAAGAAGAAAAGTCAAGCTACTATCTATG 
CTTGAAGAGGTGGACCGACGGTACAACCACTACTGCGAACAAATGCAAATGGTAGTGAAC 
TCATTCGACCAAGTAATGGGTTACGGCGCGGCGGTTCCGTACACGACATTAGCTCAAAAG 
GCAATGTCTAGGCATTTCCGGTGTTTGAAAGACGCGGTAGCGGTTCAGCTTAAACGCAGC 
TGTGAGCTTCTAGGGGATAAAGAGGCGGCAGGGGCTGCATCCTCGGGGTTAACCAAAGGG 
GAAACGCCGCGATTGCGTTTGCTAGAGCAGAGTTTGCGTCAGCAACGAGCGTTTCATCAT 
ATGGGTATGATGGAGCAAGAGGCATGGAGACCGCAACGTGGTTTGCCTGAACGCTCCGTT 
AATATCCTTAGAGCTTGGCTATTCGAGCATTTTCTTAATCCGTACCCAAGCGATGCTGAT 
AAGCACCTCTTAGCACGACAGACTGGTTTATCCAGAAATCAGGTGTCAAATTGGTTCATA 
AATGCTAGGGTTCGCCTATGGAAACCAATGGTGGAAGAGATGTATCAACAAGAAGCAAAA 
GAAAGAGAAGAAGCAGAAGAAGAAAATGAAAATCAACAACAACAAAGAAGACAGCAACAA 
ACAAACAACAACGACACGAAACCCAACAACAATGAAAACAACTTCACTGTCATAACCGCA 
CAAACTCCAACGACGATGACATCGACACATCACGAAAACGACTCTTCATTCCTCTCTTCC 
GTCGCCGCCGCTTCTCACGGCGGTTCAGACGCGTTCACCGTCGCCACGTGTCAGCAAGAC 
GTCAGTGACTTCCACGTCGACGGAGATGGTGTGAACGTCATAAGATTCGGGACCAAACAG 
ACTGGTGACGTGTCTCTTACGCTTGGTCTACGCCACTCTGGCAATATTCCTGATAAGAAC 
ACTTCTTTCTCGGTTAGAGACTTTGGAGATTTTTAGTCTTCTTTGTTTCTCAATTTATTC 
ATC 

>G965 Amino Acid Sequence (domain in AA coordinates: 423-486)
MGLATTTSSMSQDYHHHQGIFSFSNGFHRSSSTTHQEEVDESAVVSGAQIPVYETAGMLS 
EMFAYPGGGGGGSGGEILDQSTKQLLEQQNRHNNNNNSTLHMLLPNHHQGFAFTDENTMQ
PQQQQHFTWPSSSSDHHQNRDMIGTVHVEGGKGLSLSLSSSLAAAKAEEYRSIYCAAVDG
TSSSSNASAHHHQFNQFKNLLLENSSSQHHHHQVVGHFGSSSSSPMAASSSIGGIYTLRN
SKYTKPAQELLEEFCSVGRGHFKKNKLSRNNSNPNTTGGGGGGGSSSSAGTANDSPPLSP
ADRIEHQRRKVKLLSMLEEVDRRYNHYCEQMQMVVNSFDQVMGYGAAVPYTTLAQKAMSR
HFRCLKDAVAVQLKRSCELLGDKEAAGAASSGLTKGETPRLRLLEQSLRQQRAFHHMGMM
EQEAWRPQRGLPERSVNILRAWLFEHFLNPYPSDADKHLLARQTGLSRNQVSNWFINARV
RLWKPMVEEMYQQEAKEREEAEEENENQQQQRRQQQTNNNDTKPNNNENNFTVITAQTPT
TMTSTHHENDSSFLSSVAAASHGGSDAFTVATCQQDVSDFHVDGDGVNVIRFGTKQTGDV
SLTLGLRHSGNIPDKNTSFSVRDFGDF*
>G1143 (54..677)

AAATAAGAATATAAACACTTTTGTCTGAAAAATTATCAAAGAAGAAGAAATAAATGGGTG 

GAGGAAG(^GATTTCAAGAACCAGTGAGGATGAGCCGTAGGAAACAAGTAACAAAAGAGA 

AGGAAGAAGATGAAAACTTCAAATCTCCAAATCTTGAAGCAGAGAGACGTAGAAGAGAGA 

AGCTTCATTGTCGGCTTATGGCTCTGCGATCTCATGTCCCCATTGTCACCAACATGACTA 

AAGCAAGTATTGTTGAAGATGCGATTACTTACATAGGAGAGCTTCAAAACAATGTTAAGA 

ATCTCTTAGAGAGATTTCATGAAATGGAAGAAGCTCCTCCTGAGATTGATGAAGAACAAA 

CGGATCCAATGATAAAACCTGAAGTTGAAACTAGTGATCTTAACGAAGAGATGAAGAAAC 

TCGGAATCGAGGAGAATGTGCAATTGTGTAAGATTGGGGAGAGGAAGTTTTGGTTAAAGA 

TCATAACAGAGAAGAGAGATGGGATCTTTACTAAATTCATGGAGGTTATGAGATTTCTCG 

GATTCGAGATTATCGATATTAGTCTAACAACTTCAAATGGAGCAATTCTTATTAGTGCCT

CTGTTCAGACACAGGAACTCTGTGATGTTGAACAGACAAAAGATT^ 

TGAGAAGCAATCCATAAGTATTAATTATATACATCTTGGAAATTTCTTGATCTAATAACA 

TTTCCATTGGTTTTTATTACATTGTTGTTCC^TTTTAAATATGATATGATTCAGATGAAA 

AAGAGTTTGTGTTACAAGCCAATGA 
>G1143 Amino Acid Sequence (domain in AA coordinates: 33-82)
MGGGSRFQEPVRMSRRKQVTKEKEEDENFKSPNLEAERRRREKLHCRLMALRSHVPIVTN
MTKASIVEDAITYIGELQNNVKNLLETFHEMEEAPPEIDEEQTDPMIKPEVETSDLNEEM
KKLGIEENVQLCKIGERKFWLKIITEKRDGIFTKFMEVMRFLGFEIIDISLTTSNGAILI
SASVQTQELCDVEQTKDFLLEVMRSNP* 
>G1190 (209..2020)

TCCTGTCCCAAAACCAAAAGACTTGAGAGTGTGTCTTTAGAGAGAGATCTTCTCTCTTTT 
ATCTTACGACTCTCACTTCTTATCTCAAATCTACTTCAACTCTATTTCCAGTCTCCACAT 
TTTCCCACAAATTTCAACTCTTGTTCTCTTCCTCCAAAGTAAAAAACAAATCGTTGCAAG 
TGAGGTTTGGTTTTGGTGTTATAGAATTATGAAGAGCGGGAAGCAATCTTCGCAACCTGA 
AAAGGGTACTTCCAGGATCTTGTCACTGACTGTCCTGTTTATCGCATTTTGCGGTTTCTC 
CTTCTACCTCGGTGGTATATTTTGCTCTGAGAGAGACAAGATTGTAGCCAAGGATGTCAC 
AAGGACGACTACAAAGGCTGTAGCTTCCCCTAAAGAACCTACAGCTACTCCTATTCAAAT 
CAAATCCGTTTCTTTCCCGGAGTGCGGGTCAGAGTTCCAAGATTACACCCCGTGCACCGA 
TCCAAAGAGGTGGAAGAAGTATGGTGTCCATCGCTTAAGTTTCTTGGAGCGTCATTGTCC 
TCCGGTATATGAAAAGAATGAGTGTTTGATTCCACCACCAGACGGGTATAAACCGCCTAT 
AAGATGGCCCAAGAGCCGAGAACAGTGTTGGTACAGGAACGTGCCTTATGATTGGATCAA 
TAAGCAAAAGTCTAACCAGCATTGGCTTAAGAAAGAAGGAGATAAGTTCCATTTCCCTGG 
TGGTGGTACCATGTTCCCTCGTGGAGTTAGTCACTATGTTGATTTGATGCAAGATCTGAT 
TCCTGAAATGAAAGACGGAACAGTCAGGACCGCCATTGATACTGGCTGTGGGGTTGCGAG 
CTGGGGAGGCGATCTTTTGGACCGTGGGATACTATCACTCTCTCTTGCTCCAAGAGATAA 
CCATGAAGCTCAGGTTCAATTTGCTCTTGAACGTGGAATTCCTGCGATTCTCGGGATCAT 
CTCTACGCAACGTCTCCCTTTTCCTTCAAATGCATTTGATATGGCTCATTGTTCAAGATG 
TCTTATTCCCTGGACAGAATTTGGTGGAATCTATTTACTTGAGATTCACCGTATAGTTCG 
ACCTGGAGGTTTTTGGGTTCTTTCTGGTCCACCTGTGAACTATAATAGACGATGGCGTGG 
ATGGAACACAACCATGGAAGATCAGAAATCTGACTACAACAAGCTTCAGTCACTTCTAAC 
CTCCATGTGTTTCAAAAAGTACGCTCAAAAAGATGACATAGCCGTGTGGCAGAAACTCTC 
AGACAAATCTTGCTATGACAAAATCGCTAAGAACATGGAAGCTTACCCTCCCAAATGTGA 
CGACAGTATAGAACCTGATTCTGCTTGGTACACTCCACTCCGTCCTTGCGTGGTTGCCCC 
GACACCTAAAGTCAAGAAGTCTGGTCTCGGATCAATCCCAAAATGGCCCGAGAGGTTACA 
TGTCGCGCCCGAGAGAATCGGTGATGTTCACGGAGGGAGTGCGAACAGTTTGAAACACGA 
TGATGGTAAATGGAAGAACAGAGTTAAGCATTACAAGAAAGTTTTACCAGCTCTTGGGAC 
AGACAAGATAAGAAATGTTATGGATATGAACACTGTTTATGGAGGTTTCTCTGCGGCCCT 
CATTGAGGATCCCATTTGGGTCATGAACGTTGTATCATCGTACAGCGCAAATTCGCTTCC 
TGTTGTCTTTGATCGCGGTCTCATCGGGACTTACCACGACTGGTGCGAAGCTTTCTCAAC 
GTATCCAAGAACATATGATCTTCTTCACCTCGACAGTCTTTTTACCTTGGAGAGTCACAG 
GTGTGAGATGAAGTACATTTTGCTAGAGATGGACAGGATCTTGCGGCCGAGTGGATATGT 
TATAATCCGAGAATCGAGTTATTTCATGGACGCAATCACAACGTTAGCGAAAGGGATAAG 
GTGGAGTTGCCGGAGAGAGGAGACTGAGTATGCAGTCAAAAGTGAGAAGATTCTGGTTTG 
CCAGAAAAAGCTATGGTTTTCGTCAAACCAAACCTCTTGATGAGACCACCTGTATCATAG 
TGTTTATCATCTCCTGTGATGCACACTACAGAGAGAAGGATCTAGTCCTTTGAGTCCAAG 
ATATAGCTCTATAAACAATCTCCTTTTTTTGTTCTCTTTAATTTCTTGGGTATTTCACGG 
TATAGATTGATATTATATATTTTTTAATTATATTTTTAATATATAGATATATTAGTATGT
GGTTTAAACACTATTATTATCAAGGTCTTAAAGATTTGCTTTGCAAGAGTTAAAAAATGT 
TGGAGTAAGGACCTCTTGATTAATAAATTGACTGACGCAGCAAA 

>G1190 Amino Acid Sequence (domain in AA coordinates: entire protein) 

MKSGKQSSQPEKGTSRILSLTVLFIAFCGFSFYLGGIFCSERDKIVAKDVTRTTTKAVAS 

PKEPTATPIQIKSVSFPECGSEFQDYTPCTDPKRWKKYGVHRLSFLERHCPPVYEKNECL 

IPPPDGYKPPIRWPKSREQCWYRNVPYDWINKQKSNQHWLKKEGDKFHFPGGGTMFPRGV 

SHYVDLMQDLIPEMKDGTVRTAIDTGCGVASWGGDLLDRGILSLSLAPRDNHEAQVQFAL 

ERGIPAILGIISTQRLPFPSNAFDMAHCSRCLIPWTEFGGIYLLEIHRIVRPGGFWVLSG 

PPVNYNRRWRGWNTTMEDQKSDYNKLQSLLTSMCFKKYAQKDDIAVWQKLSDKSCYDKIA

KNMEAYPPKCDDSIEPDSAWYTPLRPCVVAPTPKVKKSGLGSIPKWPERLHVAPERIGDV

HGGSANSLKHDDGKWKNRVKHYKKVLPALGTDKIRNVMDMNTVYGGFSAALIEDPIWVMN

VVSSYSANSLPVVFDRGLIGTYHDWCEAFSTYPRTYDLLHLDSLFTLESHRCEMKYILLE
MDRILRPSGYVIIRESSYFMDAITTLAKGIRWSCRREETEYAVKSEKILVCQKKLWFSSN
QTS* 
>G1198 (230..1675)

TCTTTTCAAATTCCAATCATTTGATCAACTAATCAAGAATTAATTATAAGACTTTGCAA^ 

ctctctccctctccctctccctagctagttctctcttgtgtttcttaac^cSSctJ 

ACTAATCAAGAAGAGATATCATCAATTGAAGCTGTTTTCTTGAGTAGAGATGGC^r^A 

tagaatgagcgaagctacaaaccataaccacaatcatcatSS^ 
tggtctcaacaacaatcatccatcttctggtttcattaaccaagaS 



^^ gccacc ™ ttag P ag gaggaggaggagctacgactctggagatgttcccttc 
gtggccaatcagaactcaccaaactcttcctactgagagttccaagtcaggagS^^ 

gagcga ^ cag ™^ 



AGAATCAAGTAGGATAAAGCTTTCCCAATTGGAGCAAGAACTTCAGCGAGCT^ 



AACCGGTCTTCAGGCTCATTTATCTGACAATGATTTAAGGTTGATCGTTGACGGT 

n 



ss^r GGGAcATGGATGTccc ^ 



GTAT ^ CTAAAAG GCCAAGTTTCATTGTCTGTCGTAATTTCACCTATTTC 
CTTTAAAGTTGTACTAGAGAAAAGATAGGATCTTCCTTCG ^AATTTCACCTATTTC 



>G1198 Amino Acid Sequence (domain in AA coordinates: 173-2?)

'^'^^^^'^^^•' J ^^GLNNNHPS SGPIKQDGS SS FDFGELEEAIVLQGVKY 
P^S^^np^^^ 



gaaifdmeygrwleddnrhmseirtglqahlsdndlrlivdgyiah: 



ra™^ — "' u "" ,olnl uijyArtJjaUrJDLRIjI VDGYI AHFDEI FRLKAVAZV 

dvfhliigtwmspaercfivmagfrpsdlikilvsqmdllteqqlmgiyslohSo 

ALS^LEQLQQSLIDTIiAASPVIDGMQQMAVALGKISNLEGFIRQ^NLRQ 
^AAHCFLVIGEYYG^^ 

>G1226 (212..1159)

^^gagagagaagaagaagcagcagcagaagttc^ 

AAGACACCTATATCTAAATACTCAAAGrrACAAAAATATmcS?^ 
AGAA ™ CAATraGGTCAG ^^ 

AAGAGATCCGGGTCAAACAGATGACCCGGAAAAGGATCC^GAAC^GAaAa^^^ 
^^^^^^^^^^^ 

tggatttacggcgagattcggcggtggagatacgacagaagtggSSSS 
GAACCATGTGAGCTTAAAAGTTCGGTGTAAGAGAGGAAAACGACAGATCTTAAAAGCTAT
TGTCTCGATTGAAGAACTAAAGCTTGCGATTCTACATCTCACTATCTCTTCTTCCTTTGA 
CTTTGTCATCTACTCTTTCAATCTCAAGATGGAAGATGGTTGTAAATTAGGATCAGCAGA 
TGAGATAGCGACAGCCGTTCATCAGATCTTCGAGCAAATCAACGGTGAAGTCATGTGGTC 
AAATCTTAGTCGAACTTAGTTGACTTTTGACTCCTAGTAACGTGTGTAAACTTTAGGTTA 
CAAAGAAAAGGGACGTGATATAAATAAGAAAAACCAAAGAGGTGAAATTTTGGGAGTTTT 
AATTATTATCTTATACTTTTTGGATTTTAGATTAGTAGCAAACTCGCAGTGTTCTACGAT 
GACATTATTATTGGTCACATGAAGGTTTAGGTTAAAAAAAAA 

>G1226 Amino Acid Sequence (domain in AA coordinates: 115-174)
MSGLMSFGELEDQFGQISDTTMEEKIPFLQMLQCIEHPFTTTEPNQFLQSLLQIQTLESK 
SCLTLETNIKRDPGQTDDPEKDPRTENGAVTVKEKRKRKRTRAPKNKDEVENQRMTHIAV
ERNRRRQMNEHLNSLRSLMPPSFLQRGDQASIVGGAIDFIKELEQLLQSLEAEKRKDGTD
ETPKTASCSSSSSLACTNSSISSVSTTSENGFTARFGGGDTTEVEATVIQNHVSLKVRCK
RGKRQILKAIVSIEELKLAILHLTISSSFDFVIYSFNLKMEDGCKLGSADEIATAVHQIF
EQINGEVMWSNLSRT* 
>G1451 (124..2559)

TTTGTACTTCCGGAGCTAAAGAGTTATAGCTACTGTAGTAGCTGGAAGTGAAGAAGATTT 

TTTAATAGATTGTACGGAAAAATTAGGGTTTTCAAAGTTTGGTTTCTTGAAGTTGAATTA 

GACATGAAGCTGTCAACATCTGGATTGGGTCAACAGGGTCATGAAGGAGAGAAGTGTCTG 

AATTCTGAGCTATGGCATGCTTGTGCTGGACCATTAGTCTCTCTTCCATCATCTGGTAGT 

CGAGTTGTTTACTTTCCACAGGGTCACAGTGAACAGGTAGCTGCTACAACTAATAAGGAA 

GTTGATGGTCACATACCCAATTACCCAAGCCTACCACCACAATTGATATGCCAGCTCCAT 

AATGTTACAATGCATGCAGATGTTGAGACGGATGAAGTCTATGCTCAAATGACACTTCAA 

CCATTGACACCGGAGGAGCAGAAGGAAACATTTGTACCGATTGAGTTGGGGATACCGAGT 

AAGCAACCTAGTAATTATTTTTGTAAGACTCTCACAGCTAGTGATACCAGTACACATGGA 

GGGTTTTCTGTTCCTAGACGTGCTGCTGAGAAAGTGTTTCCTCCATTGGATTACACACTG 

CAGCCACCAGCTCAAGAACTGATTGCAAGGGATCTCCATGATGTTGAATGGAAGTTTAGG 

CATATCTTTCGGGGACAGCCCAAACGGCATCTCCTAACTACTGGATGGAGTGTCTTTGTC 

AGTGCCAAGCGACTAGTAGCTGGAGATTCTGTCATTTTCATCAGGAATGAAAAGAATCAA 

CTCTTTTTGGGAATTCGTCATGCCACTCGGCCGCAGACTATTGTACCATCATCTGTTTTA 

TCTAGTGATAGCATGCATATTGGACTCCTTGCTGCTGCTGCACATGCTTCTGCAACTAAT 

AGCTGTTTCACTGTTTTCTTTCATCCAAGGGCTAGCCAATCTGAGTTTGTGATACAACTT

TCCAAGTACATTAAAGCCGTTTTTCACACGCGTATTTCAGTTGGGATGCGCTTTCGCATG 

CTCTTCGAGACAGAAGAGTCGAGTGTCCGCAGGTACATGGGTACTATAACTGGTATTAGT 

GATCTAGATTCTGTTCGTTGGCCAAACTCTCATTGGCGATCTGTGAAGGTTGGTTGGGAT 

GAATCGACTGCAGGGGAGAGACAGCCAAGGGTTTCTTTATGGGAGATTGAGCCTCTGACT 

ACCTTTCCTATGTATCCATCTCTTTTTCCTCTCAGACTAAAACGTCCATGGCATGCTGGC

ACATCATCTTTGCCTGATGGAAGGGGTGATTTGGGAAGTGGTCTAACATGGCTAAGAGGG 



TGGATGCAACAAAGGCTGGATCTCAGTCAAATGGGGACTGATAATAATCAGCAATACCAA 

GCAATGTTAGCTGCTGGGTTGCAGAACATCGGCGGTGGAGATCCTTTAAGACAGCAGTTT 

GTACAGCTGCAAGAGCCTCACCACCAATATCTTCAACAATCAGCTTCCCATAATTCTGAT 

TTGATGCTTCAGCAGCAACAGCAGCAACAAGCGTCACGCCATCTCATGCATGCTCAAACA 

CAGATTATGAGTGAGAATCTTCCGCAGCAGAATATGCGACAAGAAGTTAGTAACCAACCA 

GCTGGACAGCAGCAACAGCTACAGCAACCGGACCAAAATGCATATCTTAATGCTTTCAAA 

ATGCAAAATGGCCATCTTCAACAGTGGCAGCAGCAATCAGAGATGCCATCTCCCTCGTTC 

ATGAAGTCAGATTTTACTGACTCAAGCAACAAATTTGCAACAACTGCTAGTCCGGCTTCT 

GGAGATGGCAATCTTTTGAATTTTTCTATAACCGGTCAGTCTGTACTCCCTGAGCAGTTA 

ACAACAGAGGGCTGGTCTCCAAAAGCATCCAACACTTTTTCTGAACCGTTGTCACTTCCA 

CAAGCCTATCCTGGGAAGAGTCTTGCTCTAGAACCCGGAAATCCGCAGAATCCCTCTCTT 

TTCGGTGTTGATCCCGACTCTGGACTCTTCCTCCCCAGTACGGTTCCCCGCTTTGCTTCT 

TCATCAGGAGATGCTGAAGCTTCCCCTATGTCACTAACAGATTCAGGATTTCAGAATTCC 

TTATATAGCTGCATGCAAGACACAACTCATGAGTTATTGCATGGAGCTGGACAGATTAAC 

TCGTCCAACCAAACCAAGAACTTTGTAAAGGTTTATAAATCTGGTTCGGTTGGGCGTTCA 

TTAGACATCTCCCGATTCAGCAGCTACCACGAGCTGCGAGAAGAGTTAGGGAAGATGTTT 

GCTATCGAAGGGTTGTTGGAAGACCCCCTTAGATCAGGCTGGCAGCTTGTATTCGTTGAC 

AAGGAAAATGATATTCTTCTCCTTGGTGATGACCCATGGGAGTCATTTGTGAATAACGTT 
TGGTACATAAAGATACTATCACCAGAAGATGTGCATCAAATGGGAGATCATGGAGAAGGC
AGTGGTGGGTTATTCCCGCAAAACCCGACCCATCTCTAGAAGCTGCTTCGGTGTTAGTCT 

===== SEEKS 

^ssssssssssssssssss^ 

>G1451 Amino Acid Sequence (domain in AA coordinates: [illegible])

DGHIPNYPSLPPQLICQLHNVTMHADVETDEVYAQMTLQPLTPEEQKETFVPIELGIPSK

QPSNYFCKTLTASDTSTHGGFSVPRRAAEKVFPPLDYTLQPPAQELIARDLHDVEWKFRH

^smhiguaaaahasatosc^ 



itTLNAFKM 
/LPEQLT 



tegwspkasntfseplslpqaypgksi^epgnpqnpslfgvdpdsglfIpsS^ 



>G1478 (1..354)

ATGTGTAGAGGGTTTGAGAAAGAAGAAGAGAGAAGAAGCGACAATGGAGGATGCCAAAGA 

ctatgcacggagagtca^gctccggtaagctgtgagctttgcggJgSaIcg^^ 

GCTAATTTTCTAGCTCGGAGACATCTCCGGCGCGTGATCTGCACGACCTGTCGGAArrTa 
ACTCGTCGATGTCTTGTCGGTGATAAT^^ 

GCAAGGATTGAAGAACATAGTAGTGATCACAAAArrCCCT^ 

>G1478 Amino Acid Sequence (domain in AA coordinates: 32-76)

JgI™^ 1 ™^ 

AAACCCACCAAATAACTCAGAGCTTTTTTGCATTTTTTCCCATTCTCTATT^ 
GGAAGGTCTTCTCTCTCAAGAAAGCTTGTCOTAAACTCTATGGACATC^ 

J AA ^r ACCTGAACTOCTTCAGATACTTCAG ^CCATGGAAGCAAS 
r^ GG ^™ GmCAGCCAAmCA ^^ 

CATGGGTTTTGGTCCTCCA^TGAATCCATTTCAAGAACAAGTAGCTGC^ 



GAACAAGAGAAAACCAGAGGTTAAGACAAGGGAAGAGCAAAAGACAGAGAAGAAGATPAA 



AAGACAAGTCGAGTTCCTGTCGATGAAAClTGCTGTCTTGAACCCGG^CTAGAGOTT^n 
a^ GG ^ GTATCCGTAAAACAGGCTOA ™ 

ACAGAG ^^ ACAACACATC TAGCCTCGG T TTTCATTACTAAGCAAG^^ 

caaagtagtaatttcgaaatttggttaatgcattatcctttgatccttgS^ 

OTAAACCAGAAGAACTGGAGATAGCAATCCAATGATCTTGTCAcS ™ ' 



:.r.3== 
LKNKRKPEVKTREEQKTEKKIKVEAETESSMKGKSNMGNTEASSDTSKETSKGASENQKL

DYIHVRARRGQATDRHSLAERARREKISKKMKYLQDIVPGCNKVTGKAGMLDEIINYVQC 

LQRQVEFLSMKLAVLNPELELAVEDVSVKQAYFTNWASKQSIMVDVPLFPLDQQGSLDL 

SAINPNQTTSIEAPSGSWETQSQSLYNTSSLGFHY* 

>G1526 (1..3090)

ATGGGAACGAAAGTCTCAGACGATCTTGTTTCCACCGTCAGATCAGTCGTGGGTTCCGAT 

TACTCAGATATGGATATAATCAGGGCTTTACACATGGCGAATCATGATCCAACGGCTGCT 

ATCAATATAATCTTCGACACTCCAAGTTTCGCCAAACCTGATGTAGCCACTCCTACCCCG 

AGCGGCTCTAATGGAGGGAAGCGAGTTGATAGTGGATTAAAGGGCTGTACTTTTGGTGAC 

AGCGGAAGTGTTGGAGCGAATCATCGCGTGGAGGAAGAAAATGAGAGTGTTAATGGTGGA 

GGAGAAGAGAGTGTTTCAGGGAATGAGTGGTGGTTTGTTGGTTGTTCTGAATTGGCTGGG 

TTATCGACATGTAAAGGAAGGAAATTGAAGTCTGGTGATGAATTGGTGTTCACGTTTCCG 

CATAGTAAAGGATTAAAGCCTGAGACTACGCCTGGGAAGCGCGGTTltTGGGCGGGGAAGG 

CCAGCTTTGCGTGGTGCTTCTGATATCGTTAGGTTCTCTACAAAGGATTCAGGAGAGATT 

GGTAGAATACCAAACGAGTGGGCTCGGTGTCTTCTACCACTTGTGAGAGACAAGAAAATT 

AGGATAGAAGGCAGTTGCAAGTCGGCGCCTGAAGCTTTGAGCATCATGGATACAATTCTT 

CTGTCTGTAAGCGTGTACATTAATAGTTCCATGTTTCAAAAGCATAGTGCGACTTCATTT 

AAGACAGCTAGTAATACGGCAGAGGAATCAATGTTCCATCCTCTCCCAAATCTCTTTCGG 

TTACTCGGTTTGATCCCCTTTAAGAAGGCAGAGTTTACTCCAGAGGATTTTTACTCTAAG 

AAGCGACCTTTGAGTTCCAAGGATGGTTCTGCTATTCCTACTTCGTTGCTTCAATTAAAC 

AAGGTCAAGAATATGAATCAAGATGCAAACGGAGATGAAAATGAGCAGTGTATCAGCGAT 

GGTGATCTTGATAACATTGTTGGTGTTGGGGACAGTTCTGGATTAAAGGAAATGGAAACT 

CCACATACACTTCTGTGTGAGCTTCGTCCATACCAAAAGCAGGCACTTCATTGGATGACC 

CAACTGGAGAAAGGAAATTGCACTGATGAGGCAGCAACAATGCTTCACCCGTGTTGGGAA 

GCATACTGTTTAGCAGACAAGAGGGAACTGGTTGTCTACCTGAATTCTTTTACTGGTGAT 

GCTACAATACACTTCCCTAGCACACTTCAAATGGCAAGAGGAGGAATATTAGCAGACGCA 

ATGGGTCTTGGAAAGACTGTAATGACCATATCCCTTTTGCTTGCCCATTCTTGGAAAGCT 

GCATCAACTGGGTTTCTATGCCCCAACTATGAAGGAGACAAAGTGATCAGCAGTTCTGTA 

GATGATCTCACTAGTCCCCCGGTGAAGGCAACCAAATTTCTAGGCTTTGATAAGAGGCTT 

CTTGAACAAAAAAGTGTACTTCAAAATGGTGGTAACCTGATTGTATGTCCGATGACACTT 

TTAGGACAGTGGAAGACAGAGATTGAAATGCATGCAAAGCCTGGGTCTCTATCTGTCTAT 

GTTCACTATGGGCAAAGCAGGCCGAAGGATGCAAAACTTCTTTCCCAGAGTGATGTGGTA 

ATCACCACATATGGAGTTCTAACATCCGAATTCTCGCAAGAGAACTCAGCAGACCATGAA 

GGAATTTATGCAGTTCGATGGTTTAGGATTGTTCTTGACGAGGCACATACCATCAAAAAC 

TCAAAAAGCCAAATTTCCTTGGCTGCTGCAGCTCTGGTTGCTGATAGGCGTTGGTGTCTT 

ACGGGTACTCCTATTCAGAACAATCTGGAGGATTTATACAGCCTTCTACGGTTTTTGAGG 

ATTGAACCATGGGGAACTTGGGCATGGTGGAATAAACTTGTCCAAAAGCCATTTGAAGAG 

GGTGATGAGAGAGGGTTAAAGCTAGTGCAGTCTATCTTAAAACCTATCATGCTTAGGAGA 

ACAAAGTCTAGCACAGACCGAGAAGGAAGGCCGATTCTTGTTCTACCCCCTGCTGATGCA 

CGGGTCATTTACTGTGAACTTTCGGAGTCTGAGAGGGATTTCTACGACGCGCTATTTAAA

AGATCCAAGGTCAAATTTGATCAATTTGTTGAACAAGGCAAAGTTCTTCATAACTATGCT 

TCGATCCTGGAACTGCTTTTGCGTCTTCGACAATGTTGTGATCACCCATTTTTAGTAATG 

AGTCGAGGGGATACAGCGGAATACTCTGATCTGAATAAGCTTTCTAAACGTTTCCTTAGT 

GGAAAGTCTTCTGGCTTAGAAAGGGAAGGAAAAGATGTACCGTCAGAGGCTTTTGTTCAG 

GAGGTGGTAGAGGAACTGCGCAAAGGAGAGCAAGGAGAGTGTCCAATATGCCTTGAAGCA 

CTTGAGGATGCTGTATTAACGCCATGTGCTCATAGATTATGTCGTGAGTGTCTCTTGGCA 

AGTTGGAGAAATTCTACTTCTGGGTTATGTCCTGTGTGTAGGAACACTGTAAGCAAACAA 

GAACTCATCACAGCACCAACCGAAAGTAGATTCCAGGTTGACGTGGAAAAGAATTGGGTG 

GAATCATCGAAAATCACTGCTCTTCTGGAAGAGCTTGAAGGTCTTCGTTCTTCAGGCTCT 

AAGAGCATTCTCTTTAGCCAGTGGACCGCTTTCCTCGATCTCCTCCAAATTCCCCTCTCT 

CGGAATAACTTTTCATTTGTCCGTCTTGATGGCACGCTAAGTCAGCAGCAACGAGAGAAG 

GTCCTTAAAGAATTTTCCGAAGATGGCAGTATCCTGGTACTGTTGATGTCTCTAAAAGCT 

GGTGGCGTTGGGATAAATCTAACAGCTGCGTCCAATGCTTTTGTCATGGATCCATGGTGG 

AACCCAGCGGTAGAGGAACAAGCTGTTATGCGTATTCATCGTATAGGGCAAACTAAGGAA 

GTCAAAATCAGAAGATTCATCGTTAAGGGAACGGTTGAAGAGAGAATGGAGGCGGTTCAG 

GCGAGGAAGCAGAGAATGATCTCTGGGGCTTTAACCGATCAAGAAGTACGAAGTGCACGT 

ATAGAGGAACTCAAGATGTTATTTACCTGA 
>G1526 Amino Acid Sequence (domain in AA coordinates: 493-620, 864-1006)

MGTKVSDDLVSTVRSVVGSDYSDMDIIRALHMANHDPTAAINIIFDTPSFAKPDVATPTP

SGSNGGKRVDSGLKGCTFGDSGSVGANHRVEEENESVNGGGEESVSGNEWWFVGCSELAG 

g^pnewarcllplvrdkkiriegscksapealsimdtillsvsvyinL 

KTASNTAEESMFHPLPNLFRLLGLIPFKKAEFTPEDFYSKKRPLSSKDGSAIPTSLLQLN
^mnqdangdeneqcisdg^^ 

mglgktvmtislllahswkaastgflcpnyegdk^ 

leqksvlqnggnlivcpmtllgqwkteiemhakpgslsvyvhygqsrp^aSlsqs^ 
ittygvltsefsqensadhegiyavrwfrivldeahtiknsksqisla^aSSSScI 

TKSSTDREGRPILVLPPADARVIYCELSESERDFYD^ 



ELTTA P^^^^ ^ ^^^^^R^R^L^S W^STSG^C WC^^SKQ 

^sfvrldgtlsqqqrekvl^ 

>G1543 (1..828) 

CA ™ GG 5 TACGCATCCGTGT ^^^ 

ctagctcttaagaaccctaataattcaitgatcaaaataatggcgattttS 

tcttcaaacttggatcttactatctccgttccaggcttctcttcatcccctctctccgat 

g^ggaagtggcggaggaagagaccagctaaggctagacatgaatcgS 

ctccgtctaaccagagaacagtcacgtcttcttgaagatagtttc^gacagaa^Sc^ 
GAAG ^ TTOCAAAACCG ™^^ 

GAGTATCTCAAAAGGTGGTTTGGTTCATTAACGGAAGAAAACrAf'afsnnTnonSAo^JA^ 



^ ^^^^^ccataaaggttggcccaacaacggtgaactctgcctcgagcctt 
actatgtgtcctcgctgtoagcgagttacccctgccgcgagcccttcgagScSto^ 
ccggttccggctaagaaaacgtttccgccgcaagagcgtgatcgttga 

>G1543 Amino Acid Sequence (domain in AA coordinates: 135-195)

^^p^slikimailpenssnldltisvpgfsssplsdegsgggrdqlrlS---- 
edgddeefshddgsapprkklrltreqsiu.ledsfromh^^^^™^?^ 



" v " r wiiKKrtKbiUjKQXEMECEYLKRWFGSLTEENHI 
TMCPRCERVTPAASPSRAWPVPAKKTFPPQERDR*
>G162 (101..619)

==== 

S?S?™^ A ^ GACATC ^^ 

lol^ G ^r^^^^^TTTTTCGCAAAAAAAA 

>G162 Amino Acid Sequence (domain in AA coordinates: 2-57)

MGRRKIKMEMVQDMNTRQVTFSKRR^^ 
NLDSVAERFMREYDDSDSGDEEKSGNYRPKLKRLSERLDLLNQEVEAEKERGEKSQEKLE
SAGDERFKESIETLTLDELNEYKDRLQTVHGRIEGQVWHLQASSCLMLLSRK* 
>G1640 (168..1196)

TTCGCCAGATCCTTCCTCTATATAAGGAAGTTCATTTCATTTGGAGAGGTTTCGCTGACA 
AGCTGCTCTAGCTTATCTGGTACCGTCGACCTCTCACTCAAGGGTCCAAAAGTGTTTTCT 
CTTTTTCAGTTTCTCTTTCTCTTTTTGACAGAAGAGACCGAGAAGCAATGGGAAGGGCTC 
CGTGTTGTGAGAAAATCGGGTTGAAGAGAGGGAGATGGACAGCCGAGGAAGATGAGATCC 
TCACCAAGTATATTCAGACCAATGGTGAAGGTTCTTGGCGATCTTTGCCTAAGAAAGCTG 
GATTGTTGAGATGTGGAAAGAGCTGTAGACTAAGGTGGATAAACTACTTAAGAAGAGACT 
TAAAAAGAGGAAATATTACTTCCGACGAAGAAGAAATAATCGTCAAGTTGCATTCCCTTC 
TCGGCAACAGATGGTCACTTATTGCAACACATCTACCAGGAAGAACAGACAACGAAATTA 
AAAACTATTGGAACTCACATCTCAGCCGCAAAATCTATGCCTTCACTGCCGTTTCCGGAG 
ATGGACACAATCTACTCGTCAACGATGTAGTCTTGAAGAAATCTTGTTCATCGTCTTCTG 
GAGCCAAGAACAATAACAAGACCAAGAAGAAGAAGAAGGGAAGGACTAGTAGGTCATCCA 
TGAAGAAACACAAGCAAATGGTGACGGCCTCACAATGTTTCTCACAACCTAAGGAGCTAG 
AGAGTGATTTCAGTGAGGGAGGGCAAAATGGTAATTTTGAAGGAGAGTCTTTGGGGCCTT 
ATGAGTGGTTGGATGGTGAGTTAGAACGGCTCTTGAGTAGTTGTGTCTGGGAATGCACTA 
GTGAAGAGGCTGTGATTGGAGTAAATGATGAAAAGGTGTGTGAGAGTGGGGACAATAGTA 
GTTGTTGTGTTAATTTGTTTGAAGAAGAACAAGGAAGCGAGACAAAGATTGGTCACGTAG 
GAATCACAGAGGTTGATCATGATATGACGGTGGAAAGAGAAAGAGAGGGAAGTTTTTTAA 
GTTCGAATTCAAATGAAAATAATGATAAAGATTGGTGGGTTGGTCTATGTAATTCTTCAG 
AAGTTGGGTTTGGGGTTGATGAGGAGTTGCTTGATTGGGAGTTTCAAGGTAATGTCACTT 
GTCAAAGTGATGATCTATGGGATCTCTCAGATATTGGAGAGATAACATTGGAGTGATTGT 
ACCGAGCAAGTGGATTGGCGGCCGCTCTAGACAGGCCTCGTACCGGATCTCTAGCTAGAG 
CTTTCGTTCGTATCATCGGTTTCGACAACGTTCGTCAAGT 

>G1640 Amino Acid Sequence (domain in AA coordinates: 14-115) 
MGRAPCCEKIGLKRGRWTAEEDEILTKYIQTNGEGSWRSLPKKAGLLRCGKSCRLRWINY
LRRDLKRGNITSDEEEIIVKLHSLLGNRWSLIATHLPGRTDNEIKNYWNSHLSRKIYAFT
AVSGDGHNLLVNDVVLKKSCSSSSGAKNNNKTKKKKKGRTSRSSMKKHKQMVTASQCFSQ
PKELESDFSEGGQNGNFEGESLGPYEWLDGELERLLSSCVWECTSEEAVIGVNDEKVCES
GDNSSCCVNLFEEEQGSETKIGHVGITEVDHDMTVEREREGSFLSSNSNENNDKDWWVGL
CNSSEVGFGVDEELLDWEFQGNVTCQSDDLWDLSDIGEITLE* 
>G1644 (1..348) 

ATGAAATTGATTGATTGGAAAGACTGTGCTTTGATGACTTACACCGAACTCATTTTGGGT 
TTCTGCAATGTTTTAATGTTGATCTGCAGGAGGACTAGTGGACCTATGAGACGAGCAAAA 
GGTGGTTGGACTCCAGAGGAGGATGAGACACTTAGACGAGCAGTTGAAAAGTATAAGGGG 
AAGAGGTGGAAGAAAATAGCGGAATTTTTCCCAGAGAGAACACAAGTCCAATGCTTGCAC 
AGGTGGCAGAAAGTTCTTAATCCAGAGCTTGTTAAAGGACCTTGGACTCAAGAGGTTCTC 
TTATCATTTTCATGTTCTGAAACTTTTTTTGGTTTTCATTTTACGTAA 

>G1644 Amino Acid Sequence (conserved domain in AA coordinates: 39-102)
MKLIDWKDCALMTYTELILGFCNVLMLICRRTSGPMRRAKGGWTPEEDETLRRAVEKYKG
KRWKKIAEFFPERTQVQCLHRWQKVLNPELVKGPWTQEVLLSFSCSETFFGFHFT*
>G1646 (34..786)

GATCTTTTGATCCAATCACAAGGCAAAGATCCAATGGACAATAACAACAACAACAACAAC 
CAGCAACCACCACCAACCTCCGTCTATCCACCTGGCTCCGCCGTCACAACCGTAATCCCT 
CCTCCACCATCTGGATCTGCATCAATAGTCACCGGAGGAGGAGCGACATACCACCACCTC 
CTCCAGCAACAAGAGCAACAGCTTCAAATGTTCTGGACATACCAGAGACAAGAGATCGAA 
CAGGTAAACGATTTCAAAAACCATCAGCTCCCTCTAGCTCGTATCAAAAAAATCATGAAA 
GCTGATGAAGATGTGCGTATGATCTCCGCCGAAGCACCGATTCTCTTCGCGAAAGCTTGT 
GAGCTTTTCATTCTCGAACTTACGATTAGATCTTGGCTTCACGCTGAAGAGAACAAACGT 
CGTACGCTTCAGAAAAACGATATCGCTGCTGCGATTACTAGAACCGATATCTTCGATTTC 
CTTGTTGATATTGTTCCTAGGGAAGAGATCAAGGAAGAGGAAGATGCAGCATCGGCTCTT 
GGTGGAGGAGGTATGGTTGCTCCCGCCGCGAGCGGTGTTCCTTATTATTATCCACCGATG 
GGACAACCGGCGGTTCCTGGAGGGATGATGATTGGAAGACCGGCGATGGATCCTAGCGGT 
GTTTATGCTCAGCCTCCTTCTCAGGCATGGCAAAGCGTTTGGCAGAATTC^GCTGGTGGT 
GGTGATGATGTGTCTTATGGAAGTGGAGGAAGTAGCGGCCATGGTAATCTCGATAGCCAA 
GGGTAAGTGAATTCTAGTAG 
>G1646 Amino Acid Sequence (domain in AA coordinates: 72-162)
MDNNNNNNNQQPPPTSVYPPGSAVTTVIPPPPSGSASIVTGGGATYHHLLQQQQQQLQMF
WTYQRQEIEQVNDFKNHQLPLARIKKIMKADEDVRMISAEAPILFAKACELFILELTIRS
WLHAEENKRRTLQKNDIAAAITRTDIFDFLVDIVPREEIKEEEDAASALGGGGMVAPAAS
GVPYYYPPMGQPAVPGGMMIGRPAMDPSGVYAQPPSQAWQSVWQNSAGGGDDVSYGSGGS 

SGHGNLDSQG* 
>G1672 (239..1399)

CCATTCCTGACGTCCGGGATGACGCACAATCCCACTATCCTTCGCAAGACCCTTCCTCTA 

TATAAGGAAGTTCATTTCATTTGGAGAGGACACGCTGACAAGCTGACTCTAGCAGATCTG 

GTACCGATCACTCCCGTCTTTATCAAATTCTTCTTCCTCTTACATTTTCCCTATCCAATC 

GATCTCACGCAGATCTGATCAATTTCTCATCAAATCATTTAGAGATCAAAAGAAAACTAT

GAAGAATAGTAAATGTAACCTCATAGATTCAAAGCTCGAAGAACATCATCATCTTTGCGG 

ATCAAAACATTGTCCTGGATGTGGTCGCATGATTCAAGCTGCTACTAAACCAAATTGGGT 

TGGATTGCCGGCAGGAGTGAAATTCGATCCGACAGATCAAGAACTTATAGAACATTTAGA 

AGCAAAAGTGAAGGGAAAAGAAGAAAATAAGAAATGGTCGTCGTCTCATCCACTTATAGA 

TGAATTTATTCCCACCATTGATGGAGAAGATGGAATATGTTACACTCATCCTCAGAAGCT 

TCCAGGGGTGACAAGAGATGGCTTGAGCAAACACTTCTTCCACAAACCATCAAGAGCTTA 

CACAACCGGAACAAGAAAACGACGTAAAATAATTCAAACCGATCACGACTCTGAGTTAAC 

CGGATCATCAGAAACCAGGTGGCACAAAACGGGCAAAACAAGACCGGTTATGATCAACGG 

TCAACAAAGAGGATGCAAGAAGATATTAGTACTCTACACAAACTTCGGCAAGAATCGTCG 

ACCGGAGAAAACAAATTGGGTGATGCATCAATATCATTTAGGGATTAATGAGGAAGAGAG 

AGAAGGAGAACTTGTGGTCTCCAAGATATTTTATCAGACACAACCAAGACAGTGTGTTAG 

TAATACTAATTGGTCTGATCACCATGGTTCCAAGGACGTGATCGGAATTGGTGTCGGAGA 

TGAGATTTCCAGCGTAGCTGCCACGTTGCAGAGTCTTGGCTCCGGTGACGTCGTTTCTAG 

GGTTAATATGCATCCCCATACAAGATCCTTTGATGAGGGGACAGCCGAAGCTTCAAAGGG 

AAGAGAGAACCAGCATGTGTCTGGCACGTGCGAGGAAGTACATGATGGGATCATAACATC 

ATCAATGTCATCTCATCATATGATTCATGATCATCATAATCAACATCATCAAATCGGAGA 

TAGAAGAGAATTTCACATGTCATCATCATATCCCATGACCCCTACTATCACATCACAACA 

TGAGTCAATCTTCCATGTTACAAGTACTATGCCCTTTCAGCGGCAGCAATTAAGGGGTCG 

GTCGTCTGGTTCGGGATTAGAAGACCTAATTATGGGTTGTACCACAGCTACGTGTACAGA 

AGACAATAATCACAAATGATTAAATTCGCAGGAGCATTCAGAAGCAAACCCTCAGCGAAA 

TGCAGAGTGGTTAACGTTTCCACAATTCTGGAACCAAGCCGAATCAGATGATCAAAACCG 

AAGATTTTAACAG7UVCCAAAAGGAAGCAGAGAAATCTTGCAAAAAGCTCCTGCTTAGCTG 

TTGATCAATGCCGGAAATGCTGAGCTATGACTGACTAGTCTCTGCCATTTAACTTACAAT 

ATCACCAGAGGTTGCGATGAATGTTGATTCGCTCAAAGGAGAGCGGCCGCTCTAGACAGG 

CCTCGTACCG 

>G1672 Amino Acid Sequence (conserved domain in AA coordinates: 41-194)

MKNSKCNLIDSKLEEHHHLCGSKHCPGCGRMIQAATKPNWVGLPAGVKFDPTDQELIEHL

EAKVKGKEENKKWSSSHPLIDEFIPTIDGEDGICYTHPQKLPGVTRDGLSKHFFHKPSRA

YTTGTRKRRKIIQTDHDSELTGSSETRWHKTGKTRPVMINGQQRGCKKILVLYTNFGKNR

RPEKTNWVMHQYHLGINEEEREGELVVSKIFYQTQPRQCVSNTNWSDHHGSKDVIGIGVG

DEISSVAATLQSLGSGDVVSRVNMHPHTRSFDEGTAEASKGRENQHVSGTCEEVHDGIIT
SSMSSHHMIHDHHNQHHQIGDRREFHMSSSYPMTPTITSQHESIFHVTSTMPFQRQQLRG
RSSGSGLEDLIMGCTTATCTEDNNHK*
>G1677 (24..1037)

CAGTACTAATTCTGTGTGTGTTAATGGTTCTAGTTATGGATGATGAAGAGAGTAACAACG
TTGAAAGATATGACGACGTCGTATTGCCAGGGTTTAGGTTCCATCCCACTGATGAAGAAC 
TCGTAAGTTTCTACTTGAAACGGAAGGTTTTACACAAATCTCTTC 

AGAAAGTCGACATTTACAAATACGATCCATGGGACCTCCCAAAGCTTGCAGCGATGGGGG 
AAAAAGAGTGGTACTTTTATTGTCCTAGAGACAGGAAATACCGCAACAGCACAAGACCTA 
ACCGAGTAACTGGAGGTGGCTTCTGGAAAGCAACCGGAACAGACCGGCCTATATACTCAT 
TGGACTCCACTCGATGCATCGGTTTGAAGAAATCACTTGTGTTCTACCGTGGTCGAGCTG 
CTAAAGGAGTCAAAACCGATTGGATGATGCATGAATTTCGTCTCCCTTCTCTCTCTGACT 
CTCATCACTCATCATATCCCAATTACAATAACAAGAAGCAACA 

ACAGCAAGGAGCTTCCTTCAAACGATGCTTGGGCGATATGTAGAATATTTAAGAAGACAA 
ATGC^GTATCCTCACAAAGATCAATCCCACAATCTTGGGTTTATCCAACGATTCCTGACA 
ACAATCAACAGTCACACAACAACACCGCAACTCTCTTAGCTTCATCAGACGTTCTCAGCC
ACATATCAACAAGACAAAACTTTATTCCTTCTCCAGTCAACGAACCCGCAAGCTTCACAG
AATCAGCTGCTTCTTACTTCGCGTCTCAGATGCTCGGAGTCACGTACAATACAGCCAGAA 
ACAACGGAACAGGGGATGCTCTGTTTCTGAGAAACAATGGAACAGGGGATGCTCTGGTTC 
TGAGCAACAATGAGAATAACTACTTCAACAACTTGACTGGAGGGTTGACTCATGAGGTTC 
CGAATGTAAGATCAATGGTGATGGAGGAGACTACGGGGAGTGAGATGTCGGCGACGTCGT 
ATTCCACTAACAATTAAGATCATAGTACTATTAACACTTGAATTAGTGTAGACGTTGATC 
ATCGCTAATATGTATTAATTTTTCTTGTCTTACTATAAACGAAAAAAAAA 

>G1677 Amino Acid Sequence (conserved domain in AA coordinates: 17-181)
MVLVMDDEESNNVERYDDVVLPGFRFHPTDEELVSFYLKRKVLHKSLPFDLIKKVDIYKY
DPWDLPKLAAMGEKEWYFYCPRDRKYRNSTRPNRVTGGGFWKATGTDRPIYSLDSTRCIG
LKKSLVFYRGRAAKGVKTDWMMHEFRLPSLSDSHHSSYPNYOT 

DAWAICRIFKKTNAVSSQRSIPQSWVYPTIPDNNQQSHNNTATLLASSDVLSHISTRQNF 
IPSPVNEPASFTESAASYFASQMLGVTYNTARNNGTGDALFLRNNGTGDALVLSNNENNY
FNNLTGGLTHEVPNVRSMVMEETTGSEMSATSYSTNN*
>G1765 (139..966)

TCCTTCGCAAGACCCTTCCTCTATATAAGGAAGTTCATTTCATTTGGAGAGGACACGCTG 
ACAAGCTGACTCTAGCAGATCTGGTACCGTCGACAAGAATGACTTGATTGGTGTTCTAAA 
GAGATCGATGTAGTGAAGATGAGTGGCGAAGGTAACTTAGGTAAGGATCATGAAGAAGAA 
AACGAAGCACCACTTCCTGGGTTCAGGTTTCATCCGACGGATGAAGAGCTTTTAGGATAC 
TATCTTCGAAGAAAAGTAGAGAACAAAACCATCAAACTCGAACTTATCAAACAGATCGAT 
ATCTATAAGTACGATCCTTGGGATCTTCCAAGAGTGAGCAGCGTCGGAGAAAAGGAGTGG 
TACTTCTTCTGCATGAGAGGTAGGAAATACAGGAATAGCGTTCGACCAAACCGAGTGACC 
GGTTCAGGTTTCTGGAAAGCCACTGGTATTGATAAACCGGTTTACTCCAATCTTGACTGT 
GTTGGTCTCAAGAAATCTCTGGTTTACTATCTTGGTTCAGCCGGTAAAGGCACCAAAACC 
GATTGGATGATGCATGAATTCCGCCTCCCCTCCACCACGAAAACCGACTCTCCAGCTCAA 
CAAGCAGAGGTATGGACACTTTGCAGAATCTTCAAACGAGTCACATCTCAAAGAAACCCA 
ACCATCTTACCACCAAACCGAAAACCGGTTATCACTTTAACCGACACTTGTTCTAAGACC 
AGCAGCTTAGATTCCGACCACACGAGCCACCGTACAGTAGATTCCATGTCCCACGAGCCG 
CCGCTTCCACAGCCACAGAATCCTTATTGGAACCAACATATAGTTGGTTTTAATCAACCG 
ACATATACTGGTAATGATAATAACCTCCTGATGAGTTTCTGGAACGGCAACGGTGGAGAT 
TTCATAGGAGACTCAGCAAGTTGGGATGAACTTAGATCTGTTATAGATGGCAACACTAAA 
CCCTAGTAATAAAGTTTCCTTTTTTCAGCTTTGTACAAAAAGATAAAACAAACGGCAACC 
GCTCTAGACAGGCCTCGTACCGGGATCCTCTAGCTAGAGCTTTCGTTTCGTATCATCGGT 
TTCGACAACGTTCGT 

>G1765 Amino Acid Sequence (conserved domain in AA coordinates: 20-140) 

MSGEGNLGKDHEEENEAPLPGFRFHPTDEELLGYYLRRKVENKTIKLELIKQIDIYKYDP 

WDLPRVSSVGEKEWYFFCMRGRKYRNSVRPNRVTGSGFWKATGIDKPVYSNLDCVGLKKS 

LVYYLGSAGKGTKTDWMMHEFRLPSTTKTDSPAQQAEVWTLCRIFKRVTSQRNPTILPPN

RKPVITLTDTCSKTSSLDSDHTSHRTVDSMSHEPPLPQPQNPYWNQHIVGFNQPTYTGND

NNLLMSFWNGNGGDFIGDSASWDELRSVIDGNTKP*

>G1777 (97..1878)

CTCGTACTTTATCACCTCCGTCGTTCTATAATACTCTCTTCCGTCAATCATATCATTTGT 
CGACAATTTCATTCTGATCAGTTTAAAAATTGATCCATGGATGATAATTTAAGCGGCGAG 
GAAGAAGATTACTATTACTCCTCCGATCAGGAATCTCTCAACGGGATTGATAATGATGAA 
TCCGTTTCGATACCTGTTTCTTCCCGATCAAATACTGTCAAGGTTATTACGAAGGAATCA 
CTTTTGGCTGCACAGAGGGAGGATTTGCGGAGAGTGATGGAATTGTTATCGGTTAAGGAG 
CACCATGCTCGGACTCTTCTTATACATTACCGATGGGATGTGGAGAAGTTGTTTGCTGTT 
CTTGTTGAGAAAGGSAAAGATAGCTTGTTTTCTGGTGCTGGTGTTACACTTCTTGAAAAC 
CAAAGTTGTGATTCTTCCGTTTCTGGTTCTTCTTCGATGATGAGTTGTGATATCTGCGTA 
GAGGATGTACCGGGTTATCAGCTGACAAGGATGGACTGTGGCCATAGCTTTTGCAATAAC 
TGTTGGACTGGGCATTTTACTGTAAAGATAAATGAAGGTCAGAGCAAAAGGATTATATGC
ATGGCTCATAAGTGTAATGCTATTTGTGATGAAGATGTTGTCAGGGCTCTAGTTAGTAAA 
AGCCAACCAGATTTAGCTGAGAAGTTTGATCGTTTTCTTCTTGAGTCGTATATCGAAGAT 
AACAAAATGGTGAAGTGGTGTCCGAGTACTCCTCATTGTGGGAATGCCATACGTGTTGAG 



TCTCAAGCTCACTCCCCTTGCTCTTGTGTGATGTGGGAACTATGGAGAAAGAAGTGCTTT 
GATGAGTCCGAGACTGTTAATTGGATAACTGTTCACACAAAGCCGTGTCCCAAATGTCAC 
AAGCCTGTTGAAAAGAATGGTGGATGCAATCTCGTGACTTGTCTTTGTCGACAATCTTTT

TGTTGGTTGTGTGGTGAAGCTACTGGAAGGGACCACACTTGGGCTAGAATCTCGGGTCAT 

AGTTGTGGTCGGTTCCAAGAAGATAAAGAGAAACAAATGGAGAGAGCGAAAAGGGATCTC 

AAGCGGTATATGCATTATCATAACCGATACAAAGCACATATCGACTCCTCCAAGCTAGAG 

GCTAAGCTTAGTAATAATATTAGTAAAAAGGTGTCTATTTCAGAAAAGAGGGAGTTACAA 

CTTAAAGACTTCAGCTGGGCTACCAATGGACTCCATCGGTTATTTAGATCAAGACGAGTT 

CTTTCATATTCATACCCTTTCGCATTTTACATGTTTGGAGATGAGCTGTTTAAAGATGAG 

ATGAGCTCTGAGGAAAGAGAAATAAAACAAAATCTGTTTGAGGATCAGCAGCAGCAGCTT 

GAGGCTAATGTTGAGAAACTTTCTAAGTTCTTGGAGGAACCTTTTGATCAATTTGCTGAT 

GATAAGGTCATGCAGATAAGGATTCAAGTCATCAATTTGTCAGTTGCGGTCGATACACTC 

TGCGAAAATATGTATGAATGCATTGAGAATGACTTGTTGGGTTCTCTGCAACTTGGCATC 

CACAACATTACTCCATACAGATCAAACGGCATAGAACGAGCATCTGATTTTTATAGTTCC 

CAGAATTCCAAGGAAGCTGTTGGTCAGTCTTCGGATTGTGGATGGACGTCCAGGCTCGAT 

CAAGCTTTGGAGTCAGGGAAGTCGGAAGACACAAGTTGCTCTTCCGGGAAGCGTGCTAGA 

ATAGACGAAAGTTACAGAAACAGCCAAACCACCTTACTAGATTTAAACTTGCCAGCGGAA 

GCCATTGAGCGGAAATGAACACTTATCCTTCTTCACCTCCCAATAACACCCTTTTTGTCC 

AAATAAAGTGTGTTACCCGGATATTTATAGCTCTAAACCCAATCCCCTCTGCTTAATTTG 
TCAGTGACCTTACCTAACCCTCTTCA 

>G1777 Amino Acid Sequence (domain in AA coordinates: 124-247)

MDDNLSGEEEDYYYSSDQESLNGIDNDESVSIPVSSRSNTVKVITKESLLAAQREDLRRV 

MELLSVKEHHARTLLIHYRWDVEKLFAVLVEKGKDSLFSGAGVTLLENQSCDSSVSGSSS 

MMSCDICVEDVPGYQLTRMDCGHSFCNNCWTGHFTVKINEGQSKRIICMAHKCNAICDED 

VVRALVSKSQPDLAEKFDRFLLESYIEDNKMVKWCPSTPHCGNAIRVEDDELCEVECSCG

LQFCFSCSSQAHSPCSCVMWELWRKKCFDESETVNWITVHTKPCPKCHKPVEKNGGCNLV 

TCLCRQSFCWLCGEATGRDHTWARISGHSCGRFQEDKEKQMERAKRDLKRYMHYHNRYKA 

HIDSSKLEAKLSNNISKKVSISEKRELQLKDFSWATNGLHRLFRSRRVLSYSYPFAFYMF 

GDELFKDEMSSEEREIKQNLFEDQQQQLEANVEKLSKFLEEPFDQFADDKVMQIRIQVIN 

LSVAVDTLCENMYECIENDLLGSLQLGIHNITPYRSNGIERASDFYSSQNSKEAVGQSSD

CGWTSRLDQALESGKSEDTSCSSGKRARIDESYRNSQTTLLDLNLPAEAIERK*
>G1793 (59..1783)

AGTGATTTATTGATTAACCC^VAACACAAAATAAACAGATTTGACTCAAAAAGAAGAAAAT 

GAATTCTAACAACTGGCTTGGCTTTCCTCTTTCACCGAACAACTCTTCTTTGCCTCCTCA 

TGAATACAACCTTGGCTTGGTCAGCGACCATATGGACAACCCTTTTCAAACACAAGAGTG 

GAATATGATCAATCCACACGGTGGAGGAGGAGATGAAGGAGGAGAGGTTCCAAAAGTGGC 

CGATTTTCTCGGTGTGAGCAAACCGGACGAAAACCAATCCAACCACCTAGTAGCTTACAA 

CGACTCAGACTACTACTTCCATACCAATAGCTTGATGCCTAGCGTCCAATCAAACGATGT 

CGTTGTAGCAGCTTGTGACTCCAATACTCCTAACAACAGTAGCTATCATGAGCTTCAAGA 

GAGTGCTCACAATCTACAGTCACTTACTTTGTCCATGGGGACCACCGCTGGTAATAATGT 

TGTAGACAAAGCTTCACCATCCGAGACCACCGGGGATAACGCTAGCGGTGGAGCACTAGC 

CGTTGTTGAGACGGCCACGCCAAGACGTGCATTGGACACTTTCGGACAACGAACCTCGAT 

CTATCGTGGTGTCACAAGACATCGATGGACTGGTCGATATGAGGCTCATCTATGGGATAA 

TAGTTGTAGAAGGGAAGGCCAGTCTAGGAAAGGAAGACAAGTTTACTTGGGTGGATATGA 

CAAAGAAGATAAAGCAGCAAGATCATATGATCTAGCTGCACTTAAGTACTGGGGTCCTTC 

AACTACTACTAATTTCCCCATTACAAACTACGAGAAAGAAGTAGAGGAAATGAAGCACAT 

GACGAGACAAGAGTTCGTGGCTGCCATTAGAAGGAAAAGTAGTGGATTTTCGAGAGGCGC 

TTCGATGTATCGAGGAGTTACAAGGCATCACCAACATGGAAGATGGCAAGCAAGGATCGG 

CCGAGTCGCCGGAAACAAAGACCTCTACTTGGGAACTTTTAGCACTGAGGAAGAAGCAGC 

AGAAGCTTACGATATAGCTGCAATAAAGTTTAGAGGACTTAATGCAGTGACCAACTTCGA 

GATCAACCGGTACGACGTGAAAGCCATTCTAGAGAGTAGCACTCTTCCCATCGGAGGAGG 

CGCAGCTAAACGGCTCAAAGAAGCTCAAGCTCTTGAGTCTTCAAGGAAACGCGAGGCGGA 

GATGATAGCCCTTGGTTCAAGTTTCCAGTACGGTGGTGGCTCGAGCACAGGCTCTGGCTC 

CACCTCATCAAGACTTCAGCTTCAACCTTACCCTCTAAGCATTCAACAACCATTAGAGCC 

TTTTCTATCTCTTCAGAACAATGACATCTCTCATTACAACAACAACAATGCTCACGATTC 

CTCCTCTTTTAATCACCATAGCTATATCCAGACACAACTTCATCTCCACCAACAGACCAA 

CAATTACTTGCAGCAACAGTCGAGCCAGAACTCTCAGCAGCTCTACAATGCGTATCTTCA 

TAGCAATCCGGCTCTGCTTCATGGACTTGTCTCTACCTCTATCGTTGACAACAATAATAA 

CAATGGAGGCTCTAGTGGGAGCTACAACACTGCAGCATTTCTTGGGAACCACGGTATTGG 
TATTGGGTCCAGCTCGACTGTTGGATCGACCGAGGAGTTTCCAACCGTTAAAACAGATTA
CGATATGCCTTCCAGTGATGGAACCGGAGGGTATAGTGGTTGGACCAGTGAGTCTGTTCA 
GGGGTCAAACCCTGGTGGTGTTTTCACTATGTGGAATGAGTAAACAAGGATCTCTTTCTT 

GCGGCACAAGGAATGGGT 

>G1793 Amino Acid Sequence (conserved domain in AA coordinates: 179-255, 281-349)

MNSNNWLGFPLSPNNSSLPPHEYNLGLVSDHMDNPFQTQEWNMINPHGGGGDEGGEVPKV 

ADFLGVSKPDENQSNHLVAYNDSDYYFHTNSLMPSVQSNDVVVAACDSNTPNNSSYHELQ

ESAHNLQSLTLSMGTTAGNNVVDKASPSETTGDNASGGALAVVETATPRRALDTFGQRTS 

IYRGVTRHRWTGRYEAHLWDNSCRREGQSRKGRQVYLGGYDKEDKAARSYDLAALKYWGP 

STTTNFPITNYEKEVEEMKHMTRQEFVAAIRRKSSGFSRGASMYRGVTRHHQHGRWQARI 

GRVAGNKDLYLGTFSTEEEAAEAYDIAAIKFRGLNAVTNFEINRYDVKAILESSTLPIGG 

GAAKRLKEAQALESSRKREAEMIALGSSFQYGGGSSTGSGSTSSRLQLQPYPLSIQQPLE 

PFLSLQNNDISHYNNNNAHDSSSFNHHSYIQTQLHLHQQTNNYLQQQSSQNSQQLYNAYL

HSNPALLHGLVSTSIVDNNNNNGGSSGSYNTAAFLGNHGIGIGSSSTVGSTEEFPTVKTD

YDMPSSDGTGGYSGWTSESVQGSNPGGVFTMWNE* 

>G180 (54..629)

GTAATTACGATCTACAACAAGTGACATCGTCGTCGACGACGATTCAAGAGAATATGAACT 
TCCTCGTTCCTTTTGAAGAAACCAATGTCTTAACCTTTTTCTCTTCTTCTTCTTCCTCTT 
CTCTTTCTTCTCCTTCTTTCCCCATTCACAACTCTTCCTCCACTACTACTACTCATGCAC 
CTCTAGGGTTTTCTAATAATCTTCAGGGTGGAGGACCCTTGGGATCAAAGGTGGTTAATG 
ATGATCAGGAGAATTTTGGAGGTGGAACTAACAATGATGCTCATTCTAATTCTTGGTGGA 
GATCAAATAGTGGAAGTGGAGATATGAAGAACAAAGTGAAGATAAGGAGGAAACTAAGAG 
AGCCAAGATTCTGTTTCCAAACCAAAAGCGATGTTGATGTTCTTGACGATGGCTACAAAT 
GGCGTAAATATGGTCAGAAAGTCGTCAAGAACAGCCTTCACCCCAGGAGTTATTACAGAT 
GCACACACAACAACTGTAGGGTGAAAAAGAGAGTGGAGCGACTATCGGAAGATTGTAGAA 
TGGTGATTACTACTTACGAAGGTCGTCACAACCACATTCCCTCTGATGACTCCACTTCTC 
CTGACCATGATTGTCTCTCTTCCTTTTAACATCTCTTTCTATATATCTATATATAGACAG 
TTATATGTGCACATATAGATGTGTGATATATTGCATATTTGATATTGCATGTGTTTTTCA 
AGAGTATGTCATCAGATGTTATGCATATATTCTTGACTTGTTGCTTATAGTATACATATG 
TAATAATATATATTGACATTGGTAGTTCATTTCTGTTCAAACAAAAAAAAAAAAAAA 

>G180 Amino Acid Sequence (domain in AA coordinates: 118-174) 
MNFLVPFEETNVLTFFSSSSSSSLSSPSFPIHNSSSTTTTHAPLGFSNNLQGGGPLGSKV

VNDDQENFGGGTNNDAHSNSWWRSNSGSGDMKNKVKIRRKLREPRFCFQTKSDVDVLDDG
YKWRKYGQKVVKNSLHPRSYYRCTHNNCRVKKRVERLSEDCRMVITTYEGRHNHIPSDDS
TSPDHDCLSSF*
>G192 (63..959)

CTTTTTTCTCTTCTCTCCTCAGAGATTCGAAGCTTTTTGTCTCCCCTGAGTAACCAAATT 

CAATGGCCGACGATTGGGATCTCCACGCCGTAGTCAGAGGCTGCTCAGCCGTAAGCTCAT 

CAGCTACTACCACCGTATATTCCCCCGGCGTTTCATCTCACACAAACCCTATATTCACCG 

TCGGACGACAAAGTAATGCCGTCTCCTTCGGAGAGATTCGAGATCTCTACACACCGTTCA 

CACAAGAATCTGTCGTCTCTTCGTTTTCTTGTATAAACTACCCAGAAGAACCTAGAAAGC 

CACAGAACCAGAAACGTCCTCTTTCTCTCTCTGCTTCTTCCGGTAGCGTCACTAGCAAAC 

CCAGTGGCTCCAATACCTCTAGATCTAAAAGAAGAAAGATACAGCATAAGAAAGTGTGCC 

ATGTAGCAGCAGAAGCTTTAAACTCCGATGTCTGGGCATGGCGAAAGTACGGACAGAAAC 

CCATCAAAGGTTCACCATATCCAAGAGGATACTACAGATGTAGTACATCAAAAGGTTGTT 

TAGCCCGTAAACAAGTGGAGCGAAATAGATCCGACCCGAAGATGTTTATCGTCACTTACA 

CGGCGGAGCATAATCATCCAGCTCCGACACACCGTAATTCTCTCGCCGGAAGCACACGTC 

AGAAACCATCCGATCAACAGACGAGTAAATCTCCGACGACCACTATTGCTACTTATTCAT 

CGTCTCCGGTGACTTCAGCCGACGAATTTGTTTTGCCTGTTGAGGATCATCTAGCGGTGG 

GAGATCTTGACGGAGAAGAAGATCTGTTATCTTTGTCGGATACGGTGGTTAGCGATGATT 

TCTTCGATGGGTTAGAGGAATTCGCAGCCGGAGATAGCTTTTCCGGGAACTCGGCTCCGG 

CGAGTTTTGATCTCTCTTGGGTTGTGAACAGTGCCGCCACTACCACCGGAGGAATATGAT 

TAGATTACGACGGCTTAGAATACTCTTATTAGGACAGATTTATAGGATTAAGGAATTATT 

CTCGGAGCATATGTAAAAATAGGATAAAAGAAAATGTTCTTTGTTACTTTTTTTCGGGTT 

TTCTTCCTATTGTTTCTAAACATCTTAGAAAAAATTTAATTGTATATTCCTTAAGCTCGA 

TACATCTTGTTTTAAAAAAAAAAAAAAAAAA 

>G192 Amino Acid Sequence (domain in AA coordinates: 128-185) 
MADDWDLHAVVRGCSAVSSSATTTVYSPGVSSHTNPIFTVGRQSNAVSFGEIRDLYTPFT

QESVVSSFSCINYPEEPRKPQNQKRPLSLSASSGSVTSKPSGSNTSRSKRRKIQHKKVCH

VAAEALNSDVWAWRKYGQKPIKGSPYPRGYYRCSTSKGCLARKQVERNRSDPKMFIVTYT 

AEHNHPAPTHRNSLAGSTRQKPSDQQTSKSPTTTIATYSSSPVTSADEFVLPVEDHLAVG 

DLDGEEDLLSLSDTVVSDDFFDGLEEFAAGDSFSGNSAPASFDLSWVVNSAATTTGGI*

>G1948 (18..1118)

AAAAGGTCTTCTTGGCCATGGATACTTGTGCTCTAGTAATCCATCAGTCTCTGTCTCGCA 

TCAAACTTTCTCCTCCCAAATCTTCTTCTTCTTCTTCTTCTGCTTTCTCCCCTGAATCCT 

TACCGATCAGACGGATCGAGCTGTGTTTCCGAGGAGCTATATGTGCCGCCGTACAAAGAA 

ACTACGAAGAAACGACCTCCTCCGTGGAAGAGGCAGAGGAAGATGATGAGTCATCATCAT 

CGTACGGAGAAGTGAACAAGATCATTGGAAGCCGAACGGCGGGGGAAGGAGCCATGGAGT 

ACCTTATCGAGTGGAAGGACGGCCATTCTCCGTCGTGGGTTCCATCGAGCTACATCGCAG 

CAGACGTAGTGTCGGAGTACGAGACACCCTGGTGGACGGCAGCTAGAAAAGCCGACGAGC 

AGGCCCTGTCACAGCTCCTGGAGGACCGAGACGTCGATGCCGTGGACGAAAACGGCCGGA

CGGCTCTGCTTTTCGTGGCAGGTCTGGGGTCGGACAAGTGCGTAAGGCTTCTGGCGGAGG 

CTGGAGCCGATCTCGACCACCGAGACATGAGGGGAGGCTTGACGGCGCTGCACATGGCGG 

CTGGTTACGTGAGGCCGGAGGTGGTGGAGGCGCTGGTGGAGCTGGGAGCTGATATTGAAG 

TGGAAGACGAGAGAGGGTTAACGGCGTTGGAACTAGCGAGGGAGATTCTGAAGACGACGC 

CGAAGGGGAATCCGATGCAGTTCGGGAGGAGAATTGGGTTAGAGAAAGTGATCAATGTCC 

TGGAAGGACAAGTGTTCGAGTACGCCGAGGTGGATGAGATCGTAGAGAAACGAGGGAAAG 

GCAAAGACGTTGAATATCTGGTCAGATGGAAGGACGGTGGAGATTGCGAGTGGGTGAAAG 

GTGTACACGTGGCGGAAGATGTGGCTAAGGACTACGAGGATGGGCTGGAGTACGCTGTAG 

CGGAGAGTGTGATCGGGAAGAGGGTGGGAGACGATGGGAAGACCATCGAGTATCTTGTCA 

AATGGACTGATATGTCTGATGCCACTTGGGAGCCTCAGGACAATGTCGACTCTACTCTTG 

TTCTACTCTACCAACAACAACAACCAATGAATGAATGATTGATTTTGATGATTACATTCT 

TCTCAATTTGCTTCTTTCTCATATGTGTTGGTTCATCTGACCGGTTCGGTTGGTAGGTAC 

CGGTACATTTTCATTTTCTTTTAAGATGTGATCTTGATGGTTTTTGGCCTTTTGGGGACA 

CTATTTGATTTTATATCCATGCTTTGAATTTTGCTTCCCTTTTTGGGGAGATTCATGAAA 

>G1948 Amino Acid Sequence (domain in AA coordinates: entire protein) 

MDTCALVIHQSLSRIKLSPPKSSSSSSSAFSPESLPIRRIELCFRGAICAAVQRNYEETT 

SSVEEAEEDDESSSSYGEVNKIIGSRTAGEGAMEYLIEWKDGHSPSWVPSSYIAADVVSE

YETPWWTAARKADEQALSQLLEDRDVDAVDENGRTALLFVAGLGSDKCVRLLAEAGADLD 

HRDMRGGLTALHMAAGYVRPEVVEALVELGADIEVEDERGLTALELAREILKTTPKGNPM

QFGRRIGLEKVINVLEGQVFEYAEVDEIVEKRGKGKDVEYLVRWKDGGDCEWVKGVHVAE 

DVAKDYEDGLEYAVAESVIGKRVGDDGKTIEYLVKWTDMSDATWEPQDNVDSTLVLLYQQ 
QQPMNE* 

>G2123 (1..657) 

ATGAGAAAAGTATGTGAGCTTGATATAGAGCTAAGTGAAGAGGAAAGAGACCTACTAACA 

ACTGGATACAAGAATGTCATGGAGGCTAAGAGAGTTTCATTGAGAGTAATATCATCCATT 

GAAAAAATGGAAGACTCGAAAGGAAACGACCAAAATGTGAAACTGATAAAAGGACAACAA 

GAAATGGTTAAATATGAGTTTTTCAATGTTTGTAATGACATTTTGTCTCTCATTGATTCT

CATCTCATACCATCAACTACTACTAATGTCGAATC^TTGTCCTTTTTAACAGAGTG^ 

GGAGATTATTTTCGATATATGGCAGAGTTTGGTTCTGATGCTGAACGTAAAGAAAATGCA 

GATAATTCTCTAGATGCATATAAGGTTGCAATGGAAATGGCAGAGAATAGTTTAGCACCC 

ACC^TATGGTTAGACTTGGATTGGCTTTAAATTTCTCGATATTCAATTATGAGATCCA^ 

AAATCTATTGAAAGCGCATGTAAATTGGTTAAGAAAGCTTACGATGAAGCAATCACTGAA 

CTCGATGGCCTTGACAAGAATATATGCGAAGAGAGCATGTATATCATAGAGATGCTTAAA 

TACAATCTTTCTACGTGGACTTCAGGCGATGGTAATGGTAATAAGACAGACGGTTAG 

>G2123 Amino Acid Sequence (domain in AA coordinates: 99-109)

MRKVCELDIELSEEERDLLTTGYKNVMEAKRVSLRVISSIEKMEDSKGNDQNVKLIKGQQ

EMVKYEFFNVCNDILSLIDSHLIPSTTTNVESIVLFNRV^

DNSLDAYKVAMEMAENSLAPTNMVRLGLALNFSIFNYEIHKSIESACKLVKKAYDEAITE

LDGLDKNICEESMYIIEMLKYNLSTWTSGDGNGNKTDG* 

>G2138 (27..512)

GGAACCCTAATTTCCGCT^AATTCACTATGAAGTOTATTATCAGAATCTCATTCACCGACG 
CAGAAGCCACCGATTCTTCTAGCGACGAAGACACGGAGGAGCGTGGAGGAGCATCCCAGA 
CTCGGCGCCGTGGGAAACGCCTCGTTAAAGAGATCGTAATCGATCCTTCCGATTCCGCCG 
ATAAACTCGATGTCTGCAAAACACGGTTCAAAATCAGGATCCCGGCGGAATTTCTCAAGA

CGGCGAAAACGGAGAAGAAATATCGTGGAGTGAGGCAGAGGCCGTGGGGGAAGTGGGTGG 

CGGAGATCAGATGTGGAAGAGGAGCTTGTAAAGGACGACGTGATCGTCTCTGGCTGGGTA 

CTTTTAACACTGCTGAGGAAGCTGCTCTAGCTTATGATAACGCTTCAATTAAGCTGATTG 

GACCTCACGCGCCGACCAATTTTGGTTTGCCGGCGGAGAATCAAGAGGATAAGACGGTGA 

TTGGAGCTTCTGAGGTTGCTAGAGGCGCGTGAAGTGGGGTTGGTAATTTAGTTGTTAGC 

>G2138 Amino Acid Sequence (domain in AA coordinates: TBD) 

MKRIIRISFTDAEATDSSSDEDTEERGGASQTRRRGKRLVKEIVIDPSDSADKLDVCKTR

FKIRIPAEFLKTAKTEKKYRGVRQRPWGKWVAEIRCGRGACKGRRDRLWLGTFNTAEEAA 

LAYDNASIKLIGPHAPTNFGLPAENQEDKTVIGASEVARGA*
>G2139 (40..663)

CCTACAAGAAATCAAACACTAGTTCTGGTTTCTGCAAACATGTCATCTACGAAGCAAGCA 
AAGGGAAGAAAAACAAAGGGGAAGCAAAAGATCGAGATGAAGAAGGTGGAGAACTATGGA 
GATAGGATGATTACGTTCTCAAAACGTAAAACCGGAATTTTTAAGAAAATGAACGAGCTC 
GTAGCAATGTGTGACGTTGAAGTGGCTTTCTTGATTTTCTCTCAACCCAAGAAGCCCTAT 
ACATTCGCACATCCGTCTATGAAGAAAGTGGCTGACCGGTTAAAGAACCCTTCGAGACAA 
GAACCATTAGAGAGAGACGATACCAGACCCCTCGTCGAAGCTTATAAGAAACGAAGGCTC 
CACGACCTCGTAAAAAAAATGGAGGCGCTCGAAGAGGAGCTTGCGATGGATCTAGAGAAG 
TTGAAACTGTTGAAGGAATCGAGAAATGAAAAGAAGTTAGATAAAATGTGGTGGAACTTT 
CCTTCGGAAGGTTTGAGCGCGAAGGAGCTGCAGCAAAGGTACCAAGCGATGCTCGAGTTA 
CGTGATAACTTATGCGACAATATGGCTCACTTACGATTGGGAAAAGACTGTGGTGGTTCA 
TCTTCTGTTCGTGTGGGACGTCGAGTTTCTGGTGGTGTTCGTCTGTTCGATCGTGAAGCA 
TGATCATACATATTCATACTTGATGATTTAAATTTCTTTGTATTTGAACTGCTGATTTTA 
ATACTGCATGTATCCATTTGACGAAGCTCAATCGTCTCGAGTATATCTCTATTATCTAAC 
AGTATTGAGAAAAAAGGAGTTTCAGTAAAAAAAAAAAAAAAAAAAAAA 

>G2139 Amino Acid Sequence (conserved domain in AA coordinates: 14-69)

MSSTKQAKGRKTKGKQKIEMKKVENYGDRMITFSKRKTGIFKKMNELVAMCDVEVAFLIF

SQPKKPYTFAHPSMKKVADRLKNPSRQEPLERDDTRPLVEAYKKRRLHDLVKKMEALEEE

LAMDLEKLKLLKESRNEKKLDKMWWNFPSEGLSAKELQQRYQAMLELRDNLCDNMAHLRL

GKDCGGSSSVRVGRRVSGGVRLFDREA* 

>G2343 (1..1113) 

ATGGGTCATCACTCATGCTGCAACCAGCAAAAGGTGAAGAGAGGGCTTTGGTCACCGGAA 

GAAGATGAGAAGCTTATTAGATATATCACAACTCATGGCTATGGATGTTGGAGTGAAGTC 

CCTGAAAAAGCAGGGCTTCAAAGATGTGGAAAAAGTTGTAGATTGCGATGGATAAACTAT 

CTTCGACCTGATATCAGGAGAGGAAGGTTCTCTCCAGAAGAAGAGAAATTGATCATAAGC 

CTTCATGGAGTTGTGGGAAACAGGTGGGCTCATATAGCTAGTCATTTACCGGGAAGAACA 

GATAACGAGATTAAAAACTATTGGAATTCATGGATTAAGAAAAAGATACGAAAACCGCAC 

CATCATTACAGTCGTCATCAACCGTCAGTAACTACTGTGACATTGAATGCGGACACTACA 

TCGATTGCCACTACCATCGAGGCCTCTACCACCACAACATCGACTATCGATAACTTACAT 

TTTGACGGTTTCACTGATTCTCCTAACCT^TTAAATTTCACCAATGATCAAGAAACTAAT 

ATAAAGATTCAAGAAACTTTTTTCTCCCATAAACCTCCTCTCTTCATGGTAGACACAACA 

CTTCCTATCCTAGAAGGAATGTTCTCTGAAAACATCATCACAAACAATAACAAGAACAAT 

GATCATGATGACACGCAAAGAGGAGGAAGAGAAAATGTTTGTGAACAAGCATTTCTAACA 

ACTAACACGGAAGAATGGGATATGAATCTTCGTCAGCAAGAGCCGTTTCAAGTTCCTACA 

CTGGCGTCACATGTGTTCAACAACTCTTCCAATTCAAATATTGACACGGTTATAAGTTAT 

AATCTACCGGCGCTAATAGAGGGAAATGTCGATAACATCGTCCATAATGAAAACAGCAAT 

GTCCAAGATGGAGAAATGGCGTCCACATTCGAATGTTTAAAGAGGCAAGAACTAAGCTAT 

GATCAATGGGACGATTCACAAGAATGCTCTAACTTTTTCTTTTGGGACAACCTTAATATA 

AACGTGGAAGGTTCATCTCTTGTTGGAAACCAAGACCCATCAATGAATTTGGGATCATCT 

GCCTTATCTTCTTCTTTCCCTTCTTCGTTTTAA 

>G2343 Amino Acid Sequence (domain in AA coordinates: 14-116) 

MGHHSCCNQQKVKRGLWSPEEDEKLIRYITTHGYGCWSEVPEKAGLQRCGKSCRLRWINY 

LRPDIRRGRFSPEEEKLIISLHGVVGNRWAHIASHLPGRTDNEIKNYWNSWIKKKIRKPH

HHYSRHQPSVTTVTLNADTTSIATTIEASTTTTSTIDNLHFDGFTDSPNLLNFTNDQETN

IKIQETFFSHKPPLFMVDTTLPILEGMFSENIITNNNKNNDHDDTQRGGRENVCEQAFLT

TNTEEWDMNLRQQEPFQVPTLASHVFNNSSNSNIDTVISYNLPALIEGNVDNIVHNENSN

VQDGEMASTFECLKRQELSYDQWDDSQQCSNFFFWDNLNINVEGSSLVGNQDPSMNLGSS
ALSSSFPSSF*
>G265 (280..1317)

CTTTGGTCTTGGAAGCCAAATCAAACCTTTCCTTCAATCCTCAAATTTTCGAAAATTTTC 
TCTTTTGCTTTACGTTCTCTCAATTCTTATTTGTAAGAAAGTTTGTTCCTTTAATCAATC 
AAATCAAAGAGACTTTTGAAGATTGTTTCCCAATTTGCGTCAATCGGGATCGAGTCAAAT 
CTGAAATCTTCTCCACTCATCATCTGACTATAAGACTTAATCAAGGGACTTTTTGTTCGG 
GTTTGGTTTTAAACGTCTTGGATTCGAAGTGGTTAAGGTATGGATGAAAATAATGGAGGT 
TCAAGCTCACTTCCACCTTTCCTTACTAAAACATATGAAATGGTTGATGATTCTTCTTCT 
GACTCGGTCGTTGCTTGGAGCGAAAACAACAAAAGCTTCATCGTCAAGAATCCAGCAGAG 
TTTTCAAGAGACCTTCTTCCGAGATTCTTCAAGCATAAGAATTTCTCAAGTTTCATCCGT 
CAGCTTAATACATATGGTTTTCGAAAAGTAGATCCTGAGAAATGGGAATTCTTGAATGAT 
GATTTTGTTAGAGGTCGACCTTACCTTATGAAGAACATTCATAGACGAAAACCGGTTCAT 
AGCCACTCGTTAGTGAATCTACAAGCGCAAAATCCTTTGACGGAATCAGAAAGACGGAGC 
ATGGAGGATCAGATAGAAAGACTGAAAAATGAGAAAGAAGGCCTTCTTGCGGAGTTACAG 
AACCAAGAGCAAGAACGGAAAGAGTTTGAGCTGCAAGTAACGACATTGAAAGATCGGTTA
CAACATATGGAACAACATCAGAAATCAATAGTGGCATATGTTTCACAGGTTTTGGGAAAA 
CCAGGACTTTCACTAAACCTCGAAAACCATGAGAGAAGAAAAAGAAGATTTCAAGAGAAC 
TCTCTTCCTCCAAGCAGTTCACACATAGAACAGGTCGAAAAGTTAGAATCTTCGCTAACG 
TTTTGGGAGAATCTTGTATCGGAATCATGCGAGAAGAGCGGTTTGCAGTCATCAAGCATG 
GATCATGATGCAGCTGAGTCAAGTCTAAGTATTGGCGATACACGACCCAAATCATCGAAG 
ATTGATATGAACTCAGAGCCGCCCGTTACCGTTACTGCGCCTGCTCCAAAAACAGGCGTT 
AACGATGACTTTTGGGAACAATGTTTGACAGAGAACCCTGGATCAACCGAGCAACAAGAA 
GTTCAGTCAGAGAGAAGAGATGTCGGTAATGATAATAATGGTAATAAGATTGGAAATCAA 
AGGACGTATTGGTGGAATTCAGGGAATGTAAATAACATTACAGAGAAAGCTTCTTGACAT 
GAATGAGGTTTTTGTAAAATAGTTTTCTTTTGGTTCCACTGAGATTATTGTATGTGTTCA 
TTATTTATTACTCTGTTTCTGTAAAAACAAATCTCTCTATTGTTTGAGGCAGGAGTGACA 
TAAATGCATATGCAGAATTGGTTTCAAAAA 

>G265 Amino Acid Sequence (domain in AA coordinates: 11-105) 
MDENNGGSSSLPPFLTKTYEMVDDSSSDSVVAWSENNKSFIVKNPAEFSRDLLPRFFKHK
NFSSFIRQLNTYGFRKVDPEKWEFLNDDFVRGRPYLMKNIHRRKPVHSHSLVNLQAQNPL
TESERRSMEDQIERLKNEKEGLLAELQNQEQERKEFELQVTTLKDRLQHMEQHQKSIVAY
VSQVLGKPGLSLNLENHERRKRRFQENSLPPSSSHIEQVEKLESSLTFWENLVSESCEKS
GLQSSSMDHDAAESSLSIGDTRPKSSKIDMNSEPPVTVTAPAPKTGVNDDFWEQCLTENP
GSTEQQEVQSERRDVGNDNNGNKIGNQRTYWWNSGNVNNITEKAS*
>G2792 (1..960) 

ATGGATCATCATCATCACATAGCATCAAGAAATTCATCAACA^

TTCGAGCCAGCGTGCCATAACGGTAATGGTAACGGTTGGATCTATGACCCAAATCAAGTT
AGGTACGATCAAAGTAGTGACCAACGGCTGTCAAAGTTGACGGATCTTGTAGGCAAGCAC 
TGGTCAATTGCACCACCGAATAATCCCGACATGAACCATAACCTTCATCATCACTTCGAT 
CATGATCATTCTCAAAACGACGAC^TTTCTAT 

GAGGAAGATCTTTGTTACAATAATGGCTCAAGTGGTGGTGGTTCCTTGTTCCATGATCCT 
ATAGAAAGTTCTAGAAGTTTCCTTGATATAAGGTTAAGTAGGCCATTAACGGATATTAAT 
CCGTCATTTAAGCCATGCTTTAAGGCCTTAAACGTATCCGAGTTTAACAAGAAAGAACAT 
CAAACGGCATCTCTGGCAGCAGTGAGACTGGGAACAACAAACGCTGGAAAAAAGAAGAGA 
TGTGAAGAAATTTCCGATGAGGTTTCAAAGAAGGCCAAGTGCAGTGAGGGCTCTACACTT 
TCGCCAGAGAAGGAACTACCCAAAGCCAAACTTCGAGACAAGATCACGACTCTACAGCAA 
ATTGTGTCTCCCTTTGGAAAGACTGATACTGCTTCTGTGCTTCAAGAGGCCATCACTTAC 
ATAAATTTTTATCAAGAGCAAGTTAAGCTGCTAAGCACTCCTTATATGAAGAATTCATCA 
ATGAAGGATCCATGGGGGGGATGGGACAGAGAAGATCACAACAAAAGGGGACCGAAGCAT 
CTTGATCTAAGGAGTAGAGGGCTTTGTTTGGTTCCTATTTCATATACCCCAATCGCATAC 
CGCGATAACAGTGCAACTGACTACTGGAATCCCACGTATAGAGGTTCTTTGTATCGTTAG 
>G2792 Amino Acid Sequence (domain in AA coordinates: 190-258)

MDHHHHIASRNSSTTSELPSFEPACHNGNGNGWIYDPNQVRYDQSSDQRLSKLTDLVGKH 
WSIAPPNNPDMNHNLHHHFDHDHSQOT^ 

IESSRSFLDIRLSRPLTDINPSFKPCFKALNVSEFNKKEHQTASLAAVRLGTTNAGKKKR

CEEISDEVSKKAKCSEGSTLSPEKELPKAKLRDKITTLQQIVSPFGKTDTASVLQEAITY 

INFYQEQVKLLSTPYMKNSSMKDPWGGWDREDHNKRGPKHLDLRSRGLCLVPISYTPIAY 
RDNSATDYWNPTYRGSLYR*
ATGTCTTCCATC^ 

TCGCTTCCCATCGAAATGAATCACAACTCTCGAATGGTTCGATCCATGTTCATTACATCT 

GATCGCATGAATCATAGAGATTTGTTTTCTTCTCCTCCTTCTTTCTCTTCTTATCAAAAT 

TCACATATCTCTTCATCTTCTGTTGGGTTTAATAATTCACATATGACTTATCATATGCTG 

AAAAGAAATTATGATTCTGTTTCCCGTGCTGATTATTTCTCTACTAAAGATCATTCTCAT 

TTTACTCAAGTATCTTTCACTCAAACCATCACAAATAAGTATACTACTATTGTTCCTTCC 

AATATATTTGACACTGTTCACTATGATATTGGTCGTGTCAAACGTGCCATAGATTTTAGA 

AATATTTGGAATCCTAAATCTCATCTTCCAAAAAAATTTAATAGGCAATGCGAGATTTTG 

AATCCTACCCCTCTTAATATCGTCTTTCCGCACCAGGATTCAGCTGATCGTCAACATTTA 

GACATTATTTTCTCGTCATCAAAGCACAACCATGTTTTCCAAGATGGTCGATCCTTGAAG 

AAAATTTCCGAACCAACCAATCTGTTTGAAAAATCTAATTCTTATGATTCTCAAGAAGAT 

GAGAAAATCGATGCTTATCAATATGATGGTCGTACACATAGTCTACCGTATACGAAATAC 

GGTCCATATACATGTCCCAGGTGTAACGGTGTGTTTGATACTTCTCAAAAATTTGCTGCA 

CATATGTTATCTCACTACAATAATGAGACGGACAAAGAAAGAGACCAAAGATTTCGTGCA 

AGAAATAAAAAACGATATCGTAAGTTTATGGACAGTCTTAAAATATCAAAACAGAAGATA 

TGA 

>G2830 Amino Acid Sequence (domain in AA coordinates: 245-266)
MSSIPNRFNIYGGDTTNHRESLPIEMNHNSRMVRSMFITSDRMNHRDLFSSPPSFSSYQN

SHISSSSVGFNNSHMTYHMLKRNYDSVSRADYFSTKDHSHFTQVSFTQTITNKYTTIVPS

NIFDTVHYDIGRVKRAIDFRNIWNPKSHLPKKFNRQCEILNPTPLNIVFPHQDSADRQHL 
DIIFSSSKHNHVFQDGRSLKKISEPTNLFEKSNSYDSQEDEKIDAYQYDGRTHSLPYTKY 
GPYTCPRCNGVFDTSQKFAAHMLSHYNNETDKERDQRFRARNKKRYRKFMDSLKISKQKI

* 

>G286 (94..2454)

TGCAATTTCTCTCGACCAAAACCCTAATTTCAGGTTTGGGGTTTTCCTTCTTTCACTGTC 

AATTTTGATGAAACTTGTGATTCAGTGATTAGAATGAATGCTAATGAGCAAACTCGATCC 

GCCAATGGCATTGGCAATGGCAATGGTGAGTCTATTCCCGGGATTCCAGATGACTTACGG 

TGCAAGAGATCGGATGGTAAACAGTGGAGATGCACTGCAATGTCCATGGCTGATAAGACT 

GTTTGTGAGAAGCACTACATCCAAGCAAAGAAGCGGGCGGCTAATTCTGCTTTCAGGGCG 

AACCAGAAGAAAGCGAAAAGGCGATCATCGTTAGGCGAAACAGATACGTATTCGGAAGGG 

AAGATGGATGATTTCGAGTTACCAGTCACCAGCATTGACCACTATAATAACGGTCTTGCC 

TCTGCTTCCAAGAGTAATGGTAGACTAGAGAAGAGACATAATAAAAGCCTGATGCGGTAC 

TCGCCCGAGACACCGATGATGAGGAGTTTCTCTCCACGTGTTGCAGTGGATTTGAATGAT 

GACTTGGGTAGAGATGTTGTAATGTTTGAAGAGGGCTACAGATCTTATAGGACACCACCA 

TCTGTTGCTGTTATGGATCCGACACGAAACAGATCACACCAAAGCACCAGTCCTATGGAA 

TACTCAGCAGCAAGCACAGATGTGTCTGCAGAGTCTTTGGGGGAAATCTGCCATCAATGC 

CAGAGAAAAGATAGAGAGAGAATCATTTCTTGCCTCAAATGCAATCAAAGAGCCTTCTGC 

CACAATTGTCTATCGGCAAGGTACTCGGAGATATCACTTGAAGAAGTCGAGAAAGTTTGC

CCTGCATGTCGTGGCTTGTGTGATTGCAAATCTTGCCTGCGTTCAGATAATACAATAAAG 

GTTCGGATCCGGGAAATACCCGTTTTGGACAAGTTGCAGTATCTTTATCGTCTATTATCA 

GCTGTCCTACCAGTCATAAAGCAGATCCATCTTGAACAATGTATGGAAGTTGAACTAGAG 

AAGAGGCTTCTTGAAGTTGAGATTGATCTTGTCAGGGCAAGATTGAAAGCAGATGAGCAG 

ATGTGCTGCAACGTGTGTCGGATACCAGTTGTTGACTACTACCGTCACTGTCCGAACTGC

TCATATGACCTTTGCCTGAGATGCTGTCAAGATCTACGGGAAGAGTCTTCAGTGACGATT 

AGTGGGACTAACCAAAACGTACAAGATAGAAAAGGAGCTCCCAAACTAAAACTAAACTTT 

TCATACAAGTTTCGTGAGTGGGAAGCCAACGGTGATGGGAGCATCCCTTGCCCTCCTAAG 

GAGTATGGAGGCTGCGGTTCACATTCTTTGAATCTTGCCCGCATTTTCAAGATGAATTGG 

GTTGCAAAGCTTGTGAAAAATGCTGAGGAGATTGTTAGTGGCTGCAAATTATCTGATCTT 

CTGAACCCTGATATGTGTGATTCAAGATTCTGCAAATTTGCTGAGAGAGAAGAGAGCGGT 

GACAACTACGTGTACAGCCCGTCGCTTGAAACGATTAAAACTGATGGAGTAGCTAAGTTT 

GAGCAACAATGGGCAGAGGGTCGGCTTGTTACTGTGAAAATGGTACTTGATGACTCATCT 

TGCTCTAGATGGGATCCTGAGACTATTTGGAGGGATATAGACGAGCTTTCGGACGAGAAA 

CTGAGAGAACATGATCCATTCTTGAAGGCCATTAATTGCTTGGATGGTTTAGAGGTTGAT 

GTAAGACTTGGGGAGTTTACAAGAGCATATAAAGATGGAAAGAACCAAGAGACAGGTCTT 

CCGCTATTGTGGAAGTTAAAGGACTGGCCGAGCCCAAGTGCTTCCGAGGAGTTCATTTTC 
TACCAAAGACCTGAGTTTATCAGAAGTTTTCCGTTTCTCGAGTACATTCATCCCCGGTTA
GGCCTTCTGAATGTTGCAGCCAAGTTACCTCATTACTCGCTCCAAAACGATTCAGGTCCA 
AAGATTTATGTGTCTTGTGGGACGTACCAAGAAATCAGTGCTGGCGATTCATTGACTGGT 
ATTCACTACAACATGCGTGACATGGTATACCTATTGGTGCACACGTCTGAAGAAACAACA 
TTCGAAAGGGTGAGAAAAACAAAACCTGTTCCAGAGGAACCTGACCAGAAGATGAGCGAA 
AATGAGTCACTTCTTAGCCCTGAGCAGAAATTAAGGGACGGAGAGTTACATGATCTATCA 
CTTGGTGAAGCCAGTATGGAGAAGAATGAACCTGAGTTGGCGTTGACTGTGAATCCAGAG 
AACTTAACGGAAAACGGTGACAACATGGAATCTTCTTGCACATCTTCATGTGCAGGAGGA 
GCCCAGTGGGATGTCTTTCGACGCCAAGACGTCCCAAAGTTGTCCGGGTATTTGCAGAGA 
ACATTCCAGAAGCCTGATAATATCCAGACTGATTTTGTAAGCCGTACCTGCTAATTCAAA 
TAAATGAAGTGTGTAAAGTCTTGTATGTGGAATGATTGAGTTTCCTAGTTTGTTTACTCT 
GGTTTCAGGTGTCACGCCCGTTGTATGAAGGATTGTCTTTAAATGAACACCACAAGAGAC 
AACTAAGAGACGAGTTTGGAGTTGAGCCATGGACATTTGAGCAACATCGTGGTGAGGCTA 
TCTTCATTCCGGCTGGATGTCCGTTCCAAATCACTAATCTTCAGTCGAATATTCAGGTGG 
CACTTGACTTCTTGTGCCCTGAAAGCGTTGGAGAGTCAGCAAGACTAGCTGAAGAAATCC 
GGTGTTTACCAAACGACCACGAGGCAAAACTTCAGATTCTAGAGATTGGAAAGATATCAT 
TATACGCAGCTAGCTCAGCCATTAAAGAGGTTCAGAAACTGGTCTTGGATCCAAAGTTTG 
GAGCAGAGCTTGGATTTGAAGACTCTAACTTAACCAAAGCAGTCTCTCACAACTTAGACG 
AGGCAACCAAGCGGCC 

>G286 Amino Acid Sequence (domain in AA coordinates: TBD) 

MNANEQTRSANGIGNGNGESIPGIPDDLRCKRSDGKQWRCTAMSMADKTVCEKHYIQAKK 

RAANSAFRANQKKAKRRSSLGETDTYSEGKMDDFELPVTSIDHYNNGLASASKSNGRLEK

RHNKSLMRYSPETPMMRSFSPRVAVDLNDDLGRDVVMFEEGYRSYRTPPSVAVMDPTRNR

SHQSTSPMEYSAASTDVSAESLGEICHQCQRKDRERIISCLKCNQRAFCHNCLSARYSEI 

SLEEVEKVCPACRGLCDCKSCLRSDNTIKVRIREIPVLDKLQYLYRLLSAVLPVIKQIHL 

EQCMEVELEKRLLEVEIDLVRARLKADEQMCCNVCRIPVVDYYRHCPNCSYDLCLRCCQD

LREESSVTISGTNQNVQDRKGAPKLKLNFSYKFREWEANGDGSIPCPPKEYGGCGSHSLN

LARIFKMNWVAKLVKNAEEIVSGCKLSDLLNPDMCDSRFCKFAEREESGDNYVYSPSLET

IKTDGVAKFEQQWAEGRLVTVKMVLDDSSCSRWDPETIWRDIDELSDEKLREHDPFLKAI

NCLDGLEVDVRLGEFTRAYKDGKNQETGLPLLWKLKDWPSPSASEEFIFYQRPEFIRSFP 

FLEYIHPRLGLLNVAAKLPHYSLQNDSGPKIYVSCGTYQEISAGDSLTGIHYNMRDMVYL

LVHTSEETTFERVRKTKPVPEEPDQKMSENESLLSPEQKLRDGELHDLSLGEASMEKNEP 

ELALTVNPENLTENGDNMESSCTSSCAGGAQWDVFRRQDVPKLSGYLQRTFQKPDNIQTD
FVSRTC* 

>G291 (124..1197)

CAAGAACCCAAAGATCTCTCTCTATTTGTTTGCCTTCTTCTTTCTTTCTGACTCAAACCC 
TCAAATCAATTCTCGCGATTAAGCAAAACCCTAGATTTATTCTACTCTTCGAAGTCGATT 
TCAATGGAAGGTTCCTCGTCAGCCATCGCGAGGAAGACATGGGAGCTAGAGAACAACATT 
CTCCCAGTGGAACCAACCGATTC^GCCTCCGACTVGTATATTCCACTACGACGACGCTTCA 
CAAGCCAAAATCCAGCAGGAGAAGCCATGGGCCTCCGATCCTAACTACTTCAAGCGCGTT
CACATCTCAGCCCTTGCTCTTCTCAAGATGGTGGTTCACGCTCGCTCCGGTGGCACAATC 
GAGATCATGGGTCTTATGCAGGGTAAAACCGAGGGTGATACAATCATCGTTATGGATGCT 
TTTGCTTTGCCTGTTGAAGGTACTGAGACTAGGGTTAATGCTCAGTCTGATGCCTATGAG 
TATATGGTTGAATACTCTCAGACCAGCAAGCTGGCTGGGAGGTTGGAGAACGTTGTTGGA 
TGGTATCACTCTCACCCTGGGTATGGATGTTGGCTCTCGGGTATTGATGTTTCGACACAG 
ATGCTTAACCAACAGTATCAGGAGCCATTCTTAGCTGTTGTTATTGATCCAACAAGGACT 
GTTTCGGCTGGTAAGGTTGAGATTGGGGCATTCAGAACATATCCAGAGGGACATAAGATC 
TCGGATGATCATGl^TCTGAGTATCAGACTATCCCTCTTAACAAGATTGAGGACTTTGGT 
GTACATTGCAAAC^GTACTACTCATTC^ 

CACCTTCTGGATCTCCTTTGGAAC7VAGTACTGGGTGAACACTCTTTCTTCTO 

TTGGGCAATGGAGACTATGTTGCCGGGCAAATATCAGACTTGGCTGAGAAGCTCGAGCAA 

GCGGAGAGTCAGCTCGCTAACTCCCGGTATGGAGGAATTGCGCCAGCCGGTCACCAAAGG 

AGGAAAGAGGATGAGCCTCAACTCGCGAAGATAACTCGGGATAGTGCAAAGATAACTGTC 

GAGCAGGTCCATGGACTAATGTCACAGGTTATCAAAGACATCTTGTTCAATTCCGCTCGT 

CAGTCCAAGAAGTCTGCTGACGACTCATCAGATCCAGAGCCCATGATTACATCGTGAAGT 

TGGTCTATTCTTTTGTTTTT^ 

GGCAATGCCCATTGTTCCCTATATCTCTAGTGTAGTATCTGCTTCAGACAAAGATCTTTG 
GGTTATTAAATGACATTAACATAAAAAAAA

>G291 Amino Acid Sequence (domain in AA coordinates: 132-160) 

MEGSSSAIARKTWELENNILPVEPTDSASDSIFHYDDASQAKIQQEKPWASDPNYFKRVH 

ISALALLKMVVHARSGGTIEIMGLMQGKTEGDTIIVMDAFALPVEGTETRVNAQSDAYEY

MVEYSQTSKLAGRLENVVGWYHSHPGYGCWLSGIDVSTQMLNQQYQEPFLAVVIDPTRTV

SAGKVEIGAFRTYPEGHKISDDHVSEYQTIPLNKIEDFGVHCKQYYSLDITYFKSSLDSH 

LLDLLWNKYWVNTLSSSPLLGNGDYVAGQISDLAEKLEQAESQLANSRYGGIAPAGHQRR 

KEDEPQLAKITRDSAKITVEQVHGLMSQVIKDILFNSARQSKKSADDSSDPEPMITS*

>G427 (49..1230)

TTTCCCTCTCCGAAACAGAAATTCAAAAACAAATTCAACACGAAAACGATGGCGTTTCAT 
AACAATCACTTTAATCATTTCACCGACCAACAACAACATCAGCCTCCTCCTCCGCCGCAA 
CAGCAGCAGCAACAACATTTTCAAGAATCAGCACCCCCTAATTGGCTCCTCCGCTCCGAC 
AACAACTTCCTCAATCTCCACACAGCTGCCACAGCCGCCGCTACAAGCTCCGATTCTCCT 
TCTTCCGCCGCCGCTAACCAGTGGCTCTCACGATCCTCATCCTTCCTCCAACGAGGCAAC 
ACCGCAAACAACAACAACAACGAAACATCCGGTGACGTCATCGAAGACGTTCCCGGCGGA 
GAGGAGTCAATGATCGGAGAGAAGAAGGAGGCGGAGAGGTGGCAGAATGCGAGACACAAG 
GCGGAGATACTGTCTCATCCACTATACGAGCAACTTTTGTCGGCACACGTGGCGTGCCTG 
AGGATCGCAACGCCGGTGGATCAGCTTCCGAGGATAGACGCACAGCTTGCTCAGTCTCAA 
AACGTCGTGGCTAAGTACTCTACTTTAGAAGCCGCTCAAGGACTCCTCGCCGGCGATGAC
AAGGAGCTTGACCACTTCATGACGCATTATGTACTATTGCTTTGCTCTTTCAAAGAACAA 
CTGCAACAGCATGTTCGTGTTCATGCAATGGAAGCTGTTATGGCCTGTTGGGAGATTGAA 
CAGTCGCTTCAAAGTTTTACAGGAGTATCTCCTGGTGAAGGCACAGGAGCAACAATGTCT 
GAGGATGAAGATGAGCAAGTAGAGAGTGATGCTCATTTGTTTGATGGAAGCTTAGATGGG 
TTAGGGTTTGGTCCTCTAGTTCCCACTGAGAGCGAGAGATCTTTGATGGAACGAGTCAGA 
CAAGAACTCAAACATGAACTCAAGCAGGGTTACAAGGAGAAAATTGTGGACATAAGAGAG 
GAGATACTGAGGAAGAGAAGAGCTGGAAAATTACCAGGAGACACCACCTCTGTTCTCAAA
TCATGGTGGCAATCTCATTCTAAGTGGCCTTACCCTACTGAGGAAGATAAGGCGAGGTTG 
GTGCAGGAGACGGGTTTGCAGCTCAAACAGATAAACAATTGGTTCATCAATCAAAGAAAG 
AGGAATTGGCATAGCAATCCATCTTCTTCTACCGTCTCAAAGAATAAACGCCGAAGCAAT 
GCAGGTGAAAACAGCGGAAGAGACCGTTGAGATCAAGCTTGCATGTAGAGATCCAAAAGC 
TTTATAGAAAGGTGGAGGCATGAAGACAAAGAATTCTTACACAACAAACGTAGGACGTAA 
TTTTGTGCCAGTACATGGTATGGCTTTCATATTTGGTAATGATTAGGGCCACACAAAATT
AAACCCCAAAGCATGATTTGTAATATGAGGTTTTAGATGGACTTTATGATAGGATCGTCA 
GTCTTCACTGCCATCTCCATTCTCCACCATCAATCCATCATTATATCTTGTGAAAAAAAA 
A 

>G427 Amino Acid Sequence (domain in AA coordinates: 307-370) 

MAFHNNHFNHFTDQQQHQPPPPPQQQQQQHFQESAPPNWLLRSDNNFLNLHTAATAAATS 

SDSPSSAAANQWLSRSSSFLQRGNTANNNNNETSGDVIEDVPGGEESMIGEKKEAERWQN 

ARHKAEILSHPLYEQLLSAHVACLRIATPVDQLPRIDAQLAQSQNVVAKYSTLEAAQGLL

AGDDKELDHFMTHYVLLLCSFKEQLQQHVRVHAMEAVMACWEIEQSLQSFTGVSPGEGTG

ATMSEDEDEQVESDAHLFDGSLDGLGFGPLVPTESERSLMERVRQELKHELKQGYKEKIV

DIREEILRKRRAGKLPGDTTSVLKSWWQSHSKWPYPTEEDKARLVQETGLQLKQINNWFI

NQRKRNWHSNPSSSTVSKNKRRSNAGENSGRDR*

>G509 (122..1054)

CTTCCTCCTTTGCTAATAAACTTTTCTTTGAACCTTACACGCCTTGTTGATATTACTCTC 
TTAAATATATATTTTCGTACATTAACACAGACATATATAAAGCTAAAGATTTCTTCACGT 
AATGGGTTTGAAAGATATTGGGTCCAAATTGCCACCGGGGTTTCGATTTCATCCAAGTGA 
TGAAGAGTTGGTTTGTCATTATCTTTGCAACAAGATTAGGGCCAAATCTGATCATGGTGA 
TGTTGATGATGATGATGATGATGTTGATGAAGCTTTGAAGGGTTCTACTGATCTTGTGGA 
GATTGACTTGCATATCTGTGAGCCATGGGAGCTTCCTGATGTGGCAAAGTTAAACGCAAA 
GGAATGGTACTTCTTCAGTTTCCGTGATCGAAAGTATGCTACTGGATATCGCACGAACAG 
AGCGACAGTAAGCGGATACTGGAAAGCAACAGGAAAAGATCGAACGGTGATGGATCCACG 
TACAAGGCAATTGGTAGGGATGAGAAAAACACTAGTGTTCTACAGAAACAGAGCACCAAA 
TGGGATCAAAACTACTTGGATCATGCACGAGTTCCGTCTTGAGTGTCCTAACATCCCACA 
TAAGGAAGACTGGGTCTTGTGCAGAGTGTTCAACAAAGGCAGAGACTCATCGCTACAAGA 
CAATAATTATTATAACAATGATAATCAGACGCAAAGGCTTGAAGTTAATGACGCTCCGGA 
TCTTAATTACAACAATCAGTTGCCACCTTTGCTATCATCCCCTCCTCATAATCATCAACA 
TGAGAAGATGAAAATCCAAGTTTGTGATCAGTGGGAGCAGCTAATGAAGCAGCCTTCAAG

GACCACCGGCCACCCCTATCATCACCATTGTCATCATCAAACCATAGCATGTGGTTGGGA 

GCAGATGATGATCGGTTCGCTGTCATCACCTTCGAGTCATGGCCCTGATCACGAGTCCTT 

TGCTAAATTTGCTTTACCGTCGACAATAACAACAGTGTCAACATCAGTGGTGATCATCAT 

CAGAATTATGAGAAGATTTTGTTGTCATCACTAGACATGACGAGTTTGGATCACGACAAG 

ACATGTATGGGATCATCATCGGATGGTGGTATGGTCTCTGATCTTCACATGGAATGTGGT 

GGATTGAGTTTTGAGACCGAAAATATCCTCGCTTTCCAATGAACATAATTCAAGGGGTTC 

GCCAATTTGTTGATTCGTGAATTATACAAACATTTTATCTATAGATTTATCACATTATCA 

AACATGTAAGTTGTGTGGCATTTGGGTATAGGGTTTGTGTGATTCTAGGTTTTTAGGACG 

ATGTATGTTGTTATATTTAGCGTGTTTTTAGGATTTATTCTCATTTTAAAATTATATGAA 

AACCCATTACTATGAATACAATTAGTTTTCTTTGTTGTAAATAATATTTTAGATTATCAA 
AAAAAAAAAAAAAA 

>G509 Amino Acid Sequence (domain in AA coordinates: 13-169) 
MGLKDIGSKLPPGFRFHPSDEELVCHYLCNKIRAKSDHGDVDDDDDDVDEALKGSTDLVE 
IDLHICEPWELPDVAKLNAKEWYFFSFRDRKYATGYRTNRATVSGYWKATGKDRTVMDPR 
TRQLVGMRKTLVFYRNRAPNGIKTTWIMHEFRLECPNIPHKEDWVLCRVFNKGRDSSLQD
NNYYNNDNQTQRLEVNDAPDLNYNNQLPPLLSSPPHNHQHEKMKIQVCDQWEQLMKQPSR

TTGHPYHHHCHHQTIACGWEQMMIGSLSSPSSHGPDHESFAKFALPSTITTVSTSVVIII
RIMRRFCCHH* 

>G519 (85..894)

CACAAAGATCCTCCGATTCGAAGGTTTATAAAAACTCAAAATCGAATCTTATCCACAAGA 
AAACAACAAGGTACTTTTCCAAAAATGAAGGCGGAGTTGAATTTGCCGGCGGGATTCCGA 
TTTCATCCGACGGACGAAGAGCTTGTCAAGTTCTATCTTTGCCGGAGATGTGCGTCAGAA 
CCGATTAACGTTCCGGTTATCGCAGAGATTGACTTGTACAAATTCAATCCATGGGAGCTT 
CCAGAAATGGCGTTGTACGGTGAGAAAGAATGGTACTTCTTCTCGCATAGAGACCGGAAA 
TACCCAAACGGGTCGAGACCAAACCGGGCAGCTGGAACCGGTTATTGGAAAGCGACTGGA 
GCTGATAAACCGATCGGAAAACCGAAGACGTTAGGGATTAAGAAAGCACTCGTCTTCTAC 
GCAGGAAAAGCTCCGAAAGGGATTAAAACGAATTGGATTATGCACGAGTATCGTCTCGCT 
AATGTCGATCGATCTGCTTCTACCAACAAGAAGAACAACTTAAGACTTGATGATTGGGTT 
TTGTGTCGGATATACAATAAGAAAGGAACAATGGAGAAGTATTTACCGGCGGCGGCTGAG 
AAACCGACGGAAAAGATGAGTACGTCGGACTCAAGATGCTCAAGTCACGTGATTTCACCG 
GACGTCACGTGTTCTGATAACTGGGAGGTTGAGAGTGAGCCCAAATGGATTAATCTGGAA 
GACGCGTTAGAGGCATTTAATGATGACACGTCCATGTTTAGTTCCATTGGTTTGTTGCAA 
AATGACGCCTTTGTTCCTCAGTTTCAGTACCAGTCCTCCGATTTCGTCGATTCGTTTCAG 
GACCCGTTCGAGCAGAAACCGTTCTTGAATTGGAATTTTGCTCCTCAAGGGTAAAAATAA 
TCGGCAAAAAGTTGAAGCTTTTCAGAGTCTTCGATCACCGGCATTGTGTCGGATCCTGAC 
CCGGAGACCAAGTCGGGTC^TACGATTACATAATCGGGTTATTGAGATTTCCACATTTGG 
ATTTCCGAGACTAACCAACTTAACGGATTCTGGGGTAATTGGGGGGTTTTGCACAGGTGA 
ATCACACTGAGTCAGCAAGTTTCGATTTTTTGGTTTTGTTTTGTA 

TCTAAAGATATCACGAAGTAGATTCAGAAGAACTGTAAAAGCAATTGTGACCACCCGTTA 

TGAATCATAAATATATTCAATGAAGC^TGAGCTTATTTTTTTTTTAAAAAAAAA 

>G519 Amino Acid Sequence (conserved domain in AA coordinates: 11-104) 

MKAELNLPAGFRFHPTDEELVKFYLCRRCASEPINVT^ 

KEWYFFSHRDRKYPNGSRPNRAAGTGYWKATGADKPIGKPKTLGIKKALVFYAG 

KTNWIMHEYRLANVDRSASTNKKNNLRLDDWVLCRIYNKKGTMEKYLPAAAEKPTEKMST

SDSRCSSHVISPDVTCSDNWEVESEPKWINLEDALEAFNDDTSMFSSIGLLQNDAFVPQF 
QYQSSDFVDSFQDPFEQKPFLNWNFAPQG*
>G561 (86..1168)
AATTTGTTTTTTTTTCTTTTGTC 

CTGTGTCATTACTCTGCATTGAGCAATGGGTAGCAACGAAGAAGGAAACCCCACTAACAA 
CTCTGATAAGCCATCGCAAGCTGCTGCTCCTGAGCAGAGTAATGTTCATGTGTATCATCA 
TGACTGGGCTGCTATGCAGGCATATTATGGGCCTAGAGTTGGTATACCTCAATATTACAA 
CTCAAATTTGGCGCCTGGTCATGCTCCACCGCCTTATATGTGGGCGTCTCCATCGCCAAT 
GATGGCTCCTTATGGAGCACCATATCCACCATTTTGCCCTCCTGGTGGAGTTTATGCTCA 
^CCTGGTGTTCAAATGGGCTCACAACCACAAGGTCCTGTTTCT 

TACAACCCCTTTGACCATTGATGCACCAGCTAATTCAGCTGGAAACTCAGAT(^TGGGTT 
CATGAAAAAGCTGAAAGAGTTCGATGGACTTGCAATGTCAATAAGCAATAACAAAGTTGG 
GAGTGCTGAACATAGCAGCAGTGAACATAGGAGTTCTCAGAGCTCCGAGAATGATGGCTC
TAGCAATGGTAGTGATGGTAATACAACTGGGGGAGAACAATCTAGGAGGAAAAGAAGGCA 
ACAAAGATCACCAAGCACTGGTGAAAGACCCTCATCTCAAAACAGTCTGCCTCTTAGAGG 
TGAAAATGAGAAACCCGATGTGACTATGGGGACTCCTGTTATGCCCACAGCAATGAGTTT 
CCAAAACTCTGCTGGCATGAACGGTGTGCCACAGCCATGGAATGAAAAAGAGGTTAAACG 
AGAGAAGAGAAAACAGTCAAACCGAGAATCTGCTAGGAGGTCAAGACTGAGGAAGCAGGC 
TGAAACAGAACAACTATCTGTCAAAGTTGACGCATTAGTAGCTGAGAACATGTCTCTGAG 
GTCTAAACTAGGCCAGCTAAACAATGAGTCTGAGAAACTACGGCTGGAGAACGAAGCTAT 
ATTGGATCAACTGAAAGCGCAAGCAACAGGGAAAACAGAGAACCTGATCTCTCGAGTTGA
TAAGAACAACTCTGTATCAGGTAGCAAAACTGTGCAGCATCAACTGTTAAATGCAAGTCC 
GATAACCGATCCTGTCGCGGCTAGCTGACCGTGGCCGCAACAATGAGAACCCGATATTTC 
TTCCTTTGGGTTGTGATTGTAACTTAAAAGGAGACTTTTTGTTTTTATTCTTAGATTTGT 
AGCTCTCTGCATAGTGAGCATAAATTGATGTAATATGGTTTAAGAGATTCGGTGTTCTCT 
GGTGTGTGCTGCAACCACATAATTGGTGATAGATAGGTTTAGTTATATAAGCAAATGTAT 
TAGAGATAAGGGGAGACATATTTGATGGTCTTT 

>G561 Amino Acid Sequence (domain in AA coordinates: 248-308) 
MGSNEEGNPTmSDKPSQAAAPEQSNVHVYHHD^ 

PPPYMWASPSPMMAPYGAPYPPFCPPGGVYAHPGVQMGSQPQGPVSQSASGVTTPLTIDA 

PANSAGNSDHGFMKKiKJSFDGLAMSISI^KVGSAEHSSSEHRSSQSSENDGSSNGSDGNT 

TGGEQSRRKRRQQRSPSTGERPSSQNSLPLRGENEKPDVTMGTPVMPTAMSFQNSAGMNG 

VPQPWNEKEVKREKRKQSNRESARRSRLRKQAETEQLSVKVDALVAENMSLRSKLGQLNN

ESEKLRLENRAILDQLKAQATGKTENLISRVDKNNSVSGSKTVQHQLLNASPITDPVAAS 
* 

>G590 (102..1223)

TCGACAGACACTCTCCCTCTCTCCATGCCCATAAAATCTCAAAGACTGTTTAAAAAAAAA 
AATGTTTTAGCTTTAACTGCTTTTTTTTTGTTGTTGGTGTAATGATATCACAGAGAGAAG 
AAAGAGAAGAGAAGAAGCAGAGAGTGATGGGAGATAAGAAATTGATTTCATCTTCTTCTT 
CTTCCTCGGTTTACGATACTCGTATCAATCATCATCTTCATCATCCTCCGTCTTCTTCCG 
ACGAAATCTCTCAGTTTCTCCGGCATATTTTCGACCGTTCTTCTCCTTTACCTTCTTACT 
ACTCCCCGGCGACGACTACAACGACGGCGTCTTTGATTGGTGTGCACGGGAGCGGTGACC 
CACATGCAGATAACTCGAGAAGTCTCGTTTCTCATCATCCACCGTCAGATTCTGTGCTTA 
TGTCGAAACGTGTCGGAGATTTCTCTGAGGTTTTAATCGGCGGAGGATCAGGCTCAGCCG 
CCGCGTGTTTTGGTTTCTCCGGTGGTGGTAATAATAACAACGTTCAAGGAAATAGCTCTG 
GGACTCGAGTATCGTCTTCTTCCGTTGGAGCTAGTGGCAACGAGACAGATGAGTATGACT 
GTGAAAGCGAGGAAGGAGGAGAAGCTGTAGTTGATGAAGCTCCCTCTTCCAAGTCAGGTC 
CTTCTTCTCGTAGTTCATCTAAAAGATGCAGAGCTGCTGAAGTTCATAATCTCTCTGAGA 
AGAGGAGGAGAAGTAGAATTAATGAAAAAATGAAAGCTTTACAAAGTCTCATCCCTAATT 
CAAATAAGACGGATAAGGCTTCAATGCTTGATGAAGCCATTGAGTATCTGAAACAGCTTC 
AGCTCCAAGTTCAGATGTTGACTATGAGAAATGGAATAAACTTGCATCCTTTGTGTTTAC 
CTGGAACTACATTACACCCATTGCAACTCTCTCAGATTCGACCCCCTGAAGCAACCAATG 
ATCCTCTGCTTAATC^TACCAATCAGTTTGCTTCGACTTCTAATGCACCGGAAATGATCA 
ATACTGTGGCTTCTTCATACGCTTTGGAACCTTCTATTCGCAGTCACTTTGGACCTTTCC 
CTCTCCTTACTTCACCCGTGGAGATGAGTCGGGAAGGTGGGTTAACTCATCCAAGGTTGA 
ACATTGGTCATTCCAACGCAAACATAACCGGGGAACAAGCTCTGTTTGATGGACAACCTG 
ACCTAAAAGATCGAATTACTTGAACAGTGTCCCAACTTCGGGATCTCTATGTGTTCTTGT 
TTCTTAGAACGCAAGCCATAAAGCTGTCTGAC 

>G590 Amino Acid Sequence (domain in AA coordinates: 202-254) 

MISQREEREEKKQRVMGDKKLISSSSSSSVYDTRINHHLHHPPSSSDEISQFLRHIFDRS 

SPLPSYYSPATTTTTASLIGVHGSGDPHADNSRSLVSHHPPSDSVLMSKRVGDFSEVLIG 

GGSGSAAACFGFSGGGNNNNVQGNSSGTRVSSSSVGASGNETDEYDCESEEGGEAVVDEA

PSSKSGPSSRSSSKRCRAAEVHNLSEKRRRSRINEKMKALQSLIPNSNKTDKASMLDEAI 

EYLKQLQLQVQMLTMRNGINLHPLGLPGTTLHPLQLSQIRPPEATNDPLLNHTNQFASTS

NAPEMINTVASSYALEPSIRSHFGPFPLLTSPVEMSREGGLTHPRLNIGHSNANITGEQA 

LFDGQPDLKDRIT* 

>G818 (65..1060)

GTATTTCTTACAATAAACGACCAAAAAGTTAATACAAGAAATAGAAACGGTGTAGGAAGC 
TACTATGACGGCAATTCCAAACGTCGTCGATATTGAATCTTCTTCCTCTTCGCTTTGTCA 
AGAGACGGCAACGGAGACCGTCACCGTTGAAAGAGGCTCGTCTGATTCATCTTCAAAGCC
AGACGACGTCGTTTTACTAATCAAGGAAGAGGAGGATGACGCGGTTAACTTGTCACTTGG 
TTTTTGGAAATTGCACGAGATAGGTTTAATAACACCGTTCTTGAGAAAGACGTTTGAGAT 
CGTCGATGACAAAGTAACAGACCCGGTTGTATCATGGAGCCCGACCCGTAAAAGCTTTAT 
CATTTGGGATTCTTACGAGTTCTCAGAGAATCTACTTCCCAAATACTTCAAGCACAAGAA 
CTTCTCCAGTTTTATTCGTCAGCTTAACTCTTACGGTTTTAAAAAGGTCGATTCAGATAG 
GTGGGAATTTGCTAACGAAGGGTTTCAAGGAGGGAAGAAACATTTGCTTAAGAACATCAA 
GAGGAGAAGCAAAAACACTAAATGTTGTAACAAGGAAGCGAGTACCACCACGACAGAGAC 
TGAGGTTGAGTCATTGAAGGAGGAACAGAGTCCAATGAGATTGGAGATGTTGAAGCTGAA 
ACAACAACAAGAAGAATCTCAACATCAGATGGTCACTGTGCAGGAGAAGATCCACGGAGT 
TGATACCGAACAACAGCATATGCTTAGTTTCTTTGCAAAGTTGGCTAAAGATCAAAGATT 
TGTAGAGAGACTGGTGAAGAAGAGAAAGATGAAAATACAGAGAGAGCTAGAAGCAGCTGA 
ATTCGTGAAGAAGCTCAAGTTGCTTCAGGATCAAGAAACTCAAAAGAACTTGTTAGATGT 
AGAAAGAGAATTTATGGCCATGGCTGCAACAGAACACAATCCCGAGCCTGACATTTTGGT 
GAACAATCAAAGCGGGAATACGAGATGTCAGCTTAACTCAGAGGACCTACTTGTTGACGG 
TGGCTCAATGGATGTAAATGGGAGGATAGAGATAGAGTAGAGCAAAACCGGTAACATAGC 
AATAGAGAAGGTACCAAATCCCAAGGCTTGAGATCCGAAT 

>G818 Amino Acid Sequence (domain in AA coordinates: 70-162) 
MTAIPNVVDIESSSSSLCQETATETVTVERGSSDSSSKPDDVVLLIKEEEDDAVNLSLGF
WKLHEIGLITPFLRKTFEIVDDKVTDPVVSWSPTRKSFIIWDSYEFSENLLPKYFKHKNF 
SSFIRQLNSYGFKKVDSDRWEFANEGFQGGKKHLLKNIKRRSKNTKCCNKEASTT^ 

VESLKEEQSPMRLEMLKLKQQQEESQHQMVTVQEKIHGVDTEQQHMLSFFAKLAKDQRFV
ERLVKKRKMKIQRELEAAEFVKKLKLLQDQET^ 

NQSGNTRCQLNSEDLLVDGGSMDVNGRIEIE* 
>G849 (218..2077)

AACTCGAGAATTCTTCATTTCTTTTAAATCTTAGAATCTCGAGTTTTTGTATAAATCGAT 

TCTAATTTTTCCTTTGTACATTGTTTTATATATACATAAAACACACAAATCGGGTATGGG 

GGAATTTGGGTTTTAAGATAGCGTGATCTGTAATAATAAGTGGTTCGCGATCGTGATCAA 

GAAACTGGTGGCTGATAGTGATATGCATATTTGAGAGATGGTGTTCAAGAGAAAGTTAGA 

TTGCCTTTCCGTGGGATTTGATTTTCCCAACATTCCCAGAGCTCCTCGTTCATGCAGGAG 

GAAGGTTCTAAACAAGAGGATTGATCATGATGATGATAACACTCAGATCTGTGCAATTGA

CTTACTAGCTTTGGCTGGAAAGATTCTACAGGAAAGCGAGAGTTCCTCTGCGTCTTCTAA 

TGCATTTGAAGAAATTAAGCAAGAGAAAGTAGAAAATO^ 

TTCTGACCAAGGAAACTCTGTGTCAAAGCCTACTTATGATATCTCTACTGAGAAGTGTGT 
GGTGAACAGTTGTTTTTCATTTCCGGATAGTGACGGCGTTTTGGAGCGGACTCCGATGTC 
TGATTACAAGAAGATTCATGGTTTGATGGATGTAGGGTGTGAAAACAAGAATGTAAATAA 
TGGGTTCGAGCAAGGAGAAGCAACCGATCGCGTGGGTGATGGAGGCTTAGTCACTGATAC 
TTGCAACTTAGAGGATGCAACTGCGTTAGGTCTGCAGTTTCCGAAATCAGTCTGTGTGGG 
TGGTGATTTAAAATCACCATCCACCTTGGATATGACCCCTAATGGTTCCTATGCTAGACA 
TGGGAACCATACTAACCTAGGTAGAAAAGATGATGATGAAAAATTCTATAGTTACCATAA 
ACTTAGCAATAAATTTAAGTCGTATAGGTCTCCAACAATTCGAAGAATAAGAAAGTCCAT 
GTCGTCCAAATACTGGAAACAAGTTCCAAAAGATTTTGGATACAGTAGAGCTGATGTGGG 
TGTGAAGACTCTTTATCGCAAAAGAAAATCATGTTATGGTTACAACGCATGGCAGCGTGA 
GATCATTTATAAGAGAAGAAGATCACCTGACAGAAGCTCGGTCGTAACTTCTGATGGAGG 
ACTCAGTAGTGGAAGTGTTTCCAAGTTACCCAAGAAGGGAGATACAGTAAAGCTAAGCAT 
TAAGTCCTTTAGGATTCCAGAGCTTTTTATTGAAGTTCCAGAAACTGCAACAGTAGGATC 
ACTAAAGAGGACTGTGATGGAGGCTGTCAGTGTTTTACTCAGCGGAGGAATACGTGTTGG 
GGTGTTAATGCATGeGAAGAAGGTTAGAGATGAAAGGAAAACTCTGTCCCAGACTGGGAT 
CTCATGTGATGAAAATCTAGACAACCTTGGGTTCACCTTGGAGCCTAGTCCCAGCAAAGT 
TCCCCTACCTTTGTGTTCTGAAGATCCTGCTGTGCCAACCGACCCTACAAGTTTGTCTGA 
ACGGTCTGCGGCGTCTCCTATGCTAGATTCTGGAATTCCACATGCAGATGACGTGATTGA 
TTCAAGAAATATTGTGGACAGTAACCTCGAATTAGTTCCATATCAGGGTGACATATCTGT 
TGATGAACCTTCATC^GATTCAAAAGAGCTTGTCCCACTTCCAGAGTTGGAAGTCAAGGC 
GCTTGCCATAGTTCCGTTGAACCAGAAACCTAAGCGTACTGAGCTAGCCCAGAGGAGAAC 
TAGGAGACCCTTCTCTGTGACAGAGGTAGAAGCTCTTGTACAAGCAGTTGAGGAACTCGG 
GACTGGAAGATGGCGTGATGTAAAATTGCGTGCTTTCGAGGATGCAGATCATCGGACTTA 
CGTGGACTTGAAGGACAAATGGAAGACGCTAGTO(^CACAGCAAGTATATCCCCACAG^ 
ACGAAGAGGAGAGCCGGTGCCACAAGAACTGCTAGACAGAGTCTTGAGGGCATACGGGTA
TTGGTCGCAGCACCAAGGAAAACATCAGGCGAGAGGAGCGTCCAAAGATCCAGACATGAA 
CAGAGGTGGAGCTTTTGAATCAGGTGTTTCAGTGTAAAAAAGGAGGTACGCATTGGTGGG 
TGGGTGTACAGAAGCAAACAACACAATAAATGGACAACTCAATTTCTGCAAAGTTTAATT 



>G849 Amino Acid Sequence (domain in AA coordinates: 324-413, ! 

MVFKRKLDCLSVGFDFPNIPRAPRSCRRKVLNKRIDHDDDNTQICAIDLLALAGKILQES

ESSSASSNAFEEIKQEKVENCKTIKSESSDQGNSVSKPTYDISTEKCVVNSCFSFPDSDG

VLERTPMSDYKKIHGLMDVGCENKNVNNGFEQGEATDRVGDGGLVTDTCNLEDATALGLQ

FPKSVCVGGDLKSPSTLDMTPNGSYARHGNHTNLGRKDDDEKFYSYHKLSNKFKSYRSPT 

IRRIRKSMSSKYWKQVPKDFGYSRADVGVKTLYRKRKSCYGYNAWQREIIYKRRRSPDRS

SVVTSDGGLSSGSVSKLPKKGDTVKLSIKSFRIPELFIEVPETATVGSLKRTVMEAVSVL

LSGGIRVGVLMHGKKVRDERKTLSQTGISCDENLDNLGFTLEPSPSKVPLPLCSEDPAVP 

TDPTSLSERSAASPMLDSGIPHADDVIDSRNIVDSNLELVPYQGDISVDEPSSDSKELVP 

LPELEVKALAIVPLNQKPKRTELAQRRTRRPFSVTEVEALVQAVEELGTGRWRDVKLRAF 

EDADHRTYVDLKDKWKTLVHTASISPQQRRGEPVPQELLDRVLRAYGYWSQHQGKHQARG 

ASKDPDMNRGGAFESGVSV*
>G892 (21..1004)

TATAACAATTCCTTCCAACAATGTCATTGAGTCAGCCAATAACACGGACCGATAGTGCAC 

CCAATGGAGCATTTAGGACTTTTGGTCTCTACTGGTGCTACCATTGTGATCGTATGGTCA 

GAATTGCATCCTCTAACCCATCAGAGATCGCCTGTCCTCGATGTTTGAGGCAATTTGTCG 

TTGAGATTGAAACGAGACAACGGCCTCGGTTTACTTTCAACCATGCTACTCCGCCTTTTG 

ATGCTTCTCCTGAGGCTCGTCTTCTCGAAGCTCTCTCGCTCATGTTTGAGCCTGCAACCA 

TAGGTAGGTTTGGTGCAGACCCATTTCTTAGGGCAAGATCCAGAAACATCTTGGAACCTG 

AATCAAGACCCCGACCGCAACATCGAAGACGACACAGCCTTGACAATGTTAACAATGGTG 

GTTTACCTCTACCAAGAAGAACATATGTTATTCTCCGGCCCAATAATCCGACTAGTCCAC 

TCGGAAACATAATTGCGCCACCAAATCAAGCACCACCACGGCATGTGAACTCACATGATT 

ACTTTACTGGAGCATCAAGCTTAGAGCAGCTGATTGAACAACTAACACAAGACGATAGGC 

CTGGACCACCACCTGCGTCAGAACCCACCATTAATTCCCTACCATCTGTGAAAATAACAC 

CACAACATCTAACTAACGACATGTCCCAATGCACAGTGTGCATGGAAGAATTCATTGTTG 

GTGGGGACGCAACGGAATTACCATGTAAACATATTTACCATAAAGATTGTATAGTCCCGT 

GGCTTAGGCTTAACAATTCTTGCCCTATCTGCCGCCGTGACCTGCCACTTGTCAACACCG 

TTGCTGAATCTCGAGAAAGGAGCAATCCTATTAGACAAGACATGCCTGAAAGAAGGCGTC 

CAAGGTGGATGCAACTCGGTAACATTTGGCCATTTAGAGCAAGATACCAAAGGGTTAGTC 

CAGAAGAAACAGCAAACCAGAATCCTCGAGATAACAGGAGCTAACTCTGAATATTCCATG 

GGAAATAAAAATCGTGACTATCTATATGTATAGACTCTATGAGACATTGTCTATTTGAAT 

GTGCATGTATATCTCAGAAATAAACTCAAGCGAAACATATTTAACGACTAAAAAAAA 

>G892 Amino Acid Sequence (domain in AA coordinates: 177-270) 

MSLSQPITRTDSAPNGAFRTFGLYWCYHCDRMVRIASSNPSEIACPRCLRQFVVEIETRQ 

RPRFTFNHATPPFDASPEARLLEALSLMFEPATIGRFGADPFLRARSRNILEPESRPRPQ 

HRRRHSLDNVNWGGLPLPRRTYVILRPHNPTSPLGNIIAPPNQAPPRHVNSHDYFTGASS 

LEQLIEQLTQDDRPGPPPASEPTINSLPSVKITPQHLTNDMSQCTVCMEEFIVGGDATEL 

PCKHIYHKDCIVPWLRLNNSCPICRRDLPLVNTVAESRERSNPIRQDMPERRRPRWMQLG

NIWPFRARYQRVSPEETANQNPRDNRS*
>G961 (1..1200) 

ATGTCAAAATCTATGAGCATATCAGTGAACGGACAATCTCAAGTGCCTCCTGGGTTTAGG 
TTTCATCCGACCGAGGAAGAGCTGTTGCAGTATTATCTCCGGAAGAAAGTTAATAGCATC 
GAGATCGATCTTGATGTCATTCGCGACGTTGATCTCAACAAGCTCGAGCCTTGGGACATT 
CAAGAGATGTGTAAAATAGGAACAACGCCACAAAACGACTGGTATTTCTTTAGCCACAAG
GACAAAAAATATCCGACGGGAACGAGAACTAACAGAGCCACTGCGGCTGGATTTTGGAAA 
GCAACTGGCCGCGACAAGATCATATATAGCAATGGCCGTAGAATTGGGATGAGAAAGACT 
CTTGTTTTCTACAAAGGCCGAGCn'CCTCACGGCCAAAAATCTGATTGGATCATGCATGAA 
TATAGACTCGATGACAACATTATTTCCCCCGAGGATGTCACCGTTCATGAGGTCGTGAGT 
ATTATAGGGGAAGC^TCACAAGACGAAGGATGGGTGGTGTGTCGTATTTTCAAGAAGAAG 
AATCTTCACAAAACCCTAAACAGTCCCGTCGGAGGAGCTTCCCTGAGCGGCGGCGGAGAT 
ACGCCGAAGACGACATCATCTCAGATCTTCAACGAGGATACTCTQGACCAATTTCTTGAA 
CTTATGGGGAGATCTTGTAAAGAAGAGCTAAATCTTGACCCTTTCATGAAACTCCCAAAC 




'TTTTTCT 
CTCGAAAGCCCTAACAGTCAGGCAATCAACAACTGCCACGTAAGCTCTCCCGACACTAAT

CATAATATCCACGTCAGCAACGTGGTCGACACTAGCTTTGTTACTAGCTGGGCGGCTTTA 

GACCGCCTCGTGGCCTCGCAGCTTAACGGACCCACATCATATTCAATTACAGCCGTCAAT 

GAGAGCCACGTGGGCCATGATCATCTCGCTTTGCCTTCCGTCCGATCTCCGTACCCCAGC 

CTAAACCGGTCCGCTTCGTACCACGCCGGTTTAACACAGGAATATACACCGGAGATGGAG 

CTATGGAATACGACGACGTCGTCTCTATCGTCATCGCCTGGCCCATTTTGTCACGTGTCG 

AATGTTTTGCTGCTTGTTTGTCTCCTTCGTCTGCAGCTTCAGTTCTGGCCGTTCCAACCA 

TGGCAGAGGCAGGTTCATTTCGATCTTTCATCGCCTCAGATGCAGATCTCTCTCCATTGA 

>G961 Amino Acid Sequence (conserved domain in AA coordinates: 15-140) 

MSKSMSISVNGQSQVPPGFRFHPTEEELLQYYLRKKVNSIEIDLDVIRDVDLNKLEPWDI

QEMCKIGTTPQNDWYFFSHKDKKYPTGTRTNRATAAGFWKATGRDKIIYSNGRRIGMRKT

LVFYKGRAPHGQKSDWIMHEYRLDDNIISPEDVTVHEVVSIIGEASQDEGWVVCRIFKKK

NLHKTLNSPVGGASLSGGGDTPKTTSSQIFNEDTLDQFLELMGRSCKEELNLDPFMKLPN 

LESPNSQAINNCHVSSPDTNHNIHVSNVVDTSFVTSWAALDRLVASQLNGPTSYSITAVN

ESHVGHDHLALPSVRSPYPSLNRSASYHAGLTQEYTPEMELWNTTTSSLSSSPGPFCHVS

NVLLLVCLLRLQLQFWPFQPWQRQVHFDLSSPQMQISLH* 

>G1465 (163..1125)

TATCCTTCGCAAGACCCTTCCTCTATATAAGGAAGTTCATTTCATTTGGAGAGGACACGC 
TGACAAGCTGACTCTAGCTTATCTGGTACCGTCGACCTCATTCTTGCGTTTGATCTTTCT 
TTCTCTAGATCCCATATTTTTCTTGATCAATTTAGTTTCATTATGGAGGAAGATGCAGCT 
TTTGATCTACTCAAAGCCGAACTCTTAAACGCAGAAGACGATGCAATAATCTCACGTTAT 
CTGAAGCGTATGGTCGTCAACGGAGACTCATGGCCTGATCACTTCATCGAAGACGCAGAC 
GTGTTCAACAAGAATCCAAATGTGGAGTTCGATGCTGAGAGCCCTAGCTTCGTGATAGTT
AAACCTCGAACAGAGGCTTGTGGTAAAACCGATGGATGTGAAACTGGTTGCTGGAGGATC 
ATGGGTCGTGATAAACCGATAAAATCGACGGAGACTGTGAAGATTCAAGGGTTCAAGAAG 
ATTCTCAAGTTCTGCCTAAAGAGGAAACCTAGAGGATACAAGAGAAGTTGGGTAATGGAA 
GAGTATAGGCTTACCAATAACTTGAACTGGAAGCAAGATCATGTGATTTGCAAGATTCGG 
TTTATGTTTGAAGCTGAAATCAGTTTCTTGCTAGCCAAGCATTTCTACACTACATCAGAA 
TC^CTTCCTCGAAATGAGCTGTTGCCAGCTTACGGATTCCTTTCATCAGATAAGCAATTG 
GAGGATGTATCTTATCCGGTGACGATAATGACTTCTGAAGGAAACGATTGGCCTAGCTAC 
GTTACCAACAATGTGTATTGTCTGCATCC^TTGGAGCTCGTTGATCTTCAAGATCGGATG 
TTTAATGATTACGGAACCTGCATCTTCGCTAACAAGACTTGTGGTAAAACCGATAGATGC 
ATTAATGGTGGTTACTGGAAAATTTTGCACCGTGATAGGCTGATCAAGTCAAAGTCCGGG 
ATAGTTATTGGTTTCAAGAAGGTGTTTAAGTTTCATGAAACGGAGAAAGAAAGATACTTC 
TGTGGTGGAGAAGATGTGAAGGTAACTTGGACTCTAGAAGAGTATAGGCTTAGCGTGAAG 
CAGAATAAATTCTTGTGCGTTATCAAGTTTACTTATGATAACTAAGAATCTTTTCTTTGG 
ATTTTATGATCATCTTAGTATCGCGACCGCTCTAGACAGGCCTCGTACCGGATCCTCTAG 
CTAGAGCTTTCGTTCGTATCATCGGTTTCGACAACG 

>G1465 Amino Acid Sequence (conserved domain in AA coordinates: 242-306) 

MEEDAAFDLLKAELLNAEDDAIISRYLKRMVVNGDSWPDHFIEDADVFNKNPNVEFDAES

PSFVIVKPRTEACGKTDGCETGCWRIMGRDKPIKSTETVKIQGFKKILKFCLKRKPRGYK

RSWVMEEYRLTNNLNWKQDHV^ 

SSDKQLEDVSYPVTIMTSEGNDWPSYVTSmVYCM 

GKTDRCINGGYWKILHRDRLIKSKSGIVIG^ 

YRLSVKQNKFLCVIKFTYDN*

>G425 (45..1196)

GAAAACAGTCTTCTCTTCTCCGATCCCAAAAACGCAGGAAAACAATGTCGTTTAACAGCTCCC 

ACCTCCTTCCTCCA^AAGAAGACCTTCCTCTCCGACACTTCACCGATCAATCACAGCAACCTC 

CGCCGCAGCGTC^CTTCTCTGAAACACCTTCGCTTGTCACCGCCAGTTTCCTCAACCTCCCTA 

CCACCCTTACCACTGCGGATTCCGATCTCGCTCCTCCGCACCGCAACGGAGACAATTCCGTT 

GCTGATACAAACCCACGCTGGCTCTCCTTTCATTCGGAGATGCAAAATACTGGAGAAGTACG 

TTCTGAAGTTATCGACGGAGTCAACGCCGATGGTGAAACGATACTCGGCGTTGTAGGAGGT 

GAAGATTGGCGGAGTGCTAGCTATAAGGCGGCGATTTTAAGACATCCGATGTACGAGCAGC 

TTCTTGCGGCTCACGTGGCTTGCCTTAGGGTTGCGACTCCCGTTGACCAGATTCCGAGGATC 

GATGCTCAGCTCAGTCAGTTGCATACCGTCGCCGCGAAATACTCCACTCTTGGTGTGGTTGTT 

GACAACAAGGAACTTGATCATTTCATGTCACATTATC 

ACTCCAACACCACGTTTGTGTCCATGCAATGGAAGCCATTACGGCTTGTTGGGAGATTGAACA 
>G2069 (1..1026)

ATGGAAGGAGGAGGAAGAGGACCAAATCAAACGATTCTCAGTGAAATAGAACATATGCCT 
GAAGCTCCACGTCAACGTATCTCTCATCACCGTCGAGCTCGCTCTGAAACCTTCTTCTCC 
GGCGAATCAATCGACGATCTCCTCTTATTCGATCCTTCCGATATCGATTTCTCTTCTCTA 
GACTTCCTCAACGCTCCACCACCACCACAACAATCACAACAACAACCGCAAGCTTCTCCC 
ATGTCCGTTGATTCGGAAGAAACCTCATCGAACGGTGTTGTTCCTCCTAATTCTCTTCCT 
CCAAAACCCGAAGCTAGATTCGGTCGCCATGTTCGTAGCTTCTCGGTTGATTCCGATTTC 
TTCGATGATTTGGGTGTTACTGAGGAGAAGTTXATAGCTACAAGTTCAGGAGAGAAGAAG 
AAAGGGAATCATCATCATAGCAGGAGTAATTCTATGGATGGAGAGATGAGTTCGGCGTCG 
TTTAATATCGAATCGATTTTAGCTTCTGTGAGTGGTAAAGATAGTGGGAAGAAGAATATG 
GGTATGGGTGGTGATAGACTTGCTGAGCTTGCTTTGCTTGATCCTAAAAGAGCTAAAAGG 
ATTTTAGCGAATAGACAATCTGCGGCGAGGTCGAAAGAGAGGAAGATTAGGTATACTGGT 
GAGTTAGAGAGGAAGGTTCAGACACTTCAGAATGAAGCTACTACATTGTCTGCTCAAGTC 
ACTATGTTACAGAGAGGAACATCAGAGCTGAACACTGAAAATAAACACCTCAAAATGCGG 
CTTCAAGCTTTAGAGCAACAAGCTGAACTTAGGGATGCTTTGAATGAAGCGCTGCGGGAT 
GAACTGAACCGACTTAAGGTGGTAGCTGGAGAAATTCCTCAGGGGAATGGAAATTCTTAC 
AACCGTGCTCAATTCTCATCTCAGCAATCGGCAATGAATCAGTTTGGGAACAAAACGAAC 
C^CAGATGAGTACAAACGGGCAGCCATCGCTCCGAA.GCTACATGGATTTCACGAAGAGA 
GGCTGA 

>G2069 Amino Acid Sequence (domain in AA coordinates: TBD) 

MEGGGRGPHQTILSEIEHMPEAPRQRISHHRRARSETFFSGESIDDLLLFDPSDIDFSSL 

DFjmPPPPQQSQQQPQASPMSTOSEETSSNGVVPP 

FDDLGVTEEKFIATSSGEKKKGNHHHSRSNSm 

GMGGDRItAELALLDPKRAKRIIiANRQSAARSKERKIRYTGELER 

TMLQRGTSELNTENKHLKMRLQALEQQM 

in^QFSSQQSAMNQFGNKTNQQMSTNGQPSLPSYMDFTKRG* 

>G1852 (55..1857)

CATCTGATCTGCTCTCGAAGACGAAAGCTTCGAGTACTGGTTGAAGCTAAAGCTATGGGA 

CACGTGAATCTACCTGCATCAAAGCGTGGTAACCCTCGTCAATGGCGTCTCCTCGACATC 

GTAACCGCTGCTTTCTTCGGTATCGTACTTCTCTTCTTCATCCTTTTATTCACTCCTCTT 

GGTGATTCCATGGCGGCTTCTGGTCGGCAAACGCTGCTTCTCTCTACGGCGTCAGATCCG 

AGGCAACGGC^GCGATTAGTG&CTTTGGTTGAAGCTGGTCA^ 

TATTGTCCTGCGGAAGCTGOTGCTGATATGC 

CTTAGTAGAGAGATGAATTTCTATAGGGAGAGACATTGTCCTTTGCCTGAGGAGACTCGG 
CTCTGTTTGATTCCTCCGCCTTCTGGTTATAAAATTCCTGTTCCGTGGCCTGAGAGTCTT 
CACAAGATTTGGCATGCAAACATGCCATATAACAAAATTGCTGACCGGA2\AGGTCATCAA 
GGATGGATGAAAAGGGAAGGGGAATACTTTACTTTCCCAGGCGGTGGCACGATGTTTCCT 
GGCGGAGCTGGCCAATACATTGAAAAGCTOGCACAGTATATTCCGCTTAATGGTGGAACT 
TTGAGAACTGCTCTTGACATGGGATGCGGGGTAGCTAGTTTTGGAGGTACTCTACTATCT 
CAAGGCATTCTAGCCCTCTCATTTGCTCCAAGAGATTCACATAAATCTCAAATTCAGTTC
GCTTTOGAAAGAGGAGTGCCTGCIATTTGTTGCC^TGCTTGGCACTCGTAGACTCCCCTTT 
CCTG^TACTCCTTTGACCTGATGCACTGTTCCCGATGTTTGATTCCTTTTACGGCTTAC 
AATGCAACTTACTTCATCGAAGTAGATAGGTTACTGCGCCCTGGAGGATATCTTGTAATC 
TCTGGCCCACCTGTACAATGGCCTAAACAAGACAAAG 

GCTAGAGCTTTGTGCTATGAGCTAATTGCGGTTGATGGAAACACTGTCATCTGGAAGAAG 

CCTGTTGGAGATTCATGTCTACCTAGCCAGAATGAGTTTGGGCTTGAGTTGTGTGATGAG 

TCTGTTCCGCCAAGTGATGC^TGGTATTTTAAATTGAAGAGGTGTGTTACCAGGCCATCA 

TCCGTCAAAGGAGAACACGCTTTGGGAACTATATCCAAGTGGCCGGAGAGGCTTACTAAA 

GTTCCTTCTAGGGCCATTGTCATGAAAAACGGATTGGATGTGTTTGAAGCAGATGCAAGG 

CGGTGGGCAAGACGCGTTGCTTATTACAGGGATTCT^ 

ACTGTCCGCAATGTCATGGACATGAACGCATO 

TCTGATCCTGTGTGGGTTATGAATGTCATTCCAGCT 

ATTTATGAGAGAGGTCTCATCGGTGTTTAC^ 

CCCCGC^CGTATGATTTC^TCCATGTATC^ 

TCAAGCAAATCGAGGTGTAGCCTAGTAGATCTAATGGTAGAGATGGACAGAATATTACGT 
CCAG&AGGAAAGGTTGTGATCCX3AGACTCTCCTC 

GCTC^TGCTGTAAGATGGTCTTCTTCCATACACGAGAAAGAACCTGAATCCCATGGAAGA 
YDMPSSDGTGGYSGWTSESVQGSNPGGVFTMWNE*
>G761 (521..1549)

GGGGCCGACCGGCCGCCCGGGCAGGTCTAGGTTCAAAAGGACTCACAAGAGAGAGATAGT 

ATGATTGATAGGGAAAGAGAGAGAGATGAAAGAAAGTAAAATATATAATAGATTATTAGG 

ACACGAGTGTCATCTTTTGATTTGTGTCTTGTGTGCTCTCTCTTTCTTCTCTTCCTCGAA 

TGATCATCTTTATATAACCCTACTCTCTTTCTCTTTTCCCATTCTTTCATATCATTCTCC 

CTTTCTCTCTCGGGATCTGATCTCTCTTTCCAGTAACCTATTCCCGAGGAGCACTGTCAA 

ATCTTGTCCACTCTTTGATCTTATCTCGATCTCTTTCTCTTTCTAGTCTTGTGTAGTCTT 

CAAACTTGTGATGTTATCTATATAGTAATCACGAGAGAGAATCATACAATAGCTGAAACA 

TAAAGCTTTCTTAGAAGCTTTAAAAAGGTCTCATCTGGATTATCCTGTTTAATTTCTAGA 

GTTTCTTCAGGCAGATTATTAACCGATCAAGAAGACAAACATGAATTCATTTTCCCACGT 

CCCTCCGGGTTTTAGATTTCACCCGACAGATGAAGAACTTGTAGACTACTACCTGAGGAA 

AAAAGTCGCATCGAAGAGAATAGAAATTGATTTCATAAAGGACATTGATCTTTACAAGAT 

TGAGCCATGGGACCTTCAAGAGTTGTGCAAAATTGGGCATGAAGAGCAGAGTGATTGGTA 

CTTCTTTAGCCATAAAGACAAGAAGTATCCCACAGGGACTCGAACCAATAGAGCAACAAA 

AGCAGGGTTTTGGAAAGCCACCGGAAGAGATAAGGCTATCTATTTGAGGCATAGTCTAAT 

TGGCATGAGGAAAACACTTGTGTTTTACAAGGGAAGAGCCCCAAATGGACAAAAGTCTGA 

TTGGATCATGCACGAATACCGCTTAGAAACCGATGAAAACGGAACTCCTCAGGAAGAAGG 

ATGGGTTGTGTGTAGGGTTTTCAAGAAGAGATTGGCTGCAGTTAGACGAATGGGAGATTA 

CGACTCATCCCCTTCACATTGGTACGATGATCAACTTTCTTTTATGGCCTCCGAGCTCGA 

GACAAACGGTCAACGACGGATTCTCCCCAATCATCATCAGCAGCAGCAGCACGAGCACCA 

ACAACATATGCCATATGGCCTCAATGCATCTGCTTACGCTCTCAACAACCCTAACTTGCA 

ATGCAAGCAAGAGCTAGAACTACACTACAACCACCTGCAATCAAATATCGCGCATGAGGA 

ACAATTGAATCAAGGAAATCAGAACTTCAGCTCTCTATACATGAACAGCGGCAACGAGCA 

AGTGATGGACCAAGTCACAGACTGGAGAGTTCTCGATAAATTTGTTGCTTCTCAGCTAAG 

CAACGAGGAGGCTGCCACAGCTTCTGCATCTATACAGAATAATGCCAAGGACACAAGCAA 

TGCTGAGTACCAAGTTGATGAAGAAAAAGATCCGAAAAGGGCTTCAGACATGGGAGAAGA 

ATATACTGCTTCTACTTCTTCGAGTTGTCAGATTGATCTATGGAAGTGAGCTGAAAGAGA 

AGACATATAAATGCATATATACATATATATATATACGTACACACGAACACTAATCAAGTG 

TAGATGATGATGATGGTACAGATTTATATTTGCTTTGATTGATTCTTACTACATTATTGA 

ACTTATGTCATATGCATATATACATTGCGTATCTATGCATATTTATACTTGTACTCAATA 

TGATTAACCATATATAAACTCTAATCTAAATGTAACTCCAATATTTTTTAAATAGACAAT 

TGTCTCTTCTTATTAGAAAAAAAA 

>G761 Amino Acid Sequence (domain in AA coordinates: 10-156) 

MNSFSHVPPGFRFHPTDEELVDYYLRKKVASKRIEIDFIKDIDLYKIEPWDLQELCKIGH 

EEQSDWYFFSHKDKKYPTGTRTNRATKAGFWKATC 

PNGQKSDWIMHEYRLETOENGTPQEEGWWCR^ 

FMASELETNGQRRILPNHHQQQQHEHQQHMPYGLNASAYA^ 

SNIAHEEQLNQGNQNFSSLYI^SGNEQVI^QVTDWRVLDKFVASQLSNEEAATAS 

NAKDTSNAEYQVDEEKDPKRASDMGEEYTASTSSSCQIDLWK*

>G1056 (10..798)

GCTACATATATGGGTTCTATTAGAGGAAACATTGAAGAGCCTATATCTCAGTCATTAACG 

AGGCAGAACTCTCTCTATAGCTTAAAGCTCCATGAGGTTCA7UVCCCACTTAGGAAGTTCT 

GGAAAACCACTAGGAAGCATGAACCTTGATGAGCTTCTCAAGACTGTCTTGCCACCAGCT 

GAGGAAGGGCTTGTTCGTCAGGGAAGCTTGACGTTACCTCGAGATCTCAGTAAAAAGACA 

GTTGATGAGGTCTGGAGAGATATCCAACAGGACAAGAATGGAAACGGTACTAGTACTACT 

ACTACTCATAAGCAGCCTACACTCGGTGAAATAACACTTGAGGATTTGTTGTTGAGAGCT 

GGTGTAGTGACTGAGACAGTAGTCCCTCAAGAAAATGTTGTTAACATAGCTTCAAATGGG 

CAATGGGTTGAGTATCATCATCAGCCTCAACAACAACAAGGGTTTATGACATATCCGGTT 

TGCGAGATGCAAGATATGGTGATGATGGGTGGATTATCGGATACACCACAAGCGCCTGGG 

AGGAAAAGAGTAGCTGGAGAGATTGTGGAGAAGACTGTTGAGAGGAGACAGAAGAGGATG 

ATCAAGAACAGAGAATCTGCAGCACGTTCACGAGCTAGGAAACAGGCTTATACACATGAA 

TTAGAGATCAAGGTTTC^GGTTAGAAGAAGAAAACGAAAAACTTCGGAGGCTAAAGGAG 

GTGGAGAAGATCCTACCAAGTGAACCACCACCAGATCCTAAGTGGAAGCTCCGGCGAACA 

AACTCTGCTTCTCTCTGATCCTAAAGACTCTTCTTTCTTTCTTCTTCTT^ 

ATATCAGACCGCTTTGTTCTTTGTATATTGTGTAGACTTTATTGACTTTGAACAGCATGT 

CTTTATAAACATTTCTTGAGTGT 
>G1056 Amino Acid Sequence (domain in AA coordinates: 183-246)

MGSIRGNIEEPISQSLTRQNSLYSLKLHEVQTHLGSSGKPLGSMNLDELLKTVLPPAEEG 

LVRQGSLTLPRDLSKKTVDEVWRDIQQDKNGNGTSTTTTHKQPTLGEITLEDLLLRAGVV

TETVVPQENVVNIASNGQWVEYHHQPQQQQGFMTYPVCEMQDMVMMGGLSDTPQAPGRKR

VAGEIVEKTVERRQKRMIKNRESAARSRARKQAYTHELEIKVSRLEEENEKLRRLKEVEK

ILPSEPPPDPKWKLRRTNSASL* 

>G1447 (82..1086)

AAAAACCCTAACCCTAATTCTCTCAAGACAACTCAAAGGTCTCTCCTTTTTTAGGTTTAT 

TATCACTTCCGTATAATCGCCATGTCTTCTCTACCATGGAAAAAACCAAAATCGAGTCGA 

ATCTTAAGATTCATTTCTGAGTTTCAACAATCACCGTTCGTTGAAACTGGCTTTCCAACT 

TCTCTGATCGATCTCTTCTTCAAGAATCGCGATCGTCTAAAAAAATCTCCATCTAAACGC 

TTCCAACGAATCGAACGCCAGATTCGAACCGCTCCAAACGCTTCTTCGTTGAGTAATCAA 

GATACGATTTTTGAAAAGCCCTCGAGGATTAAAACCGTTCGAAGTAAGGTCGAGAAAGTT 

AATTGCGTTAAAGGTAAATCAGCGGCGTTGAAGAAGAACGCGATTAAAAATAGCGTTTTC 

GGCGGTAGCGGTGAGGTCGTTTTGATGGCGTTTAAGGTTTTGATAGTAGCGTTGCTCGCC 

TTGAGCACGAAGAAGAAGCTCACTTTAGGAATCACTCTCTCTGCCTTCGCTCTTCTCTTA 

ACAGAGCTCGTGGCGGCGCGTGTTTTCACGCGCTCTAATAACACCGACAAAGACAAAAAC 

GCGATTGCCCGCGAGAAAATCGAAACTTTTGATGAAACTCGAGTTCCCAAAGCGATTCCA 

TGTCCTGAGGAAACAGAGCATGTAGTATCTGAAACAGAGGTTTCGAAGTTGAAAGGTTTA 

ACGATACGTGATCTGTTGTCAAAGGACGAGAAATCAACAAGTAAAAGTTGGAGACTAAAA 

TCGAAGATTGTGAAGAAGTTGAGGAGTTACAATAAGAAGGATAAGAAGACGATGAAGATC 

AAAGAAGAGTCTTTGATTGAAGTCTCGAGTTTGGTTTTAGAAGATAAACCAAAGAAAATT 

GAGTCTGAGAGAGACGAAGAAGAAACGTTCAATCCTCCAGTGGTTGGATCAAACCTGAAT 

GGGATTGTTCTGATCGTGATTGTGCTAACCGGTTTGTTATGTGGGAAGGTCTTAGCTATT 

GTTCTGACACTATCATGTTTGGTTCTTAGATTAGGAGCAGTCAAAAAAGTTAATCTTTGC 

ATATAATTTTTTTTGTATTTTTTAACATGCTTGCATGTGAAACTGTAAATTTTTCTCATT 

CATATGAAGGAGATTGGATTGAATGTTGAATACTAAA 

>G1447 Amino Acid Sequence (domain in AA coordinates: 3-54, 124-156) 

MSSLPWKKPKSSRILKFISEFQQSPFVETGFPTSLIDLFFKNRDRLKKSPSKRFQRIERQ

IRTAPNASSLSNQDTIFEKPSRIKTVRSKVEKVNCVKGKSAALKKNAIKNSVFGGSGEVV 

LMAFKVLIVALLALSTKKKLTLGITLSAFALLLTELVAARVFTRSNNTDKDKKAIAREKI

ETFDETRVPKAIPCPEETEHVVSETEVSKLKGLTIRDLLSKDEKSTSKSWRLKSKIVKKL

RSYNKKDKKTMKIKEESLIEVSSLVLEDKPKKIESERDEEETLNPPVVGSNLNGIVLIVI

VLTGLLCGKVLAIVLTLSCLVLRLGAVKKVNLCI*
>G323 (77..826)

CTGCTCATATCAGCCATTGACACAGTTGCTTTGGGTTTCCCTCAAACGGCGCCGATTGTC 

TGGATTTTGACCACTGATGGCCTTAGATCAATCTTTTGAAGATGCTGCTTTACTTGGAGA

ACTCTATGGAGAAGGTGCATTTTGTTTCAAGAGCAAGAAACCTGAACCCATTACAGTCTC 

GGTTCCTTCTGATGATACTGATGATTCGAATTTTGACTGCAATATTTGCTTAGACTCGGT 

GCAAGAACCTGTTGTGACTCTCTGTGGTCACCTCTTTTGCTGGCCTTGTATTCACAAATG 

GCTTGATGTACAGAGCTTCTCAACAAGTGATGAATACCAAAGACATAGACAGTGTCCTGT 

TTGTAAATCTAAAGTTTCTCATTCTACTTTGGTTCCTTTGTATGGTAGAGGCCGTTGTAC 

TACTCAGGAGGAAGGTAAAAACAGTGTGCCTAAAAGACCCGTAGGACCGGTTTATCGGCT 

TGAAATGCCGAATTCACCTTATGCAAGTACTGATCTGCGGTTATCACAACGGGTTCATTT 

CAATAGCCCACAGGAAGGTTACTACCCTGTCTCAGGGGTGATGAGCTCGAACAGTTTATC 

ATACTCTGCTGTTTTGGATCCGGTGATGGTGATGGTTGGAGAAATGGTAGCTACGAGGTT 

GTTTGGAACACGAGTGATGGATAGATTTGCGTATCCGGACACTTACAATCTCGCAGGGAC 

TAGCGGGCCGAGGATGAGAAGGCGGATAATGCAGGCAGATAAATCGCTGGGAAGAATCTT 

CTTCTTCTTTATGTGTTGTGTTGTTCTGTGTCTTCTCTTGTTTTAGGTTTTCATAGCTAG 

CTTGGTTCTGCTACTGTTCAGTTTCTTCAGG 

>G323 Amino Acid Sequence (conserved domain in AA coordinates: 48-96)

MALDQSFEDAALLGELYGEGAFCFKSKKPEPITVSVPSDDTDDSNFDCNICLDSVQEPVV

TLCGHLFCWPCIHKWLDVQSFSTSDEYQRHRQCPVCKSKVSHSTLVPLYGRGRCTTQEEG 

KNSVPKRPVGPVYRLEMPNSPYASTDLRLSQRVHFNSPQEGYYPVSGVMSSNSLSYSAVL 

DPVMVMVGEMVATRLFGTRVMDRFAYPDTYNLAGTSGPRMRRRIMQADKSLGRIFFFFMC 

CVVLCLLLF*
>G176 (41..1606)
AGAAGAAGAAGAAGAAGAGTACCTCATACGTAAACCATTGATGGGCTCTTTTGATCGCCA

AAGAGCTGTTCCGAAATTCAAAACAGCAACACCGTCACCGCTCCCTCTTTCTCCTTCGCC 

TTACTTCACTATGCCTCCTGGCCTTACTCCCGCCGACTTTCTCGACTCTCCTCTTCTCTT 

CACTTCCTCCAACATTTTGCCGTCTCCTACGACAGGCACATTTCCAGCGCAATCTCTGAA 

CTATAACAATAACGGTTTGCTCATTGACAAAAATGAAATCAT^ATATGAAGACACAACTCC 

TCCCTTGTTCCTACCATCTATGGTAACTCAGCCTTTACCTCAACTGGATTTATTCAAATC 

CGAAATCATGTCGAGTAACAAAACCTCTGATGACGGCTACAATTGGCGCAAATACGGGCA 

GAAGCAAGTCAT^AGGAAGCGAAAACCCGAGGAGTTACTTCAAATGCACGTATCCAAATTG 

TCTCACAAAGAAGAAAGTAGAGACGTCTCTTGTGAAGGGTCAGATGATTGAGATTGTCTA 

TAAAGGAAGCCACAATCATCCCAAGCCCCAATCCACGAAGCGATCATCTTCCACCGCTAT 

AGCAGCACATCAGAACAGCAGTAATGGAGACGGTAAAGACATTGGTGAAGATGAAACAGA 

GGCCAAGAGATGGAAAAGAGAAGAGAATGTGAAGGAGCCAAGAGTGGTGGTTCAGACAAC 

AAGTGATATAGACATTCTTGACGATGGCTACAGATGGAGAAAGTATGGTCAGAAAGTCGT 

CAAGGGTAATCCAAATCCAAGGAGCTATTACAAGTGCACATTTACAGGATGTTTTGTAAG 

GAAACACGTTGAAAGAGCATTTCAAGATCCCAAGTCAGTGATCACAACTTACGAAGGAAA 

ACACAAACACCAAATCCCGACCCCAAGAAGAGGTCCAGTTTTAAGATCTGCTGCAATGGC 

TTCTCCTCTTCTCCCAACTTCGACTACTCCTGATCAACTTCCCGGCGGCGATCCACAGTT 

GCTGAGCTCTCTACGCGTCCTCTTGTCCCGCGTTCTAGCCACCGTCCGTCACGCTTCTGC 

AGATGCCAGACCCTGGGCAGAGCTCGTTGACCGGTCAGCGTTTTCCCGGCCACCATCGCT 

CTCGGAGGCAACGTCACGAGTAAGGAAGAACTTTTCCTATTTCCGAGCCAATTACATAAC

CTTAGTGGCAATCTTACTCGCCGCGTCTCTGCTCACGCACCCTTTCGCTCTCTTCCTCCT 

CGCATCGCTGGCCGCTTCTTGGCTTTTCCTCTACTTTTTCCGTCCGGCGGATCAGCCGTT 

GGTCATTGGAGGACGCACGTTCTCCGATCTTGAGACGCTAGGGATACTCTGCCTGTCCAC 

TGTGGTGGTGATGTTCATGACCAGCGTTGGATCGCTCTTGATGTCCACTCTAGCAGTTGG 

GATCATGGGCGTGGCCATCCACGGAGCGTTTCGTGCTCCCGAAGACCTGTTTCTTGAAGA 

ACAAGAAGCCATTGGATCTGGACTTTTCGCATTCTTCAACAACAATGCCTCTAATGCAGC 

TGCCGCTGCCATAGCCACCTCAGCAATGTCACGCGTTCGAGTCTGAGATTGTTGAAGAGA 

CTACATTCCTACACCGCATTTCCAAAGTGTGATATTTATTCATATTGAATTGTT 

>G176 Amino Acid Sequence (domain in AA coordinates: 117-173,234-290) 

MGSFDRQRAVPKFKTATPSPLPLSPSPYFTMPPGLTPADFLDSPLLFTSSNILPSPTTGT 

FPAQSLNYNNNGLLIDKNEIKYEDTTPPLFLPSMVTQPLPQLDLFKSEIMSSNKTSDDGY

NWRKYGQKQVKGSENPRSYFKCTYPNCLTKKKVETC 

RSSSTAIAAHQNSSNGDGKDIGEDETEAKRWKREENVKEPRVVVQTTSDIDILDDGYRWR

KYGQKVVKGNPNPRSYYKCTFTGCFVRKHVERAFQDPKSVITTYEGKHKHQIPTPRRGPV

LRSAAMASPLLPTSTTPDQLPGGDPQLLSSLRVLLSRVLATVRHASADARPWAELVDRSA

FSRPPSLSEATSRVRKNFSYFRANYITLVAILLAASLLTHPFALFLLASLAASWLFLYFF

RPADQPLVIGGRTFSDLETLGILCLSTVVVMFMTSVGSLLMSTLAVGIMGVAIHGAFRAP 

EDLFLEEQEAIGSGLFAFFNNNASNAAAAAIATSAMSRVRV* 

>G174 (194..1585)

CCCAATTTGAGATTGTTCGATTTCGATCTACGAGATTCTTACAAGAACATAAGCAGCTTC 
GGTTTTTTGGGATTATCTTATTTGGTCGGATGATGATCTTCTCGATGTCTGTGCTAGGCT 
TTGGGAATTAGATATATTTGGGGTTAAGCTCGAGTCTCTCCGGTTTTGAGTTTACTTGAG 
TTTGTTAGTATTTATGGCTGAGGTGGGAAAAGTTCTGGCTAGTGATATGGAGTTAGACCA 
TTCAAATGAGACTAAAGCAGTGGATGATGTTGTTGCCACTACTGATAAAGCGGAGGTCAT 
ACCAGTGGCTGTAACTAGAACTGAAACCGTTGTTGAAAGTTTGGAATCTACTGACTGTAA
GGAGCTTGAAAAACTTGTTCCACATACGGTAGCTTCGCAGTCGGAAGTAGATGTTGCTTC 
CCCGGTATCCGAGAAAGCACCGAAGGTTTCTGAAAGTAGCGGTGCATTATCTTTGCAGTC 
TGGTTCGGAAGGGAMAGTCCTTTTATTCGTGAGAAGGTTATGGAAGACGGATACAACTG 
GCGGAAATATGGACAGAAACTTGTGAAAGGAAATGAGTTTGTAAGGAGCTATTACAGGTG 
CACTCACCCTAACTGCAAAGCGAAAAAACAGTTGGAACGGTCTGCGGGTGGACAAGTCGT 
GGATACCGTTTACTTTGGGGAACATGATCACCCAAAGCCTCTTGCTGGTGCTGTTCCTAT 
CAATCAGGATAAGCGAAGTGATGTCTTCACAGCTGTTAGTAAAGAGAAAACATCTGGATC 
CAGTGTTCAGACACTTCGTCAAACCGAACCACCAAAGATCCATGGAGGATTACATGTTTC 
AGTTATTCCACCAGCTGATGATGTGAAAACTGATATTTCACAATCAAGTAGGATAACGGG 
GGACAACACTCACAAGGATTATAATAGTCCTACCGCAAAGCGAAGGAAGAAAGGAGGGAA 
CATTGAGCTGAGTCCAGTGGAGAGGTCAACCAATGATTCACGCATTGTGGTTCACACTCA 
GACTCTGTTTGATATTGTGAATGATGGGTACCGATGGCGTAAATATGGTCAGAAATCAGT 
AAAAGGCAGCCCATATCCAAGGAGCTACTATAGATGTTCAAGCCCTGGATGCCCCGTCAA
GAAACACGTAGAGAGGTCATCTCATGACACAAAGTTGCTTATAACAACTTACGAGGGAAA 
ACACGACCACGATATGCCTCCAGGAAGAGTTGTTACTCATAATAACATGCTGGACTCGGA 
AGTTGATGATAAAGAAGGAGATGCCAACAAGACTCCACAGAGCTCAACTCTTCAATCCAT 
TACAAAAGACCAGCATGTCGAAGATCACTTAAGAAAGAAAACGAAGACTAATGGCTTTGA 
GAAAAGTCTTGATCAAGGTCCAGTTTTGGATGAGAAGCTGAAGGAGGAAATAAAAGAGAG 
ATCAGATGCAAACAAAGATCACGCAGCCAATCACGCCAAGCCGGAAGCAAAGTCAGATGA 
TAAAACCACTGTTTGTCAAGAGAAGGCAGTAGGAACCCTGGAGAGCGAGGAACAAAAACC 
CAAGACAGAGCCTGCCCAAAGCTAAGCATTCAGTGTTGTACCGAGTGGTAATTTATATGG 
CTCTTTTAACATAGATTAGTACAGGCGATATGGTTATAGACTGTACAGTTGTTGTTCAGG 
CGGGACCAGATTTAGATTAGTGTTTAATGGAATAGTATGCTTTAATACCTTTATGTAACC 
ACTTCCATTTGGTTCAAATAAGAGTTACAGGAAGAGAAGGTAACACAACAAGAGCCCTTC 
TTTGTTGATGGAGCCTGTGTAATAGTTGTAGCATGGGGATGTATATGATTTGATTCAACC 

TTATTAATGGTTATGAGACAAAACTATC 

>G174 Amino Acid Sequence (domain in AA coordinates: TBD) 
^aevgkJlasdmeld^^^ 

Sphtvasqsevdvaspvsekapkvsessgalslqsgsegnspfirekvmedgynwrkyg 

q^vkgnefvrsyyrcthpnckakkqlersaggqv^^ 
rSvftavskektsgssvqtlrqteppkihgglhvsvippaddvktdisqssritgdnth 

SraSPTAKSKKGGNIELSPVERSTNDSRIVVHTQTLFDIVNDGYRWRKyGQKSVKGSP 

SSyyrSpgc^ 

EGDANKTPQSSTLQSITKDQHVEDHLRKKTKTNGFEKSLDQGPVLDEKLKEEIKERSDAN 
KDHAANHAKPEAKSDDKTTVCQEKAVGTLESEEQKPKTEPAQS* 

ATGGATACCAACAACCAGCAACCACCTCCCTCCGCCGCCGGAATCCCTCCTCCACCACCT
GGAACCACCATCTCCGCCGCAGGAGGAGGAGCTTCTTACCACCACCTTCTCCAACAACAA 
CAACAACAGCTCCAACTATTCTGGACCTACCAACGCCAAGAGATCGAACAAGTTAACGAT 
TTCAAAAACCATCAGCTTCCACTAGCTAGGATAAAAAAGATCATGAAAGCCGATGAAGAT 
GTTCGTATGATCTCCGCAGAAGCACCGATTCTCTTCGCGAAAGCTTGTGAGCTTTTCATT 
CTCGAGCTCACGATCAGATCTTGGCTTCACGCTGAGGAGAATAAACGTCGTACGCTTCAG 
AAAAACGATATCGCTGCTGCGATTACTAGGACTGATATCTTCGATTTCCTTGTTGATATT 
GTTCCTAGAGATGAGATTAAGGACGAAGCCGCAGTCCTCGGTGGTGGAATGGTGGTGGCT 
CCTACCGCGAGCGGCGTGCCTTACTATTATCCGCCGATGGGACAACCAGCTGGTCCTGGA 
GGGATGATGATTGGGAGACCAGCTATGGATCCGAATGGTGTTTATGTCCAGCCTCCGTCT 
CAGGCGTGGCAGAGTGTTTGGCAGACTTCGACGGGGACGGGAGATGATGTCTCTTATGGT 
AGTGGTGGAAGTTCCGGTCAAGGGAATCTCGACGGCCAAGGGTAA 

>G715 Amino Acid Sequence (domain in AA coordinates: 60-132) 
MDTNNQQPPPSAAGIPPPPPGTTISAAGGGASYHHLLQQQQQQLQLFWTYQRQEIEQVND
FKNHQLPLARIKKIMKADEDVRMISAEAPILFAKACELFILELTIRSWLHAEENKRRTLQ
KNDIAAAITRTDIFDFLVDIVPRDEIKDEAAVLGGGMVVAPTASGVPYYYPPMGQPAGPG
GMMIGRPAMDPNGVYVQPPSQAWQSVWQTSTGTGDDVSYGSGGSSGQGNLDGQG* 

ATCTGAAGTGAACCAAGCTCAGGTTTTGTCTTCTCTTTGATC^^ 

TAAATTAGAGTTATATCCTTTATAAAGGATTTTGCTTTTTC^CCAAGAAACCCTAAATTC 

GGTGTCTCAGCAAGAATCACGTGATTCTCGTTCCTCTTCCTCACGAAACCCATCATCTTC 

TATCTCATTTGAGAAATGGGTCAAAAGTTTTGGGAGAATCAAGAAGATCGAGCGATGGTT 

GAATCCACCATAGGCTCTGAAGCTTGCGACTTTTTCATCTCAACAGCTTCAGCTTCCAAC 

ACTGCCTTGTCCAASCTTGTCTCACCACCAAGTGATTCCAATCTCCAACAAGGGTTACGT 

CACGTTGTTGAAGGATCTGATTGGGATTATGCTCTTTTCTGGCTAGCGTCCAACGTTAAT 

AGCTCTGATGGTTGTOTCTTGATCTGGGGAGATGGTCATTGCCGTGTCAAAAAGGGTGCT 

TCAGGTGAGGATTACTCTCAGCAAGATGAGATCAAAAGACGTGTGCTTCGCAAGCTTCAC 

TTGTCGTTCGTTGGTTCAGATGAAGATCATCGTTTGGTGAAATCAGGAGCTCTTACTGAT 

CTCGACATGTTTTATCTGGCTTCTTTGTACTTTTCCTTTAGGTGTGATACCAATAAGTAC 

GGTCCTGCTGGAACCTATGTGTCTGGGAAGCCTCTTTGGGCTGCAGATTTGCCTAGCTGC 

TTGAGTTATTATAGGGTTAGGTCTTTCTTAGCTAGGTCAGCTGGTTTTCAGACTGTGTTG 

TCTGTACCAGTGAATTCTGGAGTTGTGGAGCTTGGTTCTTTAAGACATATTCCAGAAGAT 

AAGAGTGTGATTGAGATGGTGAAATCAGTGTTTGGTGGGTCTGACTTTGTTCAGGCTAAA 
GAAGCTCCTAAAATCTTTGGTCGACAGCTGAGTCTTGGTGGAGCAAAACCTCGGTCTATG

AGTATTAATTTCTCCCCGAAGACCGAGGATGACACGGGTTTCTCATTGGAATCGTATGAG 

GTGCAAGCGATCGGAGGCTCTAATCAAGTGTATGGTTATGAGCAAGGGAAAGATGAGACA 

TTGTATCTAACTGACGAGCAAAAGCCGAGGAAGAGAGGGAGAAAACCAGCAAATGGAAGA 

GAAGAGGCTCTAAACCATGTGGAAGCGGAACGGCAGAGGAGGGAGAAGCTGAACCAGAGA 

TTCTACGCTTTGAGAGCGGTGGTGCCTAACATCTCCAAGATGGACAAGGCTTCGCTCCTT 

GCAGACGCAATCACTTACATCACGGATATGCAGAAGAAAATCAGGGTGTATGAAACAGAG 

AAGCAGATAATGAAGAGGAGGGAGAGTAATCAGATAACTCCAGCAGAGGTTGATTATCAA 

CAGAGGCATGATGATGCAGTTGTAAGGCTAAGCTGTCCGTTGGAAACTCATCCAGTTTCA 

AAGGTGATACAAACGTTGAGGGAGAATGAAGTTATGCCTCATGATTCCAACGTGGCCATC 

ACAGAGGAGGGTGTGGTTCACACATTCACTCTCCGGCCTCAGGGTGGCTGCACCGCTGAG 

CAGTTGAAGGACAAGCTCCTTGCCTCTCTATCACAGTAACTATCACAGCAGTAACTGCTA 

TGTAATAAGTGTAACCGTGTTGGAGGTTGTATCAATGTACTATTGCAAGCCAACCAAAAA 

AAACTCCAGCTTAGTAGGATCGTGTAATTTTCCTTATATGTAATGTTGAGATTTGTCTTT 
TACATATAAAGATTTGA 

>G588 Amino Acid Sequence (domain in AA coordinates: 309-376)

MGQKFWENQEDRAMVESTIGSEACDFFISTASASNTALSKLVSPPSDSNLQQGLRHVVEG

SDWDYALFWLASNVNSSDGCVLIWGDGHCRVKKGASGEDYSQQDEIKRRVLRKLHLSFVG

SDEDHRLVKSGALTDLDMFYLASLYFSFRCDTNKYGPAGTYVSGKPLWAADLPSCLSYYR 

VRSFLARSAGFQTVLSVPVNSGVVELGSLRHIPEDKSVIEMVKSVFGGSDFVQAKEAPKI

FGRQLSLGGAKPRSMSINFSPKTEDDTGFSLESYEVQAIGGSNQVYGYEQGKDETLYLTD 

EQKPRKRGRKPANGREEALNHVEAERQRREKLNQRFYALRAVVPNISKMDKASLLADAIT

YITDMQKKIRVYETEKQIMKRRESNQITPAEVDYQQRHDDAVVRLSCPLETHPVSKVIQT

LRENEVMPHDSNVAITEEGVVHTFTLRPQGGCTAEQLKDKLLASLSQ*

>G1758 (69..677)

GTCCCTCCTCTTAGCTTCAACCGCCGGAAAAACTAAACAACCTTCTTGGAAAAAAAGAGA 
AACTAAAAATGAACTATCCTTCAAACCCTAACCCTAGCTCCACAGATTTCACTGAATTTT 
TCAAGTTCGATGATTTTGACGATACTTTTGAGAAGATCATGGAAGAAATCGGCCGTGAGG 
ACCACTCGTCGTCACCGACTTTGAGTTGGAGTTCATCGGAAAAGTTAGTGGCTGCAGAAA 
TCACAAGCCCGCTTCAAACAAGCCTAGCTACCTCACCTATGAGCTTTGAAATAGGTGACA
AAGATGAAATCAAAAAGAGGAAGAGACACAAAGAAGATCCGATTATTCACGTCTTCAAAA 
CGAAATCATCAATTGATGAAAAGGTTGCTTTAGATGATGGGTATAAATGGAGGAAATACG 
GAAAGAAGCCGATAACGGGTAGTCCATTTCCAAGGCATTATCACAAGTGTTCGAGCCCAG 
ATTGCAACGTGAAGAAGAAGATCGAAAGAGATACGAACAATCCAGATTACATATTGACAA 
CATACGAAGGTAGACATAACCACCCAAGCCCTTCTGTAGTTTATTGTGATTCAGACGACT 
TTGATCTTAACTCTCTCAACAATTGGTCCTTTC^ 

ATTCTGCTCCATATTGATCGATCGTAGTTACAAGTTTGTGTATATAGATGTATATATATA 
TATCACCAATTCACCATCGTAATCACGTCTCACATGTAACTACGTACATATATCTTGTTC 
GGGGTTCGTTTTGTAATGTATTGAATTGGTGGAGGTAGAATGGAAGTCATCTTGTATAGT 
TGTACTTGTATGTAAGGTTTGATAGTCATTTTTTATAAAGTAACTAATTTGTACAA 
>G1758 Amino Acid Sequence (domain in AA coordinates: TBD) 

MNYPSNPNPSSTDFTEFFKFDDFDDTFEKIMEEIGREDHSSSPTLSWSSSEKLVAAEITS 
PLQTSLATSPMSFEIGDKDEIKKRKRHKEDPIIHVFKTKSSIDEKVALDDGYKWRKYGKK

PITGSPFPRHYHKCSSPDCNVKKKIERDTNNPDYILTTYEGRHNHPSPSVVYCDSDDFDL 
NSLNNWSFQTANTYSFSHSAPY* 

>G2148 (66..737)

GTCTCTAATATAAGCTTGAACGTTGCTATATATAAATGTAAAGGCGAACGCATAAGAAAA 

GAAAAATGGAGAATGAAGCTTTTGTAGATGGTGAATTGGAGTCTCTTTTGGGGATGTTCA 

ACTTTGATCAATGTTCATCTAACGAATCGAGCTTTTGCAATGCTCCAAATGAGACTGATG 

TTTTCTCTTCTGATGATTTCTTCCCATTTGGTACAATTCTGCAAAGTAACTATGCGGCCG 

TTCTTGATGGTTCCAACCACCAAACGAACCGAAATGTCGACTCAAGACAAGATCTGT^ 

AACCAAGGAAGAAGCAAAAGTTAAGCTCGGAAAGCAATTTGGTTACCGAGCCTAAGACTG 

CTTGGAGAGATGGTCAAAGCCTAAGCAGTTATAATAGTTCAGATGATGAAAAGGCTTTAG 

GTTTAGTGTCTAATACATCAAAAAGCCTAAAACGCAAAGCGAAAGCCAACAGAGGGATAG 

CTTCCGATCCTCAGAGCCTATACGCTAGGAAACGAAGAGAAAGGATAAACGATAGGCTAA 

AGACATTGCAGAGCCTAGTTCCTAATGGGACAAAGGTCGATATAAGCACAATGCTGGAAG 

ATGCTGTCCATTACGTGAAGTTCCTGCAGCTTCAAATCAAGCTCTTGAGTTCAGAAGATC 
TATGGATGTATGCACCTCTTGCTCACAATGGTCTGAATATGGGACTACATCACAATCTTT
TGTCTCGGCTTATTTAAGACAAAATCATTGGAATAACATAACTTACAGTACTTGTTTTTT 
TTCTCGTTCTATATTCATGATTATGGTTATTTTTTGTTTGAGTTGTTCAATTTTTCTGTC 
TATTGCGTTCTATGAACTTGACACTCTTTTTGTAATTATTATATGCTAAAGACAATTTGG 
ACTAACAGCATTTTAATAAAAAAAAAAAA 

>G2148 Amino Acid Sequence (conserved domain in AA coordinates: 130-268)

MENEAFVDGELESLLGMFNFDQCSSNESSFCNAPNETDVFSSDDFFPFGTILQSNYAAVL 

DGSNHQTNRNVDSRQDLLKPRKKQKLSSESNLVTEPKTAWRDGQSLSSYNSSDDEKALGL 

VSNTSKSLKRKAKANRGIASDPQSLYARKKRERINDRLKTLQSLVPNGTKVDISTMLEDA

VHYVKFLQLQIKLLSSEDLWMYAPLAHNGLNMGLHHNLLSRLI*

>G2379 (52..798)

CGCCGTCACTCTCCTCCCGGTGCCGCACATTAGCAACACTACTCCCGACGAATGGAGACG 
ACGACGCCGCAGTCAAAATCAAGTGTGTCCCACCGACCGCCGTTGGGAAGAGAAGACTGG 
TGGAGTGAGGAAGCGACGGCGACGCTGGTAGAAGCCTGGGGCAATCGTTACGTCAAGCTG 
AACCACGGAAATCTCCGGCAGAATGACTGGAAAGACGTCGCCGACGCCGTTAACTCTAGA 
CACGGTGATAACAGCCGTAAGAAGACCGACTTACAGTGTAAGAACCGGGTCGATACTTTG 
AAGAAGAAGTACAAAACAGAGAAAGCTAAACTCTCGCCGTCGACTTGGCGTTTCTATAAC 
CGCCTCGATGTTCTAATCGGTCCCGTTGTGAAGAAATCGGCTGGCGGAGTTGTCAAATCA 
GCGCCTTTTAAGAATCATCTGAATCCAACTGGATCGAACTCTACTGGAAGCTCTCTTGAA 
GATGATGATGAGGATGATGATGAGGTTGGTGATTGGGAATTCGTTGCTAGGAAGCATCCT 
CGTGTGGAAGAGGTAGATCTGAGTGAAGGATCAACGTGTAGGGAACTAGCTACGGCGATT 
CTCAAGTTTGGAGAAGTTTACGAGAGAATTGAAGGGAAGAAGCAACAGATGATGATTGAG 
TTGGAGAAGCAGAGAATGGAAGTGACAAAGGAGGTAGAGTTAAAACGAATGAACATGTTG 
ATGGAGATGCAGTTAGAGATTGAGAAATCAAAGCACCGGAAACGCGCAAGTGCTTCAGGT 
AAGAAGAACTCACATTAGG 

>G2379 Amino Acid Sequence (domain in AA coordinates: 19-110, 173-232)

METTTPQSKSSVSHRPPLGREDWWSEEATATLVEAWGNRYVKLNHGNLRQNDWKDVADAV 

NSRHGDNSRKKTDLQCKRRVDTLKKK^ 

VKSAPFKNHLNPTGSNSTGSSLEDDDEDDDEVGDWEFVARKHPRVEEVDLSEGSTCRELA 

TAILKFGEWERIEGKKQQMMIELEKQRMEWKEra^ 

ASGKKNSH* 

>G1462 (63..1031)

CGTCGACCATTCTTGCGATTGATCTTTCTCTAGATAATTTTTTTGATCGATTTAGTTTCA 

TTATGGAGGACGACGACGCAGCTTATGATCTAATCAAACACGAACTGTTATACTCAGAAG 

ACGAAGTAATAATCTCACGTTATCTGAAGGGTATGGTCGTTAACGGAGATTCTTGGCCAG 

ATCACTTCATCGAAGACGCAAACGTGTTCACCAAGAATCCAGATAAGGTGTTCAATTCTG 

AGAGACCTAGATTCGTGATCGTTAAACCACGAACAGAGGCTTGTGGTAAAACCGATGGAT 

GTGATTCGGGTTGCTGGAGGATCATTGGTCGTGATAAACTGATAAAGTCGGAGGAGACTG 

GGAAGATTCTAGGGTTCAAGAAGATACTCAAGTTTTGCCTAAAGAGGAAACCTATAGACT 

ACAAGAGAAGTTGGGTAATGGAAGAGTATAGGCTTACCAATAACTTGAACTGGAAGCAAG 

ATCATGTGATTTGCAAAATTCGGTTTATGTTTGAAGCTGAAATTAGTTTCTTGCTAAGCA 

AGCATTTCTACACTACATCAGAATCGGTTCTTGAAAATGAGCTGTTGCCATCTTATGGAT 

ATTATTTATCCAATACACAAGAGGAGGATGAATTTTATCTGGACGCGATAATGACTTCGG 

AAGGAAACGAGTGGCCTAGCTACGTTACCAACAACGTGTACTGTCTGCATCCATTGGAGC 

TTGTGGATCTTCAAGATCGGATGTTTAATGATTACGGAACCTGCATCTTCGCTAACAAGA 

CTTGTGGTGAAACTGATAAATGCGATGGTGGTTACTGGAAGATCCTGCACGGTGATAAGC 

TGATCAAGTCAAATTTCGGAAAGGTCATTGGTTTCAAGAAGGTATTTGAGTTCTATGAAA 

CGGTGAGACAAATATATCTTTGTGATGGAGAAGAAGTGACGGTAACTTGGACTATACAAG 

AGTATAGGCTTAGCAAAAACGTGAAGCAGAATAAAGTGTTGTGCGTTATCAAGTTGACTT 

ATGATAGATAGGATACTTTACTTTGGTTTTTGTGATCATCTTAGTATCTTACGAATATTC 

TAGATACACACATCTATAGGCGACCGCTCTAGACAGGCCTCGTACCG 

>G1462 Amino Acid Sequence (domain in AA coordinates: TBD) 

MEDDDAAYDLIKHELLYSEDEVIISRYLKGMVVNGDSWPDHFIEDANVFTKNPDKVFNSE

RPRFVIVKPRTEACGKTDGCDSGCWRIIGRDKLIKSEETGKILGFKKILKFCLKRKPIDY

KRSWVMEEYRLTNNLNWKQDHVICKIRFMFEAEISFLLSKHFYTTSESVLENELLPSYGY

YLSOTQEEDEFYLDAIMTSEGNEWPSYVTNNVYCLHPLELVX)LQDRMFN^ 

CGETDKCDGGYWKILHGDKLIKSNFGKVIGFKKVFEFYETVRQIYLCDGEEVTVTWTIQE
YRLSKNVKQNKVLCVIKLTYDR*
>G1211 (44..1120)

TGAAACCTAGATTTCTGCAACTGAATTCCTAATTCGAAAAAGAATGGAGGGTTCGTCGTC 

GACGATAGCAAGGAAGACATGGGAACTAGAGAACAGCATTCTAACAGTAGACTCACCTGA 

TTCAACCTCCGACAACATCTTCTACTACGACGATACTTCACAGACTAGGTTCCAGCAAGA 

GAAACCGTGGGAGAATGATCCTCACTACTTTAAACGAGTCAAGATCTCAGCGCTCGCTCT 

TCTTAAGATGGTGGTTCACGCTCGCTCTGGTGGTACAATTGAAATAATGGGTCTTATGCA 

AGGTAAGACCGATGGTGATACTATCATTGTTATGGATGCTTTTGCTTTACCAGTGGAAGG 

TACTGAGACAAGGGTTAATGCTCAGGATGATGCTTATGAGTACATGGTTGAGTATTCACA 

GACCAACAAGCTCGCGGGGCGGCTGGAGAATGTTGTTGGATGGTATCACTCTCACCCTGG 

ATATGGATGCTGGCTCTCCGGTATTGATGTTTCTACGCAGACGCTTAACCAACAGCATCA 

GGAGCCATTTTTAGCTGTTGTTATTGATCCCACAAGGACTGTTTCAGCTGGTAAGGTTGA 

GATTGGTGCTTTCAGAACATACTCTAAAGGATATAAGCCTCCAGATGAACCTGTTTCTGA 

GTATCAAACTATTCCTTTAAATAAGATTGAGGACTTTGGTGTTCACTGCAAACAGTACTA 

TTCATTAGATGTCACTTATTTCAAGTCATCTCTTGATTCTCACCTTCTGGATCTACTATG 

GAACAAGTACTGGGTGAACACTCTTTCTTCTTCTCCACTGCTGGGTAATGGAGACTATGT 

TGCTGGACAAATATCAGACTTAGCTGAGAAGCTTGAGCAAGCCGAGAGTCATCTGGTTCA 

GTCTCGCTTTGGAGGAGTTGTGCCATCATCCCTTCATAAGAAAAAAGAAGATGAGTCTCA 

ACTAACTAAGATAACTCGGGATAGCGCAAAGATAACTGTGGAACAGGTCCATGGACTAAT 

GTCGCAGGTCATAAAAGATGAATTATTCAACTCAATGCGTCAGTCCAACAACAAATCTCC 

CACTGACTCGTCGGATCCAGACCCTATGATTACATATTGAAGTTGCTCTTCTTTTGGTTT 

CTANTTTTGGATTGACCCATCATTTGTTGTCCTTTCATTTATTTTCTGTTGTGTAAAGAA 

TTATAATGNCGNCGCGAATTCGCGGCCGCTAAAAAAANACAGGAAATTGAAAANAATTCN 

NCCATTCCAACATCTTTATTTAATATTATCTCCTCNATTATATAATATTCAAACATCCCT 

ANTANCTTCATTTGACCGTCCCCCTCCCTCCCGTGTTGCNTTGGTGCTGGCCCC 

>G1211 Amino Acid Sequence (domain in AA coordinates: 123-179) 

MEGSSSTIARKTWELENSILTVDSPDSTSDNIFYYDDTSQTRFQQEKPWENDPHYFKRVK 

I SALALLKMVVHARSGGTIEIMGLMQGKTDGDTI ^ 

MVEYSQTNKLAGRLENWGWYHSHPGYGCWLSGIDVSTQTL^ 

SAGKVEIGAFRTYSKGYKPPDEPVSEYQTIPLNKIEDFGVHCKQYYSLDVTYFKSSLDSH 
LLDLLWNKYWVOT^ 

KEDESQLTKITRDSAKITVEQVHGLMSQVIKDELFNSMRQSNNKSPTDSSDPDPMITY* 
>G1048 (5..892)

GACCATGGCGGAGGAATTTGGAAGCATAGATTTACTCGGAGATGAAGATTTCTTCTTCGA 
TTTCGATCCTTCAATCGTAATTGATTCTCTTCCGGCGGAGGATTTTCTTCAGTCTTCACC 
GGATTCATGGATCGGAGAAATCGAGAATCAATTGATGAACGATGAGAATCATCAAGAGGA 
GAGTTTTGTGGAATTGGATCAGCAATCGGTTTCAGATTTCATAGCGGATCTACTCGTTGA 
TTATCCAACTAGCGATTCTGGCTCCGTTGATTTGGCGGCTGATAAAGTTCTAACCGTCGA 
TTCTCCCGCCGCCGCTGATGATTCCGGGAAGGAGAATTCGGATTTGGTTGTTGAGAAGAA 
GTCTAATGATTCTGGTAGCGAGATTCATGATGATGATGACGAAGAAGGAGACGATGATGC 
TGTGGCTAAAAAACGAAGAAGGAGAGTAAGAAATAGAGATGCGGCGGTTAGATCGAGAGA 
GAGGAAGAAGGAATATGTACAAGATTTAGAGAAGAAGAGTAAGTATCTCGAAAGAGAATG 
CTTGAGACTAGGACGTATGCTTGAGTGCTTCGTTGCTGAAAACCAGTCTCTACGTTACTG 
TTTGCAAAAGGGTAATGGCAATAATACTACCATGATGTCGAAGCAGGAGTCTGCTGTGCT 
CTTGTTGGAATCCCTGCTGTTGGGTTCCCTGCTTTGGCTTCTGGGAGTAAACTTCATTTG 
CCTATTCCCTTATATGTCCGACACAAAGTGTTGCCTCCTACGTCCAGAACCAGAAAAGCT 
GGTTCTAAACGGGCTCGGGAGTAGTAGCAAACCGTCTTATACCGGCGTTAGTCGGAGATG 
TAAGGGTTCGAGGCCTAGGATGAAATACCAAATCTTAACCCTTGCGGCGTGACAACGCCT
TTTTTAACTGCTTCTTTTGCGCATTTTGAGTTGTAGATGAGTGTCTTTTAGTTTTCTCTC 
TCTTGTTTTGTATTTCGCTGTTGAAAGTTTTCTGTCTAATATCGATAAGTTAACAGTGAA 
AAAAAAAAAAAAAAA 

>G1048 Amino Acid Sequence (domain in AA coordinates: 138-190)

MAEEFGSIDLLGDEDFFFDFDPSIVIDSLPAEDFLQSSPDSWIGEIENQLMNDENHQEES 

FVELDQQSVSDFIADLLVDYPTSDSGSVDLAADK^ 

NDSGSEIHDDDDEEGDDDAVAKiaiRRRVRNRDAAWSRERKKEYVQDLEKKSKYLK 
RLGRMLECFVAENQSLRYCLQKGNGNNTTMMSKQESAVLLLESLLLGSLLWLLGVNFICL
FPYMSHTKCCLLRPEPEKLVLNGLGSSSKPSYTGVSRRCKGSRPRMKYQILTLAA* 
>G986 (31..846)

CATTAAATTGGCTCCTGTGAACCTAAATTTATGGACTATGATCCCAACACCAATCCGTTC 
GACCTTCATTTCTCCGGTAAACTTCCGAAAAGAGAAGTCTCGGCTTCAGCTTCTAAAGTT 
GTAGAGAAGAAATGGTTAGTGAAAGATGAGAAGAGAAATATGCTACAAGATGAAATAAAC 
CGGGTTAATTCGGAGAACAAGAAGCTAACCGAAATGTTAGCAAGAGTCTGTGAGAAGTAC 
TATGCTCTTAATAATCTTATGGAGGAGTTGCAGAGTCGAAAGAGTCCTGAAAGTGTTAAC 
TTTCAGAACAAACAGCTAACGGGGAAACGAAAACAAGAACTTGATGAGTTTGTTAGCTCC 
CCAATTGGACTCAGTCTCGGACCAATCGAGAACATCACCAACGATAAAGCGACGGTTTCA
ACCGCTTACTTTGCTGCTGAGAAGTCTGACACAAGCTTGACTGTGAAAGATGGATATCAA 
TGGAGGAAATACGGGCAAAAGATTACGAGAGATAATCCATCTCCTAGAGCTTACTTCAGA 
TGCTCGTTTTCACCGTCTTGTCTAGTCAAGAAGAAGGTGCAACGAAGTGCAGAAGATCCA 
TCTTTCTTGGTAGCCACTTACGAAGGGACACATAACCACACCGGACCACATGCAAGTGTG 
TCCAGGACAGTGAAACTTGATCTAGTTCAAGGTGGGCTTGAACCAGTTGAGGAAAAGAAA 
GAGAGAGGGACGATTCAAGAGGTTTTGGTGCAACAAATGGCTTCTTCGTTGACCAAAGAT 
CCTAAGTTCACTGCAGCTCTTGCGACTGCTATTTCCGGGAGATTGATAGAGCATTCAAGA 
ACATGTU^AGTTCTCTAGAACATGTATATTTCTGTTTTGTTCTATTTTGTTGCTCATTCCT 
AGTAAAAAGGTAAAGATTTGTTTGATCTTGATTAGGAGGCATAGATGTCAATTTTAATGT 
GTGTGTATATAATTACATCAAATCTAAGTATCCAAAAAGGGTCACCCCCATTTTATCTTA 
TG 

>G986 Amino Acid Sequence (domain in AA coordinates: 146-203) 
MDYDPNTNPFDLHFSGKLPKREVSASASKWEKKWLVro 

EMLARVCEKYYALNNLMEELQSRKSPESVNFQNKQLTGKRKQELDEFVSSPIGLSLGPIE
NITNDKATVSTAYFAAEKSDTSLTVKDGYQWRKYGQKITRDNPSPRAYFRCSFSPSCLVK
KKVQRSAEDPSFLVATYEGTHNHTGPHASVSRTVKLDLVQGGLEPVEEKKERGTIQEVLV
QQMASSLTKDPKFTAALATAISGRLIEHSRT*
>G789 (259..1593)

GGCAAGAAGAACCTTAGCCTCTCTTTCTTCTTTCTCTCTCTCTCTCTCTGTGGTACTGTT 

CTGTTTCAACTTTACTCCCTCAGTTTCAGAACAATTCCCTATCTAGAAGAGAGATAAAAC 

CGAGAAGGTTTTGGAGATAGAATCTTTTGTTCTTCTTTTGTCCCTCCTTGCTCGATTTTT 

GTTACGTGTGAAGCAATAAAAAAAAACTGATATAGCTAAATCTTCCATCCATTCAGAGGC 

TTCTAAATCTGATCTGACATGGAACAAGTGTTTGCTGATTGGAATTTTGAAGATAATTTT 

CACATGTCCACTAATAAAAGATCAATCAGACCAGAAGATGAATTAGTGGAGCTATTGTGG 

AGAGATGGTCAAGTGGTTTTACAAAGCCAAGCTCGTAGAGAACCGTCAGTCCAAGTCCAA 

ACCCACAAACAAGAAACCCTAAGAAAACCCAACAATATTTTTCTTGACAACCAAGAAACA 

GTACAAAAGCCTAACTACGCTGCTCTAGATGATCAAGAAACCGTCTCCTGGATACAATAC 

CCTCCGGATGACGTCATCGACCCTTTCGAATCCGAGTTCTCCTCTCATTTCTTCTCTTCG 

ATCGATCACCTCGGAGGTCCTGAGAAGCCACGAACGATCGAAGAGACAGTTAAGCATGAG 

GCTCAAGCCATGGCTCCTCCTAAGTTTAGATCCTCGGTTATAACAGTCGGACCGAGTCAT 

TGCGGCAGCAACCAGTCAACAAATATTCATCAGGCCACTACACTTCCGGTTTCTATGAGT 

GATAGAAGCAAGAACGTCGAAGAAAGACTTGACACTTCGTCAGGTGGCTCCTCCGGTTGC 

AGCTATGGAAGGAACAACAAAGAAACCGTTAGTGGAACAAGTGTAACCATTGACCGTAAA

AGAAAACATGTTATGGATGCTGATCAAGAATCTGTGTCTCAATCAGATATAGGTTTGACC 

TCAACCGATGATCAAACCATGGGTAACAAATCGAGCCAACGGTCAGGATCTACTCGAAGA 

AGCCGTGCAGCTGAAGTTCATAATCTCTCAGAAAGGAGGAGGAGAGATCGGATCAATGAA 

AGAATGAAAGCTCTTCAAGAACTCATACCTCACTGCAGCAGAACAGATAAAGCTTCGATA 

TTGGATGAAGCAATTGATTACTTAAAATCACTTCAAATGCAACTCCAAGTGATGTGGATG 

GGAAGTGGAATGGCGGCGGCGGCAGCAGCAGCAGCAAGTCCGATGATGTTTCCCGGGGTA 

CAATCATCTCCATACATTAATCAGATGGCTATGCAAAGTCAGATGCAATTGTCTCAATTC 

CCGGTTATGAACCGGTCCGCTCCGCAGAACCATCCCGGTTTAGTATGTCAAAACCCGGTA 

CAGTTGCAGCTCCAAGCACAGAACCAAATCTTATCGGAGCAGCTCGCTAGGTACATGGGC 

GGGATTCCCCAGATGCCGCCGGCGGGAAATCAGATGCAGACCGTGCAACAACAACCAGCG 

GACATGTTGGGATTTGGATCTCCGGCGGGACCGCAAAGTCAACTGTCGGCACCGGCGACC 

ACCGACAGTCTTCATATGGGTAAAATAGGCTGACTTGGCATATAGTTTTCCTCCGAAATT 

ATTCTTCTTACAGTTGGTGATTGTTATTTATTTTTGGTCGCCTAAGCAAGCATAAAAGCT 

AAGTCAAATGTATTATAGAGATCTAATAAGTTAGTCTCATACTTATAACTTATTTTTAAA 

CAGTTGAATTATAGTATCAATCAAGTGTTGGGAACCTAAAGATCATACATGTGTCAATAC 

TTTTATATTTGTTCTCAAGGTTCATCAGAAAAACAAAATAAAAAGGATAGACTAGGCCTG 
CATTTGACATTATCATGGGCTTTTTTGGGTCTATGAATATGAACATTAACCCC

>G789 Amino Acid Sequence (domain in AA coordinates: 253-313)

MEQVFADWNFEDNFHMSTNKRSIRPEDELVELLWRDGQVVLQSQARREPSVQVQTHKQET

LRKPNNIFLDNQETVQKPNYAALDDQETVSWIQYPPDDVIDPFESEFSSHFFSSIDHLGG 

PEKPRTIEETVKHEAQAMAPPKFRSSVITVGPSHCGSNQSTNIHQATTLPVSMSDRSKNV 

EERLDTSSGGSSGCSYGRNNKETVSGTSVTIDRKRKHVMDADQESVSQSDIGLTSTDDQT

MGNKSSQRSGSTRRSRAAEVHNLSERRRRDRINERMKALQELIPHCSRTDKASILDEAID 

YLKSLQMQLQVMWMGSGMAAAAAAAASPMMFPGVQSSPYINQMAMQSQMQLSQFPVMNRS 

APQNHPGLVCQNPVQLQLQAQNQILSEQLARYMGGIPQMPPAGNQMQTVQQQPADMLGFG 

SPAGPQSQLSAPATTDSLHMGKIG* 

>G2085 (1..930) 

ATGTTTGGTCGCCATTCGATTATCCCAAATAACCAGATTGGTACCGCCTCTGCTTCCGCT 
GGTGAAGACCATGTCTCTGCCTCCGCTACGTCTGGTCACATTCCTTACGACGATATGGAA 
GAAATCCCTCATCCTGACTCTATCTATGGTGCTGCCTCCGATTTGATTCCCGATGGCTCT 
CAATTGGTTGCTCACCGATCCGATGGCTCTGAATTACTTGTTTCTCGGCCACCGGAAGGG 
GCGAATCAGCTTACGATCTCGTTCCGTGGACAAGTTTACGTTTTTGATGCCGTTGGTGCT 
GACAAGGTGGATGCTGTGTTGTCGCTGTTGGGTGGTTCTACTGAGCTTGCTCCTGGTCCG 
CAGGTGATGGAACTAGCTCAACAGCAGAATCATATGCCTGTTGTAGAATATCAGAGCCGC 
TGTAGCCTTCCGCAACGGGCACAATCCTTGGATAGGTTTCGGAAGAAGAGGAATGCTAGA 
TGTTTCGAGAAGAAAGTAAGATACGGTGTTCGCCAAGAAGTTGCCTTAAGAATGGCACGT 
AATAAAGGTCAATTCACCTCTTCAAAGATGACAGATGGGGCTTATAACTCTGGCACAGAT 
CAAGATTCTGCCCAAGATGATGCCCATCCAGAAATATCGTGTACTCATTGCGGCATTAGT 
TCCAAATGTACACCAATGATGCGACGTGGCCCTTCCGGCCCCAGGACTCTCTGCAATGCC 
TGTGGACTTTTTTGGGCTAACAGGGGTACATTGAGGGATCTCTCAAAGAAAACAGAAGAG 
AATCAGTTGGCTTTAATGAAACCGGATGATGGTGGGAGTGTTGCTGATGCTGCTAACAAC 
TTAAACACTGAAGCTGCAAGTGTTGAAGAACACACTTCCATGGTTTCTCTTGCCAATGGG 
GATAATTCTAATCTGTTAGGTGATCACTAA 

>G2085 Amino Acid Sequence (domain in AA coordinates: TBD) 

MFGRHSIIPNNQIGTASASAGEDHVSASATSGHIPYDDMEEIPHPDSIYGAASDLIPDGS 

QLVAHRSDGSELLVSRPPEGANQLTISFRGQVYVFDAVGADKVDAVLSLLGGSTELAPGP

QVMELAQQQiraMPVVEYQSRCSLPQRAQSLDRFRKKRNARCFEKKVRYGTO 

NKGQFTSSKMTDGAYNSGTDQDSAQDDAHPEISCTHCGISSKCTPMMRRGPSGPRTLCNA

CGLFWANRGTLRDLSKKTEENQLJ^MK^^ 

DNSNLLGDH* 

>G1783 (1..603) 

ATGGCCGCGTTTCCGCAGTGGACAAGGGTCGATGACAAACGTTTTGAGTTAGCTCTGCTT 
CAAATCCCGGAGGGTTCGCCGAATTTTATAGAGAATATCGCCTATTATCTCCAGAAACCG 
GTGAAGGAGGTGGAGTACTACTACTGCGCGTTGGTCCATGATATTGAGCGGATCGAATCG 
GGTAAGTATGTTTTGCCCAAATACCCGGAAGACGATTACGTGAAACTGACGGAAGCAGGT 
GAGTCTAAGGGCAATGGGAAAAAGACGGGAATTCCTTGGTCAGAAGAGGAACAGAGGTTG 
TTTCTGGAAGGACTAAATAAGTTTGGGAAAGGAGACTGGAAGAACATATCGAGGTATTGT 
GTGAAGTCAAGGACCTCGACGCAAGTGGCAAGCCATGCTCAGAAGTATTTTGCAAGGCAA 
AAGCAGGAGAGTACGAATACTAAACGCCCGAGTATTCATGACATGACTCTGGGAGTTGCG 
GTCAATGTCCCTGGATCCAACTTGGAGTCTACTGGCCAGCAACCACATTTTGGTGATCAA 
ATTCCTTCGAATCAATATTATCCCTCCCAGGAAAACTTTCGGGGTTTTGATCAGCGATGG 
TGA 

>G1783 Amino Acid Sequence (domain in AA coordinates: 81..129)

MAAFPQWTRVDDKRFELALLQIPEGSPNFIENIAYYLQKPVKEVEYYYCALVHDIERIES

GKYVLPKYPEDDYVKLTEAGESKGNGKKTGIPWSEEEQRLFLEGLNKFGKGDWKNISRYC

VKSRTSTQVASHAQKYFARQKQESTNTKRPSIHDMTLGVAVNVPGSNLESTGQQPHFGDQ

IPSNQYYPSQENFRGFDQRW* 

>G2072 (155..793)

TCGACCCACGCGTCCGCCCACGCGTCCGGATCTTTTCACAGAAGACCZAACCAGCTTGGCT 

CGATGAGCTCCTAAGTGAGCCAGCATCACCTAAGATTAACAAAGGTCATAGACGTTCAGC 

TAGTGACACAGCTGCTTACTTGAACTCAGCTTTAATGCCTTCGAAGGAAAATCATGTTG^ 

TGGTTCGTCTTGGCAGTTCCAGAACTATGATTTGTGGCAGTCCAACTCTTATG 

CAATAAATTAGGATGGGATTTCTCTACAGCAAATGGAACTAATATCCAAAGAAATATGTC 
ATGCGGAGCTTTAAATATGTCGTCGAAACCCATTGAGAAACATGTAAGCAAAATGAAAGA
AGGAACTTCTACAAAACCAGATGGTCCTAGATCAAAGACTGACTCAAAACGTATCAAACA 
TCAAAATGCTCATCGAGCGCGTTTGAGAAGGCTTGAGTACATATCAGACCTTGAAAGGAC 
CATCCAAGTGCTACAAGTTGAAGGATGTGAAATGTCATCTGCCATTCACTACTTGGATCA 
GCAGTTACTCATGCTTAGCATGGAAAATAGAGCTTTAAAACAACGTATGGATAGTTTAGC 
AGAAATCCAAAAGCTTAAACATGTGGAGCAGCAATTGCTTGAGAGAGAGATAGGAAACCT 
ACAGTTTCGACGACACCAACAACAACCACAGCAAAACCAAAAACAAGTCCAAGCAATACA 
AAATCGATACACCAAATATCAACCACCTGTTACACAAGAACCCGATGCCCAATTTGCAGC 
CTTGGCAATATGATTTAGGAAATATGGATACATTGTTCAGATTAAGCTGAGCTCCTCTTG 
CTCTACCTTAATGTCCATACAACATAGGTGAACTTGATGTTTGTAGCCTTGAATGAAAAC 
CTAAAAAAGCATCGTTATGTAAATCAAAATGTGGTTGCCCATATCCTCCTCTATTGCATT 
TCTCTCTATTTATGGCATGGTAGAGAATCTCTTGTCAAGAAACTTCATGTTATGTAATAA 
CTTGTAATCCTTCTTATTTCATCTATTATATATATGAATAAGTAATTTTTTTGCCAAAAA
AAAAAAAAAAAAAAAAAAA 

>G2072 Amino Acid Sequence (conserved domain in AA coordinates: 90-149)

MPSKENHVAGSSWQFQNYDLWQSNSYEQHNKLGWDFSTANGTNIQRNMSCGALNMSSKPI

EKHVSKMKEGTSTKPDGPRSKTDSKRIKHQNAHRARLRRLEYISDLERTIQVLQVEGCEM 

SSAIHYLDQQLLMLSMENRALKQRMDSLAEIQKLKHVEQQLLEREIGNLQFRRHQQQPQQ 

NQKQVQAIQNRYTKYQPPVTQEPDAQFAALAI*

>G931 (85..1071)

GGAGGTTCTTTGACAGACACATGTATCATCAATCTTCTCTGTTGAAGCAGAGAGAGAGAG 
AGCTAATTGTTGCCTCTGAGTCACATGGATAAGAAAGTTTCATTTACTAGCTCTGTGGCA 
CATTCAACTCCACCATACCTTAGTACTTCCATCTCATGGGGACTTCCAACCAAATCCAAT 
GGTGTGACTGAATCACTGAGTTTGAAGGTGGTAGATGCAAGACCAGAACGTCTTATAAAC 
ACAAAGAATATCAGTTTCCAGGACCAGGATTCATCTTCAACTCTGTCCTCTGCTCAATCT 
TCTAACGATGTTACAAGTAGTGGAGATGATAACCCCTCAAGACAAATCTCATTTTTAGCA 
CATTCAGATGTTTGTAAAGGATTTGAAGAAACTCAAAGGAAGCGATTTGCAATTAAATCA 
GGCTCCTCCACGGCAGGAATCGCTGATATTCACTCTTCTCCTTCCAAGGCTAACTTCTCA 
TTTCACTATGCCGATCCACATTTTGGTGGTTTAATGCCTGCGGCTTACCTACCACAGGCA 
ACAATATGGAATCCCCAAATGACTCGAGTTCCGCTACCATTCGATCTCATAGAGAATGAG 
CCTGTCTTTGTCAATGCAAAGCAATTCCATGCAATTATGAGGAGGAGGCAACAGCGTGCT 
AAGCTAGAGGCGCAAAACAAACTAATCAAAGCCCGTAAGCCGTATCTTCATGAATCTCGA 
CATGTTCACGCTCTTAAACGACCTAGAGGATCTGGTGGAAGATTCCTAAACACCAAAAAG 
CTTCAAGAATCTACAGATCCAAAACAAGACATGCCAATCCAACAGCAACACGCAACGGGA 
AACATGTCAAGATTTGTGCTTTATCAGTTGCAGAACAGCAATGACTGTGATTGTTCAACC 
ACTTCTCGCTCTGACATCACATCTGCTTOTGACAGCGTTAATCTCTTTGGACACTCTGAA 
TTTCTGATATCAGATTGCCCATCTCAGACAAACCC^^ 

AATGACATGCATGGAGGTAGGAACACACACCATTTCTCTGTCCATATCTGAGCCGGTGGA 
ATCTGGTAATGTGTACGTTCCTACAAAAAAAGGGAAGTCATCCTTGGCTGCTACTTCGCT 
TATTAGCTAGTTCTTATTTCACACGCTTTGTCCAGATATC 

>G931 Amino Acid Sequence (domain in AA coordinates: TBD) 
MDKKVSFTSSVAHSTPPYLSTSISWGLPTKSNGVTES 

QDSSSTLSSAQSSNDVTSSGDDNPSRQISFLAHSDVCKGFEETQRKRFAIKSGSSTAGIA 
DIHSSPSKANFSFHYADPHFGGLMPAAYLPQATIWNPQMTRVPLPFDLIENEPVFVNAKQ
FHAIMRRRQQRAiCLjEAQNKLIKARKPYLHESRHvl^ 

QDMPIQQQHATGNMSRFVLYQLQNSNDCDCSTTSRSDITSASDSVNLFGHSEFLISDCPS 
QTNPTMYVHGQSNDMHGGRNTHHFSVHI*
>G278 (93..1874)

TCGATCTTTAACCAAATCCAGTTGATAAGGTCTCTTCGTTGATTAGCAGAGATCTCTTTA 
ATTTGTGAATTTCAATTCATCGGAACCTGTTGATGGACACCACCATTGATGGATTCGCCG 
ATTCTTATGAAATCAGCAGCACTAGTTTCGTCGCTACCGATAACACCGACTCCTCTATTG 
TTTATCTGGCCGCCGAACAAGTACTCACCGGACCTGATGTATCTGCTCTGCAATTGCTCT 
CCAACAGCTTCGAATCCGTCTTTGACTCGCCGGATGATTTCTACAGCGACGCTAAGCTTG 
TTCTCTCCGACGGCCGGGAAGTTTCTTTCCACCGGTGCGTTTTGTCAGCGAGAAGCTCTT 
TCTTCAAGAGCGCTTTAGCCGCCGCTAAGAAGGAGAAAGACTCCAACAACACCGCCGCCG 
TGAAGCTCGAGCTTAAGGAGATTGCCAAGGATTACGAAGTCGGTTTCGATTCGGTTGTGA 
CTGTTTTGGCTTATGTTTACAGCAGCAGAGTGAGACCGCCGCCTAAAGGAGTTTCTGAAT 



GCGCAGACGAGAATTGCTGCCACGTGGCTTGCCGGCCGGCGGTGGATTTCATGTTGGAGG 
TTCTCTATTTGGCTTTCATCTTCAAGATCCCTGAATTAATTACTCTCTATCAGAGGCACT 
TATTGGACGTTGTAGACAAAGTTGTTATAGAGGACACATTGGTTATACTCAAGCTTGCTA 
ATATATGTGGTAAAGCTTGTATGAAGCTATTGGATAGATGTAAAGAGATTATTGTCAAGT 
CTAATGTAGATATGGTTAGTCTTGAAAAGTCATTGCCGGAAGAGCTTGTTAAAGAGATAA 
TTGATAGACGTAAAGAGCTTGGTTTGGAGGTACCTAAAGTAAAGAAACATGTCTCGAATG 
TACATAAGGCACTTGACTCGGATGATATTGAGTTAGTCAAGTTGCTTTTGAAAGAGGATC 
ACACCAATCTAGATGATGCGTGTGCTCTTCATTTCGCTGTTGCATATTGCAATGTGAAGA 
CCGCAACAGATCTTTTAAAACTTGATCTTGCCGATGTCAACCATAGGAATCCGAGGGGAT 
ATACGGTGCTTCATGTTGCTGCGATGCGGAAGGAGCCACAATTGATACTATCTCTATTGG 
AAAAAGGTGCAAGTGCATCAGAAGCAACTTTGGAAGGTAGAACCGCACTCATGATCGCAA 
AACAAGCCACTATGGCGGTTGAATGTAATAATATCCCGGAGCAATGCAAGCATTCTCTCA 
AAGGCCGACTATGTGTAGAAATACTAGAGCAAGAAGACAAACGAGAACAAATTCCTAGAG 
ATGTTCCTCCCTCTTTTGCAGTGGCGGCCGATGAATTGAAGATGACGCTGCTCGATCTTG 
AAAATAGAGTTGCACTTGCTCAACGTCTTTTTCCAACGGAAGCACAAGCTGCAATGGAGA 
TCGCCGAAATGAAGGGAACATGTGAGTTCATAGTGACTAGCCTCGAGCCTGACCGTCTCA 
CTGGTACGAAGAGAACATCACCGGGTGTAAAGATAGCACCTTTCAGAATCCTAGAAGAGC 
ATCAAAGTAGACTAAAAGCGCTTTCTAAAACCGTGGAACTCGGGAAACGATTCTTCCCGC 
GCTGTTCGGCAGTGCTCGACCAGATTATGAACTGTGAGGACTTGACTCAACTGGCTTGCG 
GAGAAGACGACACTGCTGAGAAACGACTACAAAAGAAGCAAAGGTACATGGAAATACAAG 
AGACACTAAAGAAGGCCTTTAGTGAGGACAATTTGGAATTAGGAAATTCGTCCCTGACAG 
ATTCGACTTCTTCCACATCGAAATCAACCGGTGGAAAGAGGTCTAACCGTAAACTCTCTC 
ATCGTCGTCGGTGAGACTCTTGCCTCTTAGTGTAATTTTTGCTGTACCATATAATTCTGT 
TTTCATGATGACTGTAACTGTTTATGTCTATCGTTGGCGTCATATAGTTTCGCTCTTCGT 
TTTGCATCCTGTGTATTATTGCTGCAGGTGTGCTTCAAACAAATGTTGTAACAATTTGAA 
CCAATGGTATACAGATTTGTAATATATATTTATGTACATCAACAATAAAAAAAAAAAAAA 
AAAA 

>G278 Amino Acid Sequence (domain in AA coordinates: 2-593) 

MDTTIDGFADSYEISSTSFVATDNTDSSIVYLAAEQVLTGPDVSALQLLSNSFESVFDSP 

DDFYSDAKLVLSDGREVSFHRCVLSARSSFFKSALAAAKKEKDSNNTAAVKLELKEIAKD

YEVGFDSVVTVLAYVYSSRVRPPPKGVSECADENCCHVACRPAVDFMLEVLYLAFIFKIP

ELITLYQRHLLDVVDKVVIEDTLVILKLA^ 

LPEELVKEIIDRRKELGLEVPKVKKHVSNVHKALDSD^ 

FAVAYCNVKTATDLLKLDLADVNHR^ 

EGRTALMIAKQATMAVECNNIPEQCKHSLKGRLCVEILEQEDKREQIPRDVPPSFAVAAD 
ELKMTLLDLENRVALAQRLFPTEAQAAMEIAEMKGTCEFIVTSLEPDRLTGTKRTSPGVK 
IAPFRILEEHQSRLKALSKTVELGKRFFPRCSAVLDQIMNCEDLTQLACGEDDTAEKRLQ 
KKQRYMEIQETLKKAFSEDNLELGNSSLTDSTSSTSKSTGGKRSNRKLSHRRR* 
>G2421 (1..630) 

ATGGAGGGTTCGTCCAAAGGGTTGAGGAAAGGTGCATGGACTGCTGAAGAAGATAGTCTC 

TTGAGGCTIGTGTATTGGTAAGTATGGAGAAGGCAAATGGC^TCAAGTTCCTTTAAGAGCT 

GGGCTAAATCGGTGCAGGAAAAGTTGTAGACTAAGATGGTTAAACTATTTGAAGCCAAGT 

ATCAAGAGAGGAAAATTTAGTTCTGATGAAGTTGATCTTCTTCTTCGTCTTCATAAGCTT 

CTAGGAAATAGGTGGTCCTTGATTGCTGGTCGATTACCTGGTCGGACCGCTAATGATGTC 

AAGAACTACTGGAACACCCATCTGAGTAAGAAGCATGAACCGTGTTGTAAAACTAAGATA 

AAAAGGATAAATATTATAACCCCTCCTAATACACCGGCCCAAAAAGTTTGTGAAAATAGT 

ATCACATGTAACAAAGATGATGAGAAAGATGATTTTGTGGATAATTTTATGGTTGGAGAT 

AATATATGGTTGGAGCGTTTGCTAGACGAGGGCCAAGAGGTAGATGTGCTGGTTACAGAA 

GCGGCGGCAACAGAAAAGGAGGGCACTTTGGCGTTTGACGTTGAGCAACTTTGGAATTTG 

TTCGATGGAGAGACTGTGATCTTTGATTAGTGTTTATAAACGTTTGTGTTCTCTTGTTTG 

TGAGGTTTCTCTATTTAATTTAGTATCTATTT^ 

TTAGGCAAACCTTATGTTTCCGTTTCTGTGCGGCCGCTCTAG 

>G2421 Amino Acid Sequence (domain in AA coordinates: 9-110) 
MEGSSKGLRKGAWTAEEDSLLRQCIGKYGEGKWHQVPLRAGLNRCRKSCRLRWLNYLKPS
IKRGKFSSDEVDLLIiRLHKLLGNRWSLIAGRLPGRT 

KRINIITPPNTPAQKVCENSITCNKDDEKDDFVDNFMVGDNIWLERLLDEGQEVDVLVTE
AAATEKEGTLAFDVEQLWNLFDGETVIFD* 
>G2032 (53..1789)

TCCCTCCCAGAGTAAGAACTTCCATACTTTGCTCTAGATTTCTTGAGAAAAGATGCAGCC 

GATCTTCCATGCGATCCTTAAAAATGACCTTCCAGCTTTTTTAGAGTTGGTAGAAGATAG

TGAATCGTCTCTGGAGGAGAGAAACGAGGAAGAACACTTGAACAACACGGTTTTGCACAT 

GGCTGCAAAGTTTGGTCACCGAGAACTCGTCTCCAAGATTATTGAGCTCCGACCTTCCCT 

CGTGTCTTCCCGCAACGCATACAGAAACACACCTTTGCATCTTGCTGCTATCCTTGGAGA 

TGTAAACATAGTTATGCAGATGTTAGAGACTGGATTGGAAGTGTGTTCTGCACGCAATAT 

CAACAACCACACACCACTCCACTTGGCTTGCCGTAGCAATTCCATAGAGGCTGCCAGACT 

CATCGCGGAAAAGACACAATCAATTGGCCTCGGTGAACTCATTCTCGCCATATCAAGTGG 

ATCCACTAGTATCGTAGGGACTATACTGGAGAGATTCCCAGACCTAGCTAGGGAAGAAGC 

TTGGGTGGTTGAAGACGGCTCACAATCAACGCTACTGCATCATGCGTGTGATAAGGGAGA 

CTTTGAACTGACAACTATATTGTTAGGGCTCGATCAAGGATTAGAAGAAGCACTTAACCC 

CAATGGTTTATCACCTCTGCATCTTGCGGTCCTCAGAGGCTCGGTTGTGATCCTGGAGGA 

GTTCTTGGACAAGGTTCCATTGTCTTTCAGCTCAATCACGCCGTCGAAAGAGACAGTCTT 

TCATCTCGCTGCTCGAAACAAAAATATGGATGCCTTTGTTTTTATGGCAGAGAGTTTGGG 

AATTAACAGCCAAATTCTTCTACAGCAAACCGATGAAAGTGGCAACACTGTCTTACATAT 

TGCTGCATCCGTCTCTTTTGATGCTCCTCTTATACGTTACATTGTTGGTAAGAATATAGT 

AGATATCACGTCCAAGAACAAGATGGGTTTTGAAGCTTTTCAACTTCTCCCTCGAGAAGC 

CCAAGACTTTGAGTTGTTATCAAGGTGGCTGAGATTTGGTACCGAGACTTCACAAGAGCT 

GGATTCTGAGAACAATGTAGAACAACACGAAGGCTCTCAAGAGGTCGAGGTAATACGGTT 

GCTAAGGATTATAGGAATAAACACATCAGAGATAGCAGAGAGAAAGAGAAGCAAGGAACA 

GGAAGTGGAAAGAGGTCGTCAGAACTTGGAATATCAGATGCATATAGAAGCATTACAGAA 

TGCAAGAAATACGATTGCTATAGTGGCAGTCTTGATTGCTTCAGTTGCTTATGCCGGTGG 

GATAAACCCTCCGGGGGGCGTCTACCAAGACGGGCCATGGAGAGGGAAATCCTTAGTGGG 

GAAAACAACGGCGTTTAAGGTCTTTGCGATATGCAACAACATCGCACTGTTCACGTCCTT 

GGGCATCGTTATTCTTCTTGTTAGCATCATACCTTACAAGAGGAAACCCTTAAAGAGATT 

ATTGGTGGCCACGCATAGGATGATGTGGGTTTCTGTAGGTTTCATGGCGACGGCTTATAT 

AGCGGCGTCTTGGGTGACCATACCGCATTATCATGGAACACAATGGTTATTTCCAGCAAT 

TGTAGCCGTTGCTGGTGGAGCGTTGACCGTACTCTTTTTCTATCTCGGAGTTGAGACCAT 

CGGTCATTGGTTTAAGAAGATGAATCGTGTAGGGGATAATATACCTTCCTTTGCAAGAAC 

CAGTTCAGATTTAGCCGTCTCCGGAAAATCAGGCTATTTCACCTATTAAGAAAAACTGGT 

TTTCTAATTTCCCTGTAACCTGTGTAATTGTGTATGTG

>G2032 Amino Acid Sequence (domain in AA coordinates: entire protein) 
MQPIFHAILKNDLPAFLELVEDSESSLEERNEEEHLNNTVLHMAAKFGHRELVSKIIELR
PSLVSSRNAYRNTPLHLAAILGDWIWQMLETGLEV^ 

ARLIAEKTQSIGLGELILAISSGSTSIVGTILERFPDLAREEAWVVEDGSQSTLLHHACD 
KGDFELTTILLGLDQGLEEALNPNGLSPLHLAVLRGSVVILEEFLDKVPLSFSSITPSKE 
TVFHLAARNKNMDAFVFMAESLGINSQILLQQTDESGNTVLHIAASVSFDAPLIRYIVGK
NIVDITSKNKMGFEAFQLLPREAQDFELLSRWLRFGTETSQELDSENNVEQHEGSQEVEV
IRLLRIIGINTSEIAERKRSKEQEVERGRQNLEYQMHIEALQNARNTIAIVAVLIASVAY
AGGINPPGGVYQDGPWRGKSLVGKTTAFKVFAICNNIALFTSLGIVILLVSIIPYKRKPL
KRLLVATHRMMWVSVGFMATAYIAASWVTIPHYHGTQWLFPAIVAVAGGALTVLFFYLGV
ETIGHWFKKMNRVGDNIPSFARTSSDLAVSGKSGYFTY*
>G1396 (83..313)

TCGACCTCGTTTCCTTTCCTCCTCTCTTCCTACCATTAGTACGTTACTGGAGCTGATCTC 

ACGTATATTTTGGATCGTAATCATGGACGGCGAAGATTTTGCCGGAAAGGCGGCTGCTGA 

AGCCAAGGGATTGAACCCGGGATTAATCGTGCTGCTTGTTGTTGGAGGTCCGCTTCTTGT 

GTTCCTAATCGCCAACTACGTGCTTTACGTTTATGCTCAGAAGAACCTACCTCCAAGGAA 

GAAGAAGCCCGTTTCCAAAAAGAAGCTCAAGCGGGAGAAGCTAAAGCAAGGAGTCCCTGT 

CCCTGGAGAATAAAAGCCAGCTTAAGCTTCCTTCACTTGTGCCTCCTTCAAAGCGGTTTT 

TGTTCGGTTACCAAATTT(^CCCTTGCGGGTTTTTTTCTTCCTTT 

ATTATCTTTGAGGCCT 

>G1396 Amino Acid Sequence (domain in AA coordinates: TBD) 
MDGEDFAGKAAAEAKGLNPGLIVLLWGGPLLVFLI 
KLKREKLKQGVPVPGE*
>G619 (382..2748)

ATTTTTTTCCAATCTGCZAT^TTTTAGTCTATGTCTGTTCCTTGTGCTCCCTCTTCTCAGT 
ACCTGCAAATGGAGGAAGAAGAATCCTTCTCTGAAACCCTTGTTCTCATTTGATTCCTCC
TTCCTCTCTCTTCTTCTCTCTCTGTCTCTGATTCGTTATTCCACACTTATGACTCATCTT 
TCCCGTC7UVTAGCTAAGTTTGCCTCTTCTTTGTGAAATTTAGCTGAAAAAGGAGAGGAAT 
TCCGAATTCTGTCACTTCAAAGCTCGAATTTTGCAAACTTTCCTTTGATGGGTTTTACTT 
GTTTTGTTGTAATCTGATTAAAAATAGAAACTTTTTGTTTTCTTCTTGTCTCCTTTTGCT 
CTTAAAAGAGAAGCTTTTTCAATGGAATTTGACTTGAATACTGAGATTGCGGAGGTGGAA 
GAGGAGGAGAATGATGATGTAGGAGTAGGAGTAGGAGGAGGAACAAGAATTGACAAGGGT 
AGGCTTGGAATTTCACCATCTTCTTCTTCTTCATGCTCTTCCGGATCATCATCGTCATCA 
TCTTCTACAGGCTCTGCATCTTCCATTTACTCTGAGCTTTGGCATGCTTGTGCTGGTCCT 
CTCACTTGTCTTCCCAAGAAAGGCAATGTAGTTGTCTATTTCCCTCAAGGTCATTTGGAG 
CAAGATGCTATGGTTTCATATTCGTCTCCTCTTGAAATCCCCAAATTTGACCTTAATCCC 
CAAATCGTCTGCAGGGTGGTTAATGTCCAGTTGCTTGCTAATAAGGACACCGATGAGGTC 
TACACTCAAGTCACTCTGCTTCCACTTCAAGAGTTTTCGATGCTAAATGGGGAGGGGAAA 
GAGGTCAAGGAGTTAGGAGGGGAGGAAGAGAGGAACGGAAGCTCATCCGTCAAGCGGACA 
CCTCATATGTTCTGTAAAACCTTAACAGCGTCTGACACAAGCACACATGGAGGCTTCTCT 
GTACCTAGAAGAGCCGCTGAAGATTGTTTTGCTCCTCTTGACTACAAACAACAGAGGCCA 
TCTCAAGAGCTCATTGCAAAGGACCTCCATGGAGTAGAGTGGAAGTTTCGCCATATCTAT 
AGAGGTCAACCAAGGAGGCATCTACTCACCACTGGTTGGAGTATCTTTGTCAGTCAAAAG 
AATCTCGTCTCTGGTGATGCGGTTCTCTTTCTGAGAGACGAAGGAGGAGAGCTGAGATTA 
GGAATCAGAAGAGCAGCACGGCCAAGAAATGGACTTCCTGACTCAATCATTGAGAAGAAT 
TCATGTTCAAACATTCTGTCTCTTGTGGCTAATGCTGTATCTACAAAAAGCATGTTTCAT 
GTGTTCTACAGTCCACGAGCGACGCATGCAGAGTTTGTGATTCCTTATGAGAAGTATATC 
ACAAGCATCAGGAGTCCTGTTTGCATAGGCACAAGATTTAGAATGCGATTTGAAATGGAC 
GATTCTCCTGAGAGAAGATGCGCTGGTGTAGTGACTGGAGTCTGTGACTTGGACCCGTAT 
AGGTGGCCAAACTCTAAATGGAGGTGCTTGTTGGTGCGATGGGATGAGTCTTTTGTGAGT 
GATCATCAAGAAAGAGTTTCACCTTGGGAGATTGATCCCTCGGTTTCTCTCCCACACTTG 
AGCATTCAGTCATCTCCAAGGCCTAAAAGGCCATGGGCAGGTTTACTGGATACTACCCCA 
CCCGGAAACCCCATAACAAAAAGGGGTGGTTTTTTGGACTTTGAGGAGTCGGTTAGACCC 
TCTAAGGTCTTGCAAGGTCAAGAAAATATAGGTTCTGCATCACCCTCACAGGGGTTTGAT 
GTTATGAACCGCCGGATACTGGATTTTGCGATGCAGTCTCATGCAAATCCAGTCCTTGTG 
TCGAGTAGAGTCAAGGATCGATTTGGTGAGTTTGTAGATGCTACTGGCGTGAACCCAGCT 
TGCTCAGGTGTTATGGACCTGGATAGGTTTCCAAGGGTCTTGCAAGGTCAAGAAATTTGC 
TCGCTTAAATCATTCCCGCAATTTGCTGGTTTCAGTCCAGCTGCTGCTCCTAATCCCTTT 
GCTTACCAAGCCAACAAGTCAAGTTACTATCCGCTAGCTTTGCATGGGATTAGGAGCACT 
CATGTTCCGTATCAGAATCCATACAATGCGGGAAACCAATCCTCGGGTCCCCCTTCACGT 
GCAATAAACTTTGGTGAAGAGACTAGAAAGTTTGATGCACAAAATGAAGGTGGCCTACCA 
AATAATGTTACAGCTGATTTGCCATTCAAGATTGATATGATGGGAAAACAGAAAGGCAGT 
GAGTTGAATATGAATGCTTCATCAGGATGTAAACTTTTCGGATTCTCCTTACCAGTGGAG
ACACCTGCATCTAAGCCGCAAAGCTCGAGCAAAAGAATCTGTACAAAGGTTCACAAGCAA 
GGAAGCCAAGTGGGGAGAGCTATTGATTTGTCGCGACTTAACGGGTATGATGATCTCCTT 
ATGGAGCTTGAACGGCTGTTCAACATGGAAGGGCTTCTCAGGGATCCTGAAAAAGGATGG 
AGGATCTTATATACTGATAGTGAGAACGATATGATGGTCGTTGGCGATGATCCATGGCAT 
GATTTCTGCAATGTGGTGTGGAAGATACACTTATACACGAAAGAGGAAGTGGAGAATGCG 
AATGACGATAACAAGAGTTGTTTAGAGCZAAGCTGCTCTCATGATGGAAGCATCAAAGTCA 
TCTTCTGTGAGCCAGCCTGATTCTTCTCCTACAATCACTAGGGTTTGATACCCATAAAGA 
AGCTTATTTCCTATGTTTTAAAGTGTGTTTTGCT^ 

GTCTTTGAATCCATTTATGTGTTTGTTTGTGTTTCTTCTGGTCTCCATGGATGTCTCATG 
TGTACCGTTTTACTeGAGAGATATGTGAGTTTATGGGATGTGTAAAGCATGCCATTGGAT 
TTTAAGGTTTTCAAAATTACAATATATATATTAGTTTTGAAGTTAAAAAAAAAAAAAAAA 
A 

>G619 Amino Acid Sequence (domain in AA coordinates: 64-406) 
MEFDLNTEIAEVEEEENDDVGVGVGGGTRIDKGRLGISPSSSSSCSSGSSSSSSSTGSAS 
SIYSELWHACAGPLTCLPKKGNVVVYFPQGHLEQDAMVSYSSPLEIPKFDLNPQIVCRVV
NVQLLANKDTDEVYTQVTLLPLQEFSMLNGEGKEVKELGGEEERNGSSSVKRTPHMFCKT
LTASDTSTHGGFSVPRRAAEDCFAPLDYKQQRPSQELIAKDLHGVEWKFRHIYRGQPRRH
LLTTGWSIFVSQKNLVSGDAVLFLRDEGGELRLGIRRAARPRNGLPDSIIEKNSCSNILS
LVANAVSTKSMFHVFYSPRATHAEFVIPYEKYITSIRSPVCIGTRFRMRFEMDDSPERRC
AGVVTGVCDLDPYRWPNSKWRCLLVRWDESFVSDHQERVSPWEIDPSVSLPHLSIQSSPR

PKRPWAGLLDTTPPGNPITKRGGFLDFEESVRPSKVLQGQENIGSASPSQGFDVMNRRIL

DFAMQSHANPVLVSSRVKDRFGEFVDATGVNPACSGVMDLDRFPRVLQGQEICSLKSFPQ

FAGFSPAAAPNPFAYQANKSSYYPLALHGIRSTHVPYQNPYNAGNQSSGPPSRAINFGEE 

TRKFDAQNEGGLPNNVTADLPFKIDMMGKQKGSELNMNASSGCKLFGFSLPVETPASKPQ 

SSSKRICTKVHKQGSQVGRAIDLSRLNGYDDLLMELERLFNMEGLLRDPEKGWRILYTDS

ENDMMVVGDDPWHDFCNVVWKIHLYTKEEVENANDDNKSCLEQAALMMEASKSSSVSQPD

SSPTITRV* 

>G2295 (33..917)

GTAATATATAACAATAACTCAGGTTACAAAGGATGGTTCCGAAAGTGGTCGACCTACAAA 
GGATAGCGAACGATAAGACAAGGATAACAACTTACAAGAAGAGGAAAGCTAGTCTTTACA 
AGAAGGCACAAGAGTTCTCAACTCTCTGCGGCGTCGAGACATGTCTCATCGTCTACGGTC 
CCACGAAGGCTACCGATGTGGTGATTTCCGAGCCAGAGATATGGCCGAAGGACGAGACCA 
AAGTCAGGGCCATCATACGCAAGTACAAAGACACAGTGTCGACCAGCTGCAGGAAAGAAA 
CCAACGTGGAGACTTTCGTCAACGATGTAGGGAAAGGAAACGAGGTGGTGACTAAAAAGA 
GAGTGAAGCGTGAGAATAAGTATTCTAGTTGGGAGGAGAAGCTAGACAAGTGTTCACGAG 
AGCAACTACATGGGATTTTCTGTGCCGTGGATAGCAAGTTAAATGAAGCTGTAACGAGAC 
AGGAGCGTAGTATGTTTAGGGTTAATCATCAAGCCATGGACACACCATTCCCGCAGAATT 
TAATGGACCAACAATTCATGCCACAGTATTTTCATGAGCAGCCACAGTTTCAAGGCTTCC 
CTAATAATTTCAATAATATGGGTTTCTCGTTGATTTCACCTCATGATGGTCAGATTCAAA 
TGGACCCAAATCTCATGGAGAAGTGGACCGACTTGGCTTTGACTCAAAGCTTGATGATGT 
CAAAGGGAAACGATGGTACTCAATTCATGCAGAGGCAAGAACAACCATACTATAATCGTG 
AACAGGTTGTATCGAGGTCTGCAGGTTTCAATGTTAACCCGTTTATGGGATATCAAGTCC 
CGTTTAATATTCCTAATTGGAGATTATCGGGAAATCAAGTTGAAAATTGGGAGCTTTCAG 
GGAAGAAAACGATATGATTTGAATTACGGAGCTTTATTAGTTTTTAGGGTTTTATAGTTT
TG 

>G2295 Amino Acid Sequence (domain in AA coordinates: TBD)
MVPKVVDLQRIANDKTRITTYKKRKASLYKKAQEFSTLCGVETCLIVYGPTKATDVVISE
PEI WPKDETKVI^I IRKYKD WSTSCRKETNVETFVTO^ 

EEKLDKCSREQLHGIFCAVDSKLNEAVTRQERSMFRVNHQAMDTPFPQNLMDQQFMPQYF 
HEQPQFQGFPNNFNNMGFSLISPHDGQIQMDPNL^ 

RQEQPYYNREQVVSRSAGFNVNPFMGYQVPFNIPNWRLSGNQVENWELSGKKTI*
>G312 (1..1755) 

ATGGCTTACATGTGCACTGATAGTGGCAATCTAATGGCTATTGCTCAACAAGTCATCAAA 
CAGAAGCAGCAACAAGAACAACAACAGCAGCAACATCATCAAGACCATCAGATTTTTGGT
ATTAATCCTTTGTCTCTTAACCCATGGCCCAATACTTCCCTCGGGTTTGGGCTTTCAGGT 
TCGGCTTTTCCCGACCCGTTTCAAGTTACCGGCGGCGGAGATTCCAACGATCCTGGCTTT 
CCTTTTCCTAACTTAGACCACCACCACGCCACAACCACCGGCGGTGGGTTCAGGTTATCT 
GATTTCGGCGGTGGAACCGGCGGCGGCGAGTTTGAGTCCGACGAGTGGATGGAGACTCTT 
ATCAGCGGTGGAGACTCCGTTGCAGACGGTCCTGATTGTGACACCTGGCATGATAATCCC
GATTACGTAATCTACGGTCCTGATCCATTCGATACTTACCCGAGTCGACTCAGTGTCCAA 
CCGTCAGATCTAAACCGAGTCATTGACACGTCGAGTCCGCTTCCTCCGCCGACCTTGTGG 
CCTCCTTCTTCGCCATTATCGATTCCTCCGCTTACTCATGAGTCACCAACCAAAGAAGAT 
CCAGAGACTAACGACTCCGAAGACGATGACTTCGACCTAGAACCACCTCTCCTCAAAGCT 
ATATACGACTGTGCACGGATCTCAGACTCTGACCCTAACGAAGCTTCCAAGACGCTTCTT 
CAGATCCGAGAATCTGTATCGGAGCTAGGTGATCCGACGGAGCGAGTTGCATTTTACTTC 
ACGGAAGCTCTCTCCAACAGACTGTCTCCTAATTCGCCGGCGACGTCGTCTTCTTCTTCA 
TCTACGGAGGATTTAATCTTATCTTATAAAACCCTAAACGACGCTTGTCCTTACTCCAAA 
TTCGCACATTTGACGGCGAATCAAGCGATTCTAGAAGCGACGGAGAAGTCGAACAAGATT 
CACATCGTCGATTTTGGAATCGTTCAAGGTATACAATGGCCTGCTCTTCTTCAAGCTCTA 
GCTACTCGTACTTCTGGTAAACCCACTCAAATCCGGGTCTCGGGTATACCCGCTCCATCT 
CTCGGTGAATCTCCGGAACCGTCGTTAATCGCCACCGGAAACCGCCTCCGTGATTTCGCC 
AAGGTTCTGGATCTGAATTTCGATTTCATCCCAATTCTCACTCCCATACATTTACTTAAC 
GGGTCAAGTTTCCGGGTCGACCCGGATGAAGTACTGGCCGTGAATTTCATGCTCCAGCTC 
TACAAATTACTCGACGAGACGCCGACGATAGTTGACACCGCACTACGGCTCGCCAAATCG
TTGAACCCGAGGGTCGTCACTCTCGGAGAATACGAAGTGAGCTTAAACCGGGTCGGTTTC 
GCTAACCGGGTAAAGAACGCGCTTCAATTCTATTCCGCGGTTTTCGAATCCCTTGAACCG
AACTTGGGGCGTGATTCGGAGGAGAGAGTGAGAGTTGAGCGAGAGTTGTTCGGCCGGAGA 

ATCTCGGGTTTGATTGGACCGGAGAAAACCGGAATTCATAGAGAAAGAATGGAAGAGAAA 

GAGCAATGGCGGGTATTAATGGAGAATGCCGGTTTTGAATCGGTTAAGCTGAGTAATTAC 

GCAGTGAGCCAAGCGAAGATTCTATTGTGGAATTACAATTACAGCAATTTGTATTCAATT 

GTTGAATCTAAGCCTGGCTTCATCTCTTTGGCCTGGAACGATTTACCTCTCCTCACTCTT 
TCTTCCTGGCGATAA

>G312 Amino Acid Sequence (domain in AA coordinates: 320-336) 

MAYMCTDSGNLMAIAQQVIKQKQQQEQQQQQHHQDHQIFGINPLSLNPWPNTSLGFGLSG 

SAFPDPFQVTGGGDSNDPGFPFPNLDHHHATTTGGGFRLSDFGGGTGGGEFESDEWMETL 

ISGGDSVADGPDCDTWHDNPDYVIYGPDPFDTYPSRLSVQPSDLNRVIDTSSPLPPPTLW 

PPSSPLSIPPLTHESPTKEDPETNDSEDDDFDLEPPLLKAIYDCARISDSDPNEASKTLL

QIRESVSELGDPTERVAFYFTEALSNRLSPNSPATSSSSSSTEDLILSYKTLNDACPYSK 

FAHLTANQAILEATEKSNKIHIVDFGIVQGIQWPALLQALATRTSGKPTQIRVSGIPAPS 

LGESPEPSLIATGNRLRDFAKVLDLNFDFIPILTPIHLLNGSSFRVDPDEVLAVNFMLQL 

YKLLDETPTIVDTALRLAKSLNPRVVTLGEYEVSLNRVGFANRVKNALQFYSAVFESLEP

NLGRDSEERVRVERELFGRRISGLIGPEKTGIHRERMEEKEQWRVLMENAGFESVKLSNY 

AVSQAKILLWNYNYSNLYSIVESKPGFISLAWNDLPLLTLSSWR*

>G1444 (192..1001)

AATCCCCTATCCTTCGCAAGACCCTTCCTCTATATAAGGAAGTTCATTTCATTTTTTTTT 

GACACGCTGACAAGCTGACTCTAGCATATCTGGCACCGGCGACCAGTCCTTCTTTGGTGC 

AAAGATCCCAAAAAATCAAAATCGAAAGAGAGAATAAATCAAAAGGAAGAATCTTTATCT 

GCTTTCTCTCGATGAGGATCCGGAAACGACAAGTGCCTCTTCCTTTATCGTCTCTATTAC

CAGTTCCTCTATCAGATCTCTACTTTAACCGCTCACCGACGGCCACCGCGAGATACTTTC 

GCGGTGGTTATAAAGACGGCGGTGATGATTTTGGTTCTCTTCAGCTTTCGCTTCCGCCGC 

CGTCGCAGATTTCTGATCGGCTTATTCAAAGAGATTTGATAAAGAAGAAGGAGGAGGTCA 

AGGCTTTGGATGATGATAATGGTGATGTAGACGTCAAGAGTCGTACTGATGCATCGGGCA 

GCAAGAATGTTAATCCCCGAGGAGAATCCGTCTCTTCAATACAAGTTGTCGAGAAGAATG 

AAAAGGTTGTGTCTTTGAGGAAGAGAAGAGGCTTTATCAACTTTGAGGATTACGAAGATG 

AGGAAGATGAAGAAGCTAGTGGCGGTGGAGGCCGTATTAATAAAGGGAAAAAGAAAGCGA 

AAAAGAGCGGTGGTGGGTTAGAGGAAGGATCACGGTGCAGCCGTGTTAACGGTAGAGGAT 

GGAGATGTTGTCAGCAAACGCTTGTTGGTTATTCTCTTTGTGAGCATCATCTCGGTAAAG 

GAAGGGTAAGGAGCATGAACAAGAGTGGTGGTGGTCGTGGCGGCGAGAAAAAGGCGGTGG 

TGGTGGAAGTGAAGAAGAAGAGAGTAAAGCTTGGCATGGTAAAGGCACGTTCAATAAGTA 

GTTTGCTTGGACAAACCAGCACTAGTGGTGGTACTAGTGGTGATGTTGATCAGGGTGAGA 

TAAGTGCACCTGCTGATCAGTTCGCTGCATGTGATAAGTAGGTCTGTTGATCAGCATTTG 

CATGTATATGGATATGTGTATGTTTATGTACATGATGATAATGGGCATAGCGCGGCCGCT 

CTAGACAGGCCTGGAACCGGATCCTCTAGCTAGAGCTTTCGTTAGTATCATCGGGTTTAG 
ACAACGTT 

>G1444 Amino Acid Sequence (domain in AA coordinates: 168-193) 
MRIRKRQVPLPLSSLLPVPLSDLYFNRSPTATARYFRGGYKDGGDDFGSLQLSLPPPSQI 
SDRLIQRDLIKKKEEVKALDDDNGDVDVKSRTDASGSKNVNPRGESVSSIQVVEKNEKVV
SLRKRRGFINFEDYEDEEDEEASGGGGRINKGKKKAKKSGGGLEEGSRCSRVNGRGWRCC
QQTLVGYSLCEHHLGKGRVRSMNKSGGGRGGEKKAVVVEVKKKRVKLGMVKARSISSLLG
QTSTSGGTSGDVDQGEISAPADQFAACDK*
>G801 (27..746)

GATAGTGATAACGAAATCCTAATTCCATGGCCGACAACGACGGAGCAGTGAGTAACGGCA 
TCATAGTCGAGCAGACGTCAAACAAAGGACCTCTTAACGCCGTTAAGAAACCACCGTCTA 
AAGATCGACACAGCAAAGTTGACGGAAGAGGAAGAAGGATTCGTATGCCAATCATTTGCG
CAGCTCGAGTTTTTCAATTGACCAGAGAGTTAGGTCACAAGTCCGATGGTCAAACCATAG 
AGTGGCTTCTCCGTCAAGCTGAGCCTTCTATCATAGCCGCCACTGGAACTGGCACTACTC 
CGGCGAGTTTCTCCACTGCTTCTCTCTCCACTTCTTCTCCGTTTACTCTCGGGAAACGTG 
TCGTCAGAGCGGAGGAAGGAGAATCCGGCGGCGGAGGAGGAGGAGGGTTAACAGTGGGAC 
ACACAATGGGGACTTCGTTAATGGGTGGTGGTGGTTCTGGTGGGTTTTGGGCTCTTCCGG 
CGAGGCCGGATTTCGGACAAGTCTGGAGCTTTGCAACCGGAGCTCCACCGGAAATGGTTT 
TTGCGCAGCAGCAGCAACCAGCTACACTCTTCGTCCGCCACCAGCAGCAACAGCAAGCTT
CCGCCGCCGCAGCAGCTGCNATGGGTGAGGCTTCAGCAGCTAGAGTTGGGAATTATCTTC
CGGGTCATCATCTCAATTTGCTTGCTTCTTTGTCTGGTGGAGCTAACGGGTCGGGTCGGA
GGGAAGACGACCACGAACCACGTTGAGAAATGGTATTGTCTTTTTGGTAATGTATAGAAA

AATTCCTATGTTTTATGTCATCGAAAGTGTTTAGAAAGTACCTCTAATTTGCGGTTTCTT 

TTGCTCCTTTTTTACTTAATTTAAGCTTATTGCTTGTTTGATTAGGGTTTTAGGGTTTAA 

GAATATTTGGTCTCGTTAATTTGTTTCGGAGAGTGATAGAAAGAGAGAGAGATTGATTGA 

TTGTTGTACCTAAAACGCTATAAAAGCTCTGTTTTTACTAGCGAAAAAA 

>G801 Amino Acid Sequence (domain in AA coordinates: 32-93) 

MADNDGAVSNGIIVEQTSNKGPLNAVKKPPSKDRHSKVDGRGRRIRMPIICAARVFQLTR 

ELGHKSDGQTIEWLLRQAEPSIIAATGTGTTPASFSTASLSTSSPFTLGKRVVRAEEGES

GGGGGGGLTVGHTMGTSLMGGGGSGGFWAVPARPDFGQVWSFATGAPPEMVFAQQQQPAT

LFVRHQQQQQASAAAAAAMGEASAARVGNYLPGHHLNLLASLSGGANGSGRREDDHEPR*

>G1950 (42..764)

CTGAATTCGAACTTTGGAAGAAGAAGAAGCTTTGATCAATCATGGAAATTGCAACCGATA 
CAGCAAAGCAGATGAGAGACGAAGAGTTGTTCAAAGCAGCGGAATGGGGAGATTCATCGT 
TGTTCATGTCATTATCTGAAGAACAGCTCTCTAAATCTCTCAATTTCAGAAACGAAGATG 
GTCGCTCTCTCCTCCATGTCGCTGCTTCCTTCGGCCATTCTCAAATAGTGAAGTTGTTAT 
CAAGTTCAGATGAAGCAAAGACTGTAATCAATAGCAAGGATGATGAAGGATGGGCTCCTT 
TGCATTCCGCTGCTAGCATCGGTAATGCTGAGCTCGTTGAGGTGCTTTTGACCAGAGGTG 
CTGATGTCAATGCCAAAAATAACGGTGGTCGCACTGCTCTTCACTATGCTGCTAGCAAAG 
GCCGGTTGGAGATTGCTCAGCTTTTATTAACACACGGTGCAAAGATTAACATCACAGACA 
AGGTTGGTTGCACTCCGCTTCACAGGGCAGCAAGCGTGGGAAAGTTAGAAGTTTGTGAAT 
TTCTTATTGAAGAAGGAGCAGAGATCGATGCTACGGATAAAATGGGTCAAACTGCACTCA 
TGCATTCAGTTATCTGCGATGACAAACAGGTTGCGTTCCTGCTTATAAGACATGGTGCAG 
ATGTGGATGTAGAAGACAAGGAAGGCTACACTGTTCTAGGCCGAGCTACCAATGAATTCC
GACCTGCACTTATCGATGCTGCTAAGGCCATGCTTGAAGGATAAAATGACTCTGGATTAC 
TTTAAAACTTACTAACTCTGAGAGTTGTTTAGTTACTTAAAAGGATTTTTCTTTACTGTA 
TCATGTTTGCAAAATGTTTCTGCCTTATCAATTCATGTTCTGT 

>G1950 Amino Acid Sequence (domain in AA coordinates: 65-228) 
MEIATDTAKQMRDEELFKAAEWGDSSLFMSLSEEQLSKSLNFRNEDGRSLLHVAASFGHS 
QIVKLLSSSDEAKTVINSKDDEGWAPLHSAASIGNAELVEVLLTRGADVNAKNNGGRTAL
HYAASKGRLEIAQLLLTHGAKINITDKVGCTPLHRAASVGKLEVCEFLIEEGAEIDATDK
MGQTALMHSVICDDKQVAFLLIRHGADVDVEDKEGYTVLGRATNEFRPALIDAAKAMLEG
*

>G958 (55..1950)

CGTCGACATGTTCATATTTGTTTCTAGCTAAGAAGTTTGTATAAGGCAGTGGACATGGCT 
CCTGTTTCAATGCCTCCAGGTTTCCGGTTTCATCCAACAGACGAAGAGCTTGTCATATAC 
TACCTCAAGCGAAAGATTAATGGTCGGACTATTGAGTTAGAGATAATACCCGAGATTGAT 
CTTTACAAATGCGAACCTTGGGATTTACCTGGGAAGTCCTTGCTGCCAAGTAAAGACCTA 
GAATGGTTCTTTTTCAGTCCTCGAGACCGGAAATATCCAAACGGATCAAGAACAAACCGG 
GCGACCAAAGCAGGTTACTGGAAAGCCACCGGGAAAGATCGTAAAGTGACTTCACATTCA
CGGATGGTTGGAACAAAGAAAACATTAGTTTATTACCGAGGAAGAGCGCCTCATGGCTCT 
CGTACCGATTGGGTCATGCACGAGTACCGTCTTGAAGAACAAGAATGTGACTCTAAATCC 
GGTATACAGGATGCCTATGCACTTTGTCGAGTATTTAAGAAGAGTGCTTTAGCCAACAAA 
ATTGAAGAACAACACCATGGTACGAAGAAGAACAAAGGAACGACTAATAGTGAACAATCT 
ACTTCTAGTACTTGTTTGTATTCTGATGGAATGTATGAAAACCTCGAAAACTCGGGGTAT 
CCAGTCTCACCTGAGACAGGAGGCTTAACTCAACTCGGTAATAATTCGTCGTCGGATATG 
GAAACGATAGAGAATAAATGGAGTCAGTTTATGTCGCATGACACGTCCTTCAACTTCCCA 
CCTCAGTCTCAATATGGAACAATCTCATATCCTCCCTCGAAGGTTGATATAGCGTTAGAG 
TGTGCAAGACTACANAATCGTATGTTGCCACCAGTACCACCACTTTACGTAGAAGGTCTC
ACACACAATGAATATTTTGGAAACAATGTAGCTAACGATACAGATGAAATGTTGAGCAAG 
ATTATAGCATTGGCTCAAGCCTCACATGAGCCACGAAACAGTCTAGACTCATGGGACGGT 
GGTTCTGCTTCCGGGAACTTCCATGGAGACTTTAACTATTCCGGAGAAAAAGTCTCATGC 
CTAGAGGCGAACGTGGAGGCTGTAGATATGCAAGAACACCATGTGAATTTTAAGGAAGAA 
AGACTTGTTGAAAACTTGAGATGGGTAGGAGTATCAAGCAAGGAACTTGAAAAGAGCTTC 
GTTGAAGAACACTCAACGGTAATTCCTATAGAAGATATTTGGAGATATCATAATGATAAT 
CAAGAACAAGAACATCATGATCAAGATGGTATGGACGTTAACAACAACAATGGAGATGTG 
GATGATGCTTTCACACTCGAGTTTTCGGAAAACGAACATAACGAGAATCTTTTGGACAAG 
AACGATCATGAGACAACGAGTTCCTCATGTTTTGAGGTGGTAAAAAAAGTTGAGGTTAGC

CTATTGATGCGTTGTGTTCATCGAGGTAACTCTAACAAAAACAGAGGCAGTGAAGGTTAC 

Sssssssssasss^S 

TTTCACTTTTCTATTGTACTCCCATTTGCCTAGGTCGTATGC ^TATATGGCT 



EQGFRFQDSFVLKKLGLSLAIILAVSTISLI^ 
>G1037 (1..1722) 



c s a ^ gaggag j tc ^^ 

TCTGATCCGAACAATGGGAAAGGTAATAGAAAACGTAAAGATCAGTATAATGAAGATrar 

a^gagctgc^taagaaatitgttgcagctcttaaccaatoS 

TCCGCCTTTACTTGAAGAGG atcagtggtgtggctaatcagcaa 


CG^^CAGGACAGCAGXTGAATAMGGTTrGGAGGiTSSS^SJS?^ 



CT 



gtctctccattaccgcattctagacccgaccccttggaatggaacaItctc^ 
tactctataccattctgtgactctgccaatacattgag^ctc^S
SHLQKFRLYLKRISGVANQQAIMANSELHFMQMNGLDGFHHRPIPVGSGQYHGGAPAMRS

FPPNGILGRLNSSSGIGVRSLSSPPAGMFLQNQTDIGKFHHVSSLPLNHSDGGNILQGLP 

MPLEFDQLQTNNNKSRNMNSNKSIAGTSMAFPSFSTQQNSLISAPNNNVVVLEGHPQATP 

PGFPGHQINKRLEHWSNAVSSSTHPPPPAHNSNSINHQFDVSPLPHSRPDPLEWNNVSSS 

YSIPFCDSANTLSSPALDTTNPRAFCRNTDFDSNTNVQPGVFYGPSTDAMALLSSSNPKE 

GFWGQQKLQSGGFMVADAGSLDDIVNSTMKQV*

>G2065 (33..1124)

AACCACACAAAACAAAACAAAAAAACATATTGATGGGGATGAAGAAGGTAAAGCTATCTT 

TGATAGCTAATGAAAGATCAAGGAAAACATCCTTCATGAAGAGGAAAAACGGGATATTCA 

AGAAACTCCACGAGTTGTCAACTCTATGTGGTGTCCAAGCTTGTGCTCTCATCTATAGTC 

CATTCATACCGGTTCCAGAGTCATGGCCGTCAAGGGAAGGTGCTAAAAAGGTAGCTTCAA 

AGTTTCTGGAGATGCCGCGGACAGCCCGAACCAGGAAGATGATGGATCAAGAAACCCATC 

TTATGGAGAGGATTACCAAAGCAAAAGAGCAACTAAAGAATTTGGCTGCTGAGAACCGAG 

AATTACAGGTTAGACGATTTATGTTTGATTGTGTTGAAGGCAAAATGTCCCAGTATCGTT 

ATGATGCAAAAGACCTTCAAGATTTGCTATCTTGTATGAATCTATATCTCGATCAGCTTA 

ACGGAAGGATCGAGTCCATTAAAGAAAACGGTGAGTCGTTGTTGTCTTCCGTCTCTCCTT 

TTCCTACTAGAATTGGTGTTGACGAAATTGGTGATGAGTCGTTTTCCGACTCTCCTATTC 

ATTCTACAACTAGGGTTGTAGATACTCCTAATGCTACCAATCCTCATGTTCTTGCGGGCG 

ATATGACTCCTTTTCTTGATGCGGACGCAAATGCGGTAACTGCTCCCAGTCGATTTTCTG 

ATCATATTCAATATGAAAATATGAATATGAGTCAAAATCTGCATGAACCGTTTCAACACC 

TTGTTCCTACTAACGTTTGTGATTTTTATCAAAATCAGAATATGAATCAGGTTCAATACC 

AGGCTCCTAATAATCTGTTTAATCAGATTCAACGAGAATTCTACAACATAAATTTGAATC 

TGAATTTGAATCTGAATTCAAATCAGTATCTGAATCAACAACAATCATTCATGAATCCGA 

TGGTGGAACAACATATGAATCATGTTGGAGGGCGTGAAAGCATTCCTTTCGTGGACAGAA 

ACTACTACAACTACAATCAACTACCAGCCGTTGATCTTGCTTCCACCAGTTACATGCCTT

CAACCACCGATGTTTATGATCCTTACATCAACAACAATCTCTAATCACAAAAGACGGAGA 

TTTTGTAGTTTAA 

>G2065 Amino Acid Sequence (domain in AA coordinates: TBD) 
MGMKKVKLSLIANERSRKTSFMKRKNGIFKKLHELSTLCGVQACALIYSPFIPVPESWPS
REGAKKVASKFLEMPRTARTRKMMDQETHLMERITKAKEQLKNLAAENRELQVRRFMFDC
VEGKMSQYRYDAKDLQDLLSCMNLYLDQLNGRIESIKENGESLLSSVSPFPTRIGVDEIG
DESFSDSPIHSTTRVVDTPNATNPHVLAGDMTPFLDADANAVTAPSRFSDHIQYENMNMS
QNLHEPFQHLVPTNVCDFYQNQNMNQVQYQAPNNLFNQIQREFYNINLNLNLNLNSNQYL
NQQQSFMNPMVEQHMNHVGGRESIPFVDRNYYNYNQLPAVDLASTSYMPSTTDVYDPYIN
NNL*

>G2137 (77..1123)

GGGATTTGACTTTAGCACTTCAAAATCCAAAGCTAAAAGACAAAAAAGAATAGAGGTTCG 

ATTTGCATCTCCATTAATGGGCATCGATCTTTCTCTTAAGCTCGAGGCCGAGGAGAAAAA 

GAAAGAGATAGAAGGATCGAAACATAGCCGTGAGAACAAAGAAGACGAAGAACATGATGC 

TAGTGGTGATGAAGATGAACAAATGGTGAAAGAAGACGAAGATGATTCTTCTTCTTTAGG 

TTTAAGAACCCGAGAAGAAGAAAACGAACGTGAAGAGCTCTTGCAGCTACAGATCCAGAT 

GGAAAGTGTGAAAGAAGAGAATACTAGGTTGAGGAAGCTTGTCGAGCAGACTCTTGAAGA 

TTATCGTCATCTTGAGATGAAATTCCCGGTTATCGATAAAACCAAGAAGATGGATCTTGA 

AATGTTCCTTGGAGTACAAGGCAAACGATGTGTGGATATAACAAGTAAGGCTCGGAAAAG 

AGGAGCTGAGAGATCTCCGTCAATGGAAAGAGAAATAGGGCTTTCACTTTCTCTAGAGAA 

AAAACAGAAACAAGAAGAGAGCAAAGAAGCTGTTCAGTCTCATCACCAAAGATACAATAG 

TAGCAGCTTAGATATGAATATGCCACGTATCATTTCATCTTCTCAAGGTAATAGAAAGGC 

CAGGGTGTCCGTGAGGGCGAGATGTGAGACCGCAACAATGAATGATGGATGCCAATGGAG

GAAGTACGGTCAGAAAACCGCGAAAGGGAATCCATGTCCTCGAGCTTATTACCGATGCAC 

CGTGGCTCCAGGATGTCCCGTTAGAAAACAGGTGCAAAGGTGTTTAGAAGACATGTCAAT 

ACTGATAACAACCTACGAAGGAACACATAACCATCCACTTCCGGTCGGAGCAACAGCCAT 

GGCTTCCACTGCCTCTACTTCTCCATTCTTGTTACTCGATTCCAGTGACAACCTCTCTCA 

TCCTTCCTATTACCAAACTCCTCAAGCCATAGACTCTTCTTTGATTACATACCCACAAAA 

TAGCAGCTACTNCAATCGAACCATAAGAAGCTTGAACTTTGATGGTCCATCTAGAGGAGA

TCACGTTTCATCTTCTCAAAACCGATTAAATTGGATGATGTAGAGTTTCCTATATCTCTA 

TGCTTGTTCTTTGGTCCCATTATTTGTCATTATGGATTCTTTGCCTTTCTTCTTGTTCTC 

GTTTCTAACATTTATGTTTCGTATA
>G2137 Amino Acid Sequence (conserved domain in AA coordinates: 109-168)

MGIDLSLKLEAEEKKKEIEGSKHSRENKEDEEHDASGDEDEQMVKEDEDDSSSLGLRTRE 

EENEREELLQLQIQMESVKEENTRLRKLVEQTLEDYRHLEMKFPVIDKTKKMDLEMFLGV 

QGKRCVDITSKARKRGAERSPSMEREIGLSLSLEKKQKQEESKEAVQSHHQRYNSSSLDM 

NMPRIISSSQGNRKARVSVRARCETATMNDGCQWRKYGQKTAKGNPCPRAYYRCTVAPGC 

PVRKQVQRCLEDMSILITTYEGTHNHPLPVGATAMASTASTSPFLLLDSSDNLSHPSYYQ 

TPQAIDSSLITYPQNSSYNWRTIRSLNFDGPSRGDHVSSSQNRLNWMM* 

>G746 (1..1311) 

ATGGGTGAGGAGTTAGCTGACACAATGAACCTGGATTTGAATCTTGGGCCTGGTCCTGAG 

TCTGATCTCCAACCTGCACCAAACGAGACTGTGAATTTGGCTGATTGGACTAATGACCCG 

CCTGAGAGATCTTCTGAAGCTGTGACAAGGATCAGGACTCGGCATAGGACACGGTTCAGA 

CAGCTTAATCTCCCGATCCCGGTTCTATCTGAAACCCATACCATGGCTATAGAGCTCAAC 

CAGTTGATGGGAAATTCTGTAAATAGAGCTGCTATGCAGACTGGTGAGGGTAGTGAAAGA 

GGCAATGAGGATTTGAAAATGTGTGAGAATGGCGATGGAGCCCTTGGGGACGGTGTATTG 

GATAAGAAAGCGGATGTCGAGAAAAGCAGTGGCAGCGACGGTAACTTTTTCGATTGTAAT 

ATATGTTTGGATTTGTCGAAGGAGCCGGTTCTCACCTGTTGTGGTCATCTTTACTGTTGG 

CCTTGTCTGTACCAATGGTTACAAATTTCGGATGCAAAGGAATGTCCTGTTTGTAAAGGA 

GAGGTGACCTCCAAAACCGTGACACCGATCTATGGACGTGGAAACCACAAGAGAGAAATT 

GAAGAGAGTTTAGATACTAAGGTCCCCATGAGACCACACGCGAGACGCATTGAGAGCTTG 

AGGAATACAATTCAAAGGTCGCCTTTTACAATACCAATGGAAGAAATGATTAGACGTATA 

CAGAATAGGTTTGACAGGGATTCAACCCCAGTCCCTGATTTTAGTAACCGAGAGGCATCA 

GAAAGAGTCAACGATCGAGCCAATTCGATCCTTAACCGGTTGATGACATCTAGGGGAGTT 

AGATCAGAGCAGAACCAGGCTAGTGCTGCAGCAGCAGCCATTGTCGCAGCATCAGAGGAT

ATTGATCTAAATCCAAACATTGCTCCTGATCTTGAAGGAGAAAGCAACACGAGATTCCAT 

CCTCTCTTGATCAGGAGACAGTTACAGTCGCACCGAGTTGCAAGGATCTCGACTTTCACT 

TCTGCGTTGAGTTCAGCTGAGAGGCTTGTGGATGCGTATTTTAGGACTCATCCGTTGGGG 

AGGAACCACCAAGAGCAAAACCATCATGCTCCTGTTGTGGTTGATGATAGAGACTCATTC 

TCAAGCATTGCAGCTGTTATAAACTCTGAGAGTCAAGTGGATACTGCAGTTGAGATCGAT 

TCTATGGCTCTTTCGACATCGTCCTCGAGGAGAAGGAATGAGAATGGTTCGAGGGTTTCT 

GATGTAGACAGTGCAGATTCTCGTCCGCCTAGGAGAAGGAGATTTACTTGA 

>G746 Amino Acid Sequence (domain in AA coordinates: 139-178) 

MGEELADTMNLDLNLGPGPESDLQPAPNETVNLADWTNDPPERSSEAVTRIRTRHRTRFR

QLNLPIPVLSETHTMAIELNQLMGNSVNRAAMQTGEGSERGNEDLKMCENGDGALGDGVL

DKKADVEKSSGSDGNFFDCNICLDLSKEPVLTCCGHLYCWPCLYQWLQISDAKECPVCKG

EVTSKTVTPIYGRGNHKREIEESLDTKVPMRPHARRIESLRNTIQRSPFTIPMEEMIRRI

QNRFDRDSTPVPDFSNREASERVNDRANSILNRLMTSRGVRSEQNQASAAAAAIVAASED

IDLNPNIAPDLEGESNTRFHPLLIRRQLQSHRVARISTFTSALSSAERLVDAYFRTHPLG

RNHQEQNHHAPVVVDDRDSFSSIAAVINSESQVDTAVEIDSMALSTSSSRRRNENGSRVS

DVDSADSRPPRRRRFT*

>G2701 (46..837)

GTGTNTGTAGTTGAAACTTATTCTTCCCTTTTTTTGTTTTTAGGTATGGAGACTCTGCAT
CCATTCTCTCACCTACCTATCTCTGACCACCGGTTCGTTGTTCAAGAGATGGTGAGCTTA 
CACAGCTCGAGTAGCGGTAGCTGGACTAAAGAAGAGAACAAGATGTTCGAACGAGCTCTT 
GCGATATACGCTGAAGACTCGCCTGATCGCTGGTTTAAAGTTGCTTCCATGATCCCTGGA 
AAGACTGTTTTTGATGTTATGAAGCAATATAGTAAGCTTGAAGAAGACGTTTTCGATATT 
GAAGCAGGACGTGTTCCCATTCCTGGTTATCCTGCAGCTTCTTCTCCCTTGGGGTTTGAC 
ACGGACATGTGTCGTAAACGGCCTAGTGGAGCTAGAGGATCTGATCAAGATCGAAAGAAA 
GGAGTCCCTTGGACAGAGGAAGAACACAGGAGATTCTTGTTAGGCCTTCTCAAGTACGGT 
AAAGGAGATTGGAGAAACATATCGAGAAACTTCGTGGTGTCAAAGACGCCAACGCAAGTG 
GCGAGCCACGCCCAAAAGTATTACCAGAGACAGCTCTCCGGAGCCAAGGACAAACGCAGG 
CCAAGTATCCATGACATCACAACCGGCAATCTTCTCAATGCCAATCTCAACCGTTCCTTT 
TCCGATCATAGAGATATTCTCCCTGATTTAGGGTTTATCGATAAGGATGATACGGAGGAG 
GGAGTAATATTTATGGGTCAGAATCTCTCTTCAGAAAATCTGTTTTCTCCATCACCAACT 
TCATTCGAAGCTGCCATTAACTTCGCCGGAGAAAATGTCTTCAGTGCCGGAGCTTAAGGC 
AACATAGAATCCCCAAACTCAGCGGC 

>G2701 Amino Acid Sequence (domain in AA coordinates: 33-81, 129-183) 
METLHPFSHLPISDHRFVVQEMVSLHSSSSGSWTKEENKMFERALAIYAEDSPDRWFKVA
SMIPGKTVFDVMKQYSKLEEDVFDIEAGRVPIPGYPAASSPLGFDTDMCRKRPSGARGSD
QDRKKGVPWTEEEHRRFLLGLLKYGKGDWRNISRNFVVSKTPTQVASHAQKYYQRQLSGA
KDKRRPSIHDITTGNLLNANLNRSFSDHRDILPDLGFIDKDDTEEGVIFMGQNLSSENLF
SPSPTSFEAAINFAGENVFSAGA*
>G1819 (1..639)

ATGGAAGAGAACAACGGCAACAACAACCACTACCTGCCGCAACCATCGTCTTCCCAACTG 
CCGCCGCCACCATTGTATTATCAATCAATGCCGTTGCCGTCATATTCACTGCCGCTGCCG 
TACTCACCGCAGATGCGGAATTATTGGATTGCGCAGATGGGAAACGCAACTGATGTTAAG 
CATCATGCGTTTCCACTAACCAGGATAAAGAAAATCATGAAGTCCAACCCGGAAGTGAAC 
ATGGTCACTGCAGAGGCTCCGGTCCTTATATCGAAGGCCTGTGAGATGCTCATTCTTGAT 
CTCACAATGCGATCGTGGCTTCATACCGTGGAGGGCGGTCGCCAAACTCTCAAGAGATCC 
GATACGCTCACGAGATCCGATATCTCCGCCGCAACGACTCGTAGTTTCAAATTTACCTTC 
CTTGGCGACGTTGTCCCAAGAGACCCTTCCGTCGTNACCGATGATCCCGTGCTACATCCG
GACGGTGAAGTACTTCCTCCGGGAACGGTGATAGGATATCCGGTGTTTGATTGTAATGGT 
GTGTACGCGTCACCGCCACAGATGCAGGAGTGGCCGGCGGTGCCTGGTGACGGAGAGGAG 
GCAGCTGGGGAAATTGGAGGAAGCAGCGGCGGTAATTGA 

>G1819 Amino Acid Sequence (domain in AA coordinates: 46-188) 

MEENNGNNNHYLPQPSSSQLPPPPLYYQSMPLPSYSLPLPYSPQMRNYWIAQMGNATDVK

HHAFPLTRIKKIMKSNPEVNMVTAEAPVLISKACEMLILDLTMRSWLHTVEGGRQTLKRS 

DTLTRSDISAATTRSFKFTFLGDVVPRDPSVVTDDPVLHPDGEVLPPGTVIGYPVFDCNG

VYASPPQMQEWPAVPGDGEEAAGEIGGSSGGN*

>G1227 (372..1451)

TCTTCCGTGTGTTAACAGAAGTCCCCACAATTGTCTGTCTTCGCTGCGAGACAAAACTGC 

CACAGCCAATAATGTTTCTCTGAGGGACCTTGCTTCTGTCAGAGACTCGCTCTCTCTCTC 

CTCTTCTTGCTCTGCTCAGCTCTCTCACCAACTCATCTTCAGTCCTCAAACAAACATCTG 

TTCTCATCTTTGTTTTCTTTCCTTTCTTTCTCATATCTCATTTTCAATTTTCCCAATTTC 

TCTTCAACATCTTCATAGCAATTTAAGACCACTATTCCATTATAAAGCTAACTGCTTTAG 

AAACTCCTCACATTTATTTCTTCCCCATCATTGTTTTAGAGAGGGAGAAAGAAAAAGAGC

TCAGCTTTCTGATGGAGAGGAGTATTCAAGGACAAAACAAGCTCTGTTGTTTGGACCAAA 

AAGTGAATGTGAGAAGAAGCCTACAAGTTCAAGAAACTGTAGAGGATCATCAAAGCTTTG 

CCCTTGAAGAGGAAGAACAACAACTCTCAACTCCGAGCTTGCTGCAAGACACAACAATAC 

CATTTCTACAAATGCTGCAACAAAGTGAAGACCCTTCACCGTTTTTGTCATTCAAAGACC 

CAAGCTTTCTAGCACTACTATCTCTCCAGAGACTTGAAAAGCCTTGGGAACTCGAAAACT 

ACCTCCCACATGAAGTTCCAGAGTTTCATTCACCGATCCATTCTGAAACCAACCACTACT 

ATCATAATCCATCTTTGGAAGGAGTCAATGAAGCCATCTCAAACCAAGAACTTCCATTCA 

ACCCACTAGAGAATGCGCGTTCAAGACGCAAGCGGAAAAACAACAACTTGGCATCATTGA 

TGACAAGAGAAAAGCGAAAGAGAAGAAGAACTAAACCAACAAAGAACATAGAAGAGATAG 

AGAGTCAAAGAATGACACACATTGCGGTTGAACGAAACCGCAGACGCCAAATGAACGTTC 

ATCTGAACTCACTCCGCTCCATCATTCCATCTTCATACATCCAGAGGGGAGACCAAGCGT 

CAATAGTAGGAGGAGCAATAGACTTCGTAAAGATCCTAGAGCAACAGTTGCAATCCCTTG 

AAGCACAAAAGAGAAGTCAACAGAGTGATGATAACAAAGAGCAAATTCCAGAAGATAACA

GTCTCAGGAACATTTCGTCGAACAAGTTGCGTGCGAGTAATAAAGAAGAACAAAGTAGCA 

AACTCAAAATCGAAGCCACAGTGATAGAGAGTCACGTCAACCTAAAAATTCAATGTACGA 

GGAAACAAGGACAACTTCTCAGATCAATCATATTGCTGGAGAAACTTCGATTCACTGTTC 

TTCATCTCAACATCACATCTCCGACCAATACATCTGTCTCTTATTCCTTCAACCTCAAGA 

TGGAAGATGAATGTAATTTGGGATCAGCGGATGAGATAACGGCGGCGATTCGTCAGATTT 

TCGACAGCTGATTGACTAATCCAAGTAAAAAGTAAAATAAAAAAAGAAACGTTTACTTTG 

GTAACTTCGTTTTCATGATTAAATTCTTTATTTGGTCGTATGTGATTGGAGTCTTCTCGG 

CATGGAACTTGACTTTGGTTTTAGGGTACTAGTCTCTACAGAAGCTGTGGTCCTTCTTTG 

GATGC 

>G1227 Amino Acid Sequence (domain in AA coordinates: 183-244) 

MERSIQGQNKLCCLDQKVNVRRSLQVQETVEDHQSFALEEEEQQLSTPSLLQDTTIPFLQ 

MLQQSEDPSPFLSFKDPSFLALLSLQRLEKPWELENYLPHEVPEFHSPIHSETNHYYHNP

SLEGVNEAISNQELPFNPLENARSRRKRKNNNLASLMTREKRKRRRTKPTKNIEEIESQR
MTHIAVERNRRRQMNVHLNSLRSIIPSSYIQRGDQASIVGGAIDFVKILEQQLQSLEAQK
RSQQSDDNKEQIPEDNSLRNISSNKLRASNKEEQSSKLKIEATVIESHVNLKIQCTRKQG
QLLRSIILLEKLRFTVLHLNITSPTNTSVSYSFNLKMEDECNLGSADEITAAIRQIFDS*
>G2417 (118..1311)

CATACCGGTGGAAGATTCTGCTTTACTACGCTCTCCGCTTCTTCTTCTCCTCGATTCGAT 
TCTCCTCATGGGTTTATCATGAATTTTTAGGTTTTGAGTAATTCAGAAACTCGAGTGATG 
ATCCCGAATGATGATGATGATGCAAATTCTATGAAGAATTATCCGTTAAATGATGATGAT 
GCAAATTCTATGAAGAATTATCCGTTAAATGATGATGATGCAAATTCTATGGAGAATTAT 
CCGTTAAGGTCAATTCCGACGGAGCTTTCACACACTTGTTCATTGATACCACCTTCTTTA 
CCAAACCCTTCAGAAGCAGCAGCAGACATGTCCTTCAATTCAGAACTCAATCAAATCATG 
GCAAGGCCTTGTGATATGCTCCCTGCCAATGGTGGAGCTGTTGGTCATAACCCTTTTTTG 
GAACCAGGATTCAACTGCCCCGAGACAACAGATTGGATTCCCTCTCCACTCCCCCATATT 
TATTTTCCTTCGGGTTCTCCCAATCTAATAATGGAGGATGGTGTCATTGATGAGATTCAC 
AAACAAAGTGACTTGCCACTTTGGTATGACGACTTGATTACCACTGATGAAGATCCACTC 
ATGTCTAGTATCTTGGGCGATCTTCTCCTTGACACTAATTTCAACTCAGCTTCAAAGGTG 
CAGCAACCAAGTATGCAATCGCAGATTCAACAACCCCAAGCTGTTCTGCAGCAGCCTTCT 
TCTTGTGTGGAATTGCGCCCACTTGATAGGACAGTATCCTCAAACAGCAACAACAATAGC 
AACAGTAATAATGCAGCAGCAGCAGCTAAGGGACGTATGCGTTGGACGCCTGAACTTCAT 
GAGGTTTTTGTTGACGCTGTTAACCAGCTCGGTGGCAGTAATGAAGCAACTCCTAAAGGT 
GTCCTGAAGCATATGAAAGTCGAAGGTTTGACTATTTTTCATGTCAAAAGTCATTTGCAG 
AAATATAGAACAGCTAAATATATACCAGTACCATCAGAAGGTTCGCCGGAGGCAAGGTTG 
ACACCGCTTGAGCAAATTACATCTGATGATACGAAACGTGGGATAGATATCACTGAGACT 
CTGCGAATTCAGATGGAACATCAGAAGAAACTGCATGAGCAGCTTGAGAGTCTAAGAACA 
ATGCAACTTCGGATAGAAGAGCAAGGAAAGGCGCTGTTGATGATGATTGAGAAGCAAAAT 
ATGGGTTTCGGCGGACCAGAACAAGGAGAGAAAACAAGTGCGAAAACGCCTGAAAATGGT 
TCAGAGGAGTCGGAATCCCCGCGGCCAAAGCGTCCGAGAAATGAAGAATGAAGGAAACCT 
TTCTTCGGATGGTAGATCATAAAACTGTGGTTTTGGTGGAGTTGTAGAGTATGACTTATT 
AGGAGTAGAGCTTTCAGTCTTCTTCAGGC 

>G2417 Amino Acid Sequence (domain in AA coordinates: 235-285) 
MIPNDDDDANSMKNYPLNDDDANSMKNYPLNDDDANSMENYPLRSIPTELSHTCSLIPPS

LPNPSEAAADMSFNSELNQIMARPCDMLPANGGAVGHNPFLEPGFNCPETTDWIPSPLPH

IYFPSGSPNLIMEDGVIDEIHKQSDLPLWYDDLITTDEDPLMSSILGDLLLDTNFNSASK

VQQPSMQSQIQQPQAVLQQPSSCVELRPLDRTVSSNSNNNSNSNNAAAAAKGRMRWTPEL

HEVFVDAVNQLGGSNEATPKGVLKHMKVEGLTIFHVKSHLQKYRTAKYIPVPSEGSPEAR

LTPLEQITSDDTKRGIDITETLRIQMEHQKKLHEQLESLRTMQLRIEEQGKALLMMIEKQ 

NMGFGGPEQGEKTSAKTPENGSEESESPRPKRPRNEE*

>G2116 (104..1117)

TTCATCTCCATC^TTATCTCCATTGACATTGTTCTCAATTGCGAATAATAATCAT^ 

TTCACACAACCAAAGCATTCATCTCTCAGATTCTCTTAAAAAAATGGAGAAATCAGATCC 

TCCACCAGTCCCAAAGCCCGGCGCCACTATTATCCCCTCCTCCGATCCAATTCCTAATGC 

CGATCCGATTCCATCTTCTTCCTTCCACCGCCGATCTCGCTCCGACGATATGTCCATGTT 

CATGTTCATGGATCCCCTCTCCTCCGCCGCACCACCTTCCTCCGACGACCTTCCCTCCGA 

CGACGATCTCTTCTCTTCTTTCATCGATGTCGATAGCCTCACCTCTAATCCCAATCCCTT 

TCAAAATCGTTCCCTCTCCTCCAACTCCGTTTCCGGCGCTGCTAATCCTCCTCCTCCTCC 

TTCCTCTCGTCCTCGCCACCGTCACAGCAATTCCGTTGACGCTGGATGCGCCATGTATGC 

CGGTGATATCATGGACGCTAAGAAAGCTATGCCTCCTGAAAAACTCTCTGAGCTTTGGAA 

CATCGATCCCAAACGCGCCAARAGGATTCTAGCGAATCGACAATCTGCAGCTCGATCCAA

AGAGAGAAAAGCTCGATACATTCAAGAACTTGAGCGCAAAGTTCAATCTCTTCAAACCGA 

AGCTACCACTCTCTCTGCTCAGCTTACTCTCTACCAGAGAGACACAAATGGACTAGCAAA 

CGAAAACACAGAGCTGAAACTTAGGTTGCAAGCAATGGAACAACAAGCTCAGCTTCGTAA 

TGCTTTAAACGAAGCGTTGAGGAAAGAAGTTGAAAGGATGAAGATGGAGACAGGAGAAAT

CTCTGGTAATTCTGATTCGTTTGATATGGGAATGCAGCAGATTCAGTATTCTTCCTCAAC

TTTCATGGCTATTCCACCATATCATGGCTCAATGAACCTCCATGATATGCAGATGCATNN

TAGTTTCAATCCTATGGAGATGTCCAATTCTCAAAGCGTGTCGGACTTTCTACAGAACGG 

CCGAATGCAAGGGCTGGAGATTAGTAGCAATAGCTCAAGCTTAGTCAAATCTGAAGGACC 

TTCTCTCTCTGCTAGTGAGAGTAGCTCTGCCTATTGACGACAAGATTATGATGAGGCTCA
TTTTTCTG

>G2116 Amino Acid Sequence (conserved domain in AA coordinates: 150-210)

MEKSDPPPVPKPGATIIPSSDPIPNADPIPSSSFHRRSRSDDMSMFMFMDPLSSAAPPSS 

DDLPSDDDLFSSFIDVDSLTSNPNPFQNPSLSSNSVSGAANPPPPPSSRPRHRHSNSVDA
GCAMYAGDIMDAKKAMPPEKLSELWNIDPKRAKRILANRQSAARSKERKARYIQELERKV
QSLQTEATTLSAQLTLYQRDTNGLANENTELKLRLQAMEQQAQLRNALNEALRKEVERMK
METGEISGNSDSFDMGMQQIQYSSSTFMAIPPYHGSMNLHDMQMHSSFNPMEMSNSQSVS
DFLQNGRMQGLEISSNSSSLVKSEGPSLSASESSSAY*
>G647 (1..948) 

ATGATGATCGGCGAAAATAAAAACCGGCCACATCCAACGATCCATATCCCTCAATGGGAT 

CAAATCAACGATCCAACGGCCACAATCTCTTCACCATTCTCTTCCGTCAACCTTAACAGC 

GTTAACGACTACCCACACTCTCCGTCACCGTATCTCGACTCCTTCGCTTCTCTCTTCCGT 

TACCTCCCGTCAAACGAGTTAACAAACGATTCAGACTCATCAAGTGGCGACGAGTCATCA 

CCACTCACCGACTCATTCTCCTCCGACGAGTTTCGCATCTACGAGTTCAAAATCCGGCGA 

TGCGCTCGAGGTCGATCTCATGATTGGACGGAGTGTCCGTTCGCACATCCCGGAGAAAAA 

GCTCGACGACGTGATCCGAGAAAGTTTCATTACTCCGGCACCGCTTGTCCTGAGTTTCGT

AAAGGAAGTTGTAGAAGAGGTGATTCGTGTGAGTTCTCTCATGGAGTTTTCGAGTGTTGG 

CTCCATCCTTCTCGTTACCGTACTCAGCCGTGTAAAGACGGAACTAGCTGCCGGAGAAGA 

ATCTGTTTCTTCGCTCATACGACGGAGCAGTTACGTGTATTACCTTGTTCGTTAGATCCA 

GATCTTGGATTCTTCTCAGGATTAGCTACTTCTCCGACTTCGATTCTTGTTTCTCCTTCG 

TTTTCACCACCGTCGGAATCTCCGCCGCTTTCTCCGAGTACCGGTGAACTTATTGCGTCG 

ATGAGGAAAATGCAATTGAACGGAGGTGGTTGTTCGTGGAGTTCTCCGATGAGATCTGCA 

GTTAGGTTACCTTTTTCGTCGTCTCTGCGTCCGATTCAGGCGGCAACGTGGCCGAGGATA 

AGAGAGTTTGAGATCGAAGAAGCTCCGGCGATGGAATTTGTGGAATCTGGGAAAGAGCTG 

AGAGCGGAGATGTATGCAAGACTCAGTAGAGAGAACTCACTCGGTTGA

>G647 Amino Acid Sequence (domain in AA coordinates: 77-192)

MMIGENKNRPHPTIHIPQWDQINDPTATISSPFSSVNLNSVNDYPHSPSPYLDSFASLFR 

YLPSNELTNDSDSSSGDESSPLTDSFSSDEFRIYEFKIRRCARGRSHDWTECPFAHPGEK 

ARRRDPRKFHYSGTACPEFRKGSCRRGDSCEFSHGVFECWLHPSRYRTQPCKDGTSCRRR 

ICFFAHTTEQLRVLPCSLDPDLGFFSGLATSPTSILVSPSFSPPSESPPLSPSTGELIAS

MRKMQLNGGGCSWSSPMRSAVRLPFSSSLRPIQAATWPRIREFEIEEAPAMEFVESGKEL 

RAEMYARLSRENSLG*
>G974 (377..1162)

AAAAAAAAAGTTGATATACTTTCTGGTTTTCTCCTTAACTTTTATTCTTTACAAATCCAT 

CCCCCTTAGATCTGTTTATTTCCCGCTACTTTGATTCATTTCTGTTAGTAATCTGTCTTT 

CGTATAGAAGAAAACTGATTTCTTGGTTTGTATTTTCTTAAAGAGATCAATCTTTTTTTA 

TTTTTGATCTTCTTGTGTTTTTTTTTCTTTGTAGAATTAATCGTTTGTGAGGGTATTTTT 

TTAATTCCCTCCTCTCAGAAATCTACACAGAGGTTTTTTATTTTATAAACCTCTTTTTCG 

ATTTTCTTGAAAACAAAAAATCCTGTTCTTTACTTTTTTTACAAGAACAAGGGAAAAAAA 

TTTCTTTTTATTAGAAATGACAACTTCTATGGATTTTTACAGTAACAAAACGTTTCAACA

ATCTGATCCATTCGGTGGTGAATTAATGGAAGCGCTTTTACCTTTTATCAAAAGCCCTTC 

CAACGATTCATCCGCGTTTGCGTTCTCTCTACCCGCTCCAATTTCATACGGGTCGGATCT 

CCACTCATTTTCTCACCATCTTAGTCCTAAACCGGTCTCAATGAAACAAACCGGTACTTC 

CGCGGCTAAACCGACGAAGCTATACAGAGGAGTGAGACAACGTCACTGGGGAAAATGGGT 

GGCTGAGATTCGTTTACCGAGGAATCGAACTCGACTTTGGCTCGGAACATTCGACACGGC 

GGAGGAAGCTGCTTTAGCTTATGACAAGGCGGCGTATAAGCTCCGAGGAGATTTTGCGCG 

GCTTAATTTCCCTGATCTCCGTCATAACGACGAGTATCAACCTCTTCAATCATCAGTCGA 

CGCTAAGCTTGAAGCTATTTGTCAAAACTTAGCTGAGACGACGCAGAAACAGGTGAGATC 

AACGAAGAAGTCTTCTTCTCGGAAACGTTCATCAACCGTCGCAGTGAAACTACCGGAGGA 

GGACTACTCTAGCGCCGGATCTTCGCCGCTGTTAACGGAGAGTTATGGATCTGGTGGATC 

TTCTTCGCCGTTGTCGGAGCTGACGTTTGGTGATACGGAGGAGGAGATTCAGCCGCCGTG 

GAACGAGAACGCGTTGGAGAAGTATCCGTCGTACGAGATCGATTGGGATTCGATTCTTCA 

GTGTTCGAGTCTTGTAAATTAGATGTTGCCATAGGGGTATTTTAGGGACTTTAGAGCTCT 

CTGCGATGGAGTTTTTGGTCATTGCNGAGATTTTATTATTATTAAGGGGGTTTGTTATGT

TT^ATATCAAATAAGTTTATCTACTTTGATGTTAATTAGTGTTAATCTCTGCGTCGGTCCA 

AGCTGTTTTTTTTTGGCATGCTTCGACCGTGTGAGATO 
CTTGATTTTCTTAGTTCAAGTTAAATTGGCACAAAAAAAAAAAAAAAAAA 

>G974 Amino Acid Sequence (domain in AA coordinates: 81-140) 
MTTSMDFYSNKTFQQSDPFGGELMEALLPFIKSPSNDSSAFAFSLPAPISYGSDLHSFSH 
HLSPKPVSMKQTGTSAAKPTKLYRGVRQRHWGKWVAEIRLPRNRTRLWLGTFDTAEEAAL 
AYDKAAYKLRGDFARLNFPDLRHNDEYQPLQSSVDAKLEAICQNLAETTQKQVRSTKKSS
SRKRSSTVAVKLPEEDYSSAGSSPLLTESYGSGGSSSPLSELTFGDTEEEIQPPWNENAL

EKYPSYEIDWDSILQCSSLVN* 

>G1419 (27..692) 

GAAGACTCCAACATAATTCATCATCTATGGCTTCTTCACATCAACAACAGCAAGAACAAG 
ACCAGTCAGCTTTAGATCTCATAACCCAACACCTTCTTACTGATTTCCCTTCCTTAGACA 
CCTTTGCCTCCACCATCCACCACTGCACCACCTCAACTCTAAGCCAACGCAAACCACCTC 
TTGCCACTATAGCAGTTCCTACTACTGCACCGGTGGTTCAAGAGAATGATCAAAGGCATT 
ACAGAGGCGTCAGGAGAAGACCATGGGGTAAGTATGCGGCTGAGATCAGAGACCCAAACA 
AGAAAGGTGTTCGTGTCTGGTTAGGCACTTTTGACACAGCCATGGAAGCTGCAAGAGGTT 
ATGACAAGGCAGCTTTTAAACTACGAGGAAGCAAAGCTATTCTTAACTTCCCACTTGAAG 
CAGGAAAGCATGAGGACTTGGGAGACAACAAGAAGACTATTTCTTTAAAAGCAAAGAGGA 

AGAGACAGGTGACGGAGGATGAAAGCCAGCTGATCAGCCGTAAAGCTGTTAAGAGGGAAG

AAGCTCAGGTTCAGGCTGATGCTTGTCCATTAACGCCATCAAGTTGGAAGGGGTTTTGGG 
ACGGAGCAGACAGTAAAGACATGGGAATATTTTCCGTGCCTCTGTTATCTCCTTGTCCAT 
CTCTTGGACACTCTCAACTCGTAGTTACTTAAGCTTCAGAGGGTCAAACTGGAAAAAATC 
AACATTGGATTGTTTTCAAAGCTTCTAGATTAGCTGATTGTAAAAAAATGTTTTACTATA 
TTCATTCATTCTTCTTAAATGCAATTCTTTCTACCCTTCC 

>G1419 Amino Acid Sequence (domain in AA coordinates: 69-137) 

MASSHQQQQEQDQSALDLITQHLLTDFPSLDTFASTIHHCTTSTLSQRKPPLATIAVPTT 

APVVQENDQRHYRGVRRRPWGKYAAEIRDPNKKGVRVWLGTFDTAMEAARGYDKAAFKLR

GSKAILNFPLEAGKHEDLGDNKKTISLKAKRKRQVTEDESQLISRKAVKREEAQVQADAC

PLTPSSWKGFWDGADSKDMGIFSVPLLSPCPSLGHSQLVVT*

>G1634 (22..855)

TTATCTCGTAGCCTTTAAACGATGGAGACTCTGCATCCACTACTCTCGCACGTGCCAACT 

TCTGACCACCGGTTTGTAGTTCAAGAGATGATGTGCTTGCAAAGCTCGAGCTGGACTAAA 

GAAGAGAACAAGAAGTTTGAGCGAGCTCTTGCTGTCTACGCTGATGACACGCCTGATCGC 

TGGTTCAAAGTTGCTGCTATGATCCCTGGAAAGACCATATCAGATGTCATGAGGCAATAC 

TCTAAGCTTGAAGAAGACCTCTTCGATATCGAAGCAGGACTTGTCCCGATCCCGGGTTAC 

CGTTCAGTTACTCCTTGTGGATTTGATCAGGTTGTGAGTCCACGTGACTTTGATGCGTAT 

CGTAAACTTCCTAATGGAGCCAGAGGATTTGATCAAGACCGTAGGAAAGGAGTTCCATGG 

ACGGAGGAAGAACACAGGAGATTCTTGTTAGGGCTTCTCAAGTATGGGAAAGGAGATTGG 

AGAAACATATCGAGGAACTTTGTGGGATCAAAAACACCAACTCAGGTTGCAAGTCATGCC 

CAAAAGTACTACCAAAGACAGCTTTCCGGTGCGAAAGACAAACGACGGCCTAGCATTCAC 

GACATCACCACCGTCAATCTTCTCAATGCCAATCTTAGCCGTCCATCGTCTGATCACGGT 

TGCTTAGTCTCAAAACAGGCCGAGCCGAAACTAGGGTTCACCGACAGGGATAATGCAGAG 

GAGGGAGTTATGTTTCTTGGTCAGAATCTATCCTCGGTCTTCTCTTCCTACGATCCTGCC 

ATTAAGTTTTCCGGAGCAAATGTTTACGGTGAAGGAGGTTACTGTATCTCACAAGATCTT 

GAAACGAGAAAATGAGAATTTTGAAATTTTAACTATTGCAACGAAACCATAATTGC

>G1634 Amino Acid Sequence (domain in AA coordinates: 129-180)

METLHPLLSHVPTSDHRFVVQEMMCLQSSSWTKEENKKFERALAVYADDTPDRWFKVAAM

IPGKTISDVMRQYSKLEEDLFDIEAGLVPIPGYRSVTPCGFDQVVSPRDFDAYRKLPNGA

RGFDQDRRKGVPWTEEEHRRFLLGLLKYGKGDWRNISRNFVGSKTPTQVASHAQKYYQRQ

LSGAKDKRRPSIHDITTVNLLNANLSRPSSDHGCLVSKQAEPKLGFTDRDNAEEGVMFLG

QNLSSVFSSYDPAIKFSGANVYGEGGYCISQDLETRK*

>G1637 (1..954) 

ATGGTGAAGGAGACGGTGACGGTGGCGAAAACGTGCTCACACTGTGGCCATAATGGCCAT 

AACGCACGGACTTGTCNCAACGGCGTTAATAAGGCAAGTGTTAAACTGTTCGGCGTTAAT

ATATCGTCTGATCCAATTAGGCCGCCTGAGGTAACGGCGTTAAGGAAGAGTCTTAGTTTG

GGAAACCTTGATGCTCTTCTCGCTAACGATGAAAGTAACGGTAGCGGTGATCCTATCGCC 

GCCGTTGATGATACCGGTTATCATTCCGATGGTCAGATTCATTCCAAGAAGGGTAAAACT

GCTCATGAGAAGAAAAAGGGGAAGCCATGGACGGAAGAAGAACATCGTAATTTCTTAATC 

GGTTTAAACAAACTCGGAAAAGGAGATTGGAGAGGCATTGCAAAGAGTTTCGTGTCGACN

AGAACACCAACACAAGTCGCAAGTCATGCTCAGAAATATTTTATTAGGTTA 

GACAAGAGAAAAAGACGTGCTAGTCTCTTTGACATCTCTCTCGAAGATCAGAAGGAGAAA 

GAGAGGAACTCTCAAGATGCTTCAACAAAGACTCCACCTAAACAACCAATAACCGGAATT 

CAAC^CCGGTAGTACAAGGTCATACTCAAACCGAGAT^ 

TCAATGGAGTATATGCCAATCTACCAACCCATACCACCTTACTACAACTTTCCACCTATT
ATGTACCATCCAAATTATCCAATGTACTATGCCAACCCTCAAGTACCGGTTAGGTTTGTT

CATCCTTCTGGTATACCTGTTCCAAGACATATACCGATTGGTTTGCCTCTGTCTCAACCG 

AGTGAAGCTTCTAATATGACAAATAAAGACGGTTTGGATCTTCATATCGGTTTGCCTCCA 

CAAGCTACTGGAGCTTCTGACTTGACTGGTCATGGCGTTATTCATGTGAAATGA

>G1637 Amino Acid Sequence (domain in AA coordinates: 109-173)

MVKETVTVAKTCSHCGHNGHNARTCXNGVNKASVKLFGVNISSDPIRPPEVTALRKSLSL

GNLDALLANDESNGSGDPIAAVDDTGYHSDGQIHSKKGKTAHEKKKGKPWTEEEHRNFLI

GLNKLGKGDWRGIAKSFVSTRTPTQVASHAQKYFIRLNVNDKRKRRASLFDISLEDQKEK

ERNSQDASTKTPPKQPITGIQQPVVQGHTQTEISNRFQNLSMEYMPIYQPIPPYYNFPPI

MYHPNYPMYYANPQVPVRFVHPSGIPVPRHIPIGLPLSQPSEASNMTNKDGLDLHIGLPP

QATGASDLTGHGVIHVK*

>G1818 (601..1161)

TAACAAATCAAATAATTAGAGAAATAACCAAAATTTAACTTTTAGAGGGACTACAGGATT 

TGTACTTTGTACATTCATATATTATTGTTATATATCGTTTCATACATTAATTTGAACCAA 

TGTAAATTAAGTAAAATTCAATTTAACATCATGAGCAAATTCTTATTAAAATTCTCTTAA 

AATTTTGAGCAAATTATGCTTTCACATTTAACATTTGAAAACATCATTTTTAACAAGATA 

TTCAAAACTAAGTTTTGTACAGCAAAATTTTAACTTTCAATTTTATAGAGAAAAAGGTAT 

TTTTTTTTTTGTTTCATTTTTATAAGACTATTATTTGGTATATAATATACACTTTAAGTA 

AAAACAAATCTCTTTCTTTTTTCTTCTTATAATACCAACCACAAGTCTGTCAGTCACACA 

CATACAGTTAATAACATTAAATATTCTTAACAAACTACTAAATAGGTTGAGATTCATATA 

TGTAAAGAGATCACTTCTTAATCTTATCCTACCATATCTTATATACGCTTAATTTTCCTT 

TATATATGCAAACCTCCACATAAAAATATCTCAAACCCAAACACTTCAAACAAAAAAAAA 

ATGGAGAACAACAACAACAACCACCAACAGCCACCGAAAGATAACGAGCAACTAAAGAGT 

TTCTGGTCAAAGGGGATGGAAGGTGACTTGAATGTCAAGAATCACGAGTTCCCCATCTCT 

CGTATCAAGAGGATAATGAAGTTTGATCCGGATGTGAGTATGATCGCTGCTGAGGCTCCA 

AATCTCTTATCTAAGGCTTGTGAAATGTTTGTCATGGACCTCACGATGCGTTCATGGCTC 

CATGCTCAAGAGAGCAACCGACTCACGATACGGAAATCTGATGTTGATGCCGTAGTGTCT 

CAAACCGTCATCTTTGATTTCTTGCGTGATGATGTCCCTAAGGACGAGGGAGAGCCCGTT 

GTCGCCGCTGCTGATCCTGTGGACGATGTTGCTGATCATGTGGCTGTGCCAGATCTTAAC 

AATGAAGAACTGCCGCCGGGAACGGTGATAGGAACTCCGGTTTGTTACGGTTTAGGAATA 

CACGCGCCACACCCGCAGATGCCTGGAGCTTGGACCGAGGAGGATGCGACTGGGGCAAAT 

GGAGGAAACGGTGGGAATTNNTATTTGGATTGGGTTTTGTAACCGCTGTTGTGAGAACTT

GAATTTCTTTTTGAGTTCTGCTTATGTTTTCAATGTTATGTTTTTTAGTTGTTGAATGTA 

TTTCTGTTGTTTTGTCCAAAAAAAAAAAAGAATGTATTTCTGTTGTTGTCTTTCAAATGA 

ATCTAATGGTTTATGAATATTGGCTTTAGATTAATTTATGCATACAAAAACACAAGGATT 

ACGGATAAAAAAGTCCTCAGTTTACCCATGGAAACATAATCTTCTAGTGATTCCTTATGA 

GAGTAGAAAAGAATCATATATTATAATCTATTTCATAAGAGATAGGGTACTGTAAACAAG 

GATGTTTATTCGGCTATTTCTTTTTTTTTTAATCACTTTTACTTGTCAAGACTCTTTTGT 

GTTTGCAGCTTTTTGTTAGATTACATTCTAGAGGCAACAAGATCCAGAGATCTAGCAAAA 

AAAACTTATTTTGAAACCTGAATCTATTTTAAAAATTTTCCAACTCATTTTTCGTTCTNN

TTCTTTGTTTTCCAACGGAATTTGGCGCACAAACGATTTATTTGAATTTTGTCTTTCAAG 

>G1818 Amino Acid Sequence (domain in AA coordinates: 36-113) 

MENNNNNHQQPPKDNEQLKSFWSKGMEGDLNVKNHEFPISRIKRIMKFDPDVSMIAAEAP

NLLSKACEMFVMDLTMRSWLHAQESNRLTIRKSDVDAVVSQTVIFDFLRDDVPKDEGEPV
VAAADPVDDVADHVAVPDLNNEELPPGTVIGTPVCYGLGIHAPHPQMPGAWTEEDATGAN
GGNGGN*

>G1820 (1..609) 

ATGGCTGAGAACAAGAACAACAACGGCGACAACATGAACAACGACAACCACCAGCAACCA 
CCGTCGTACTCGCAGCTGCCGCCGATGGCATCATCCAACCCTCAGTTACGTAATTACTGG 
ATTGAGCAGATGGAAACCGTCTCGGATTTCAAAAACCGTCAGCTTCCATTGGCTCGAATT 
AAGAAGATCATGAAGGCTGATCCAGATGTGCACATGGTCTCCGCAGAGGCTCCGATCATC 
TTCGCAAAGGCTTGCGAAATGTTCATCGTTGATCTCACGATGCGGTCGTGGCTCAAAGCC 
GAGGAGAACAAACGCCACACGCTTCAGAAATCGGATATCTCCAACGCAGTGGCTAGCTCT 
TTCACCTACGATTTCCTTCTTGATGTTGTCCCTAAGGACGAGTCTATCGCCACCGCTGAT 
CCTGGCTTTGTGGCTATGCCACATCCTGACGGTGGAGGAGTACCGCAATATTATTATCCA 
CCGGGAGTGGTGATGGGAACTCCTATGGTTGGTAGTGGAATGTACGCGCCATCGCAGGCG 
TGGCCAGCAGCGGCTGGTGACGGGGAGGATGATGCTGAGGATAATGGAGGAAACGGCGGC 
GGAAATTGA

>G1820 Amino Acid Sequence (domain in AA coordinates: 70-133) 
MAENKNNNGDNMNNDNHQQPPSYSQLPPMASSNPQLRNYWIEQMETVSDFKNRQLPLARI

KKIMKADPDVHMVSAEAPIIFAKACEMFIVDLTMRSWLKAEENKRHTLQKSDISNAVASS 

FTYDFLLDVVPKDESIATADPGFVAMPHPDGGGVPQYYYPPGVVMGTPMVGSGMYAPSQA
WPAAAGDGEDDAEDNGGNGGGN*

>G1903 (1..1200)

ATGTCTAAATCTAGAGATACGGAGATAAAGTTGTTTGGGAGGACAATCACATCTCTTTTA 
GATGTGAATTGTTATGATCCGTCGTCGTTGTCCCCTGTTCACGATGTTTCTTCTGATCCA 
AGCAAGGAGGATTCGTCTTCTTCTTCATCTTCTTGTTCTCCAACTATTGGACCAATCAGG 
GTTCCGGTTAAAAAAAGTGAGCAAGAGAGTAACAAATTCAAAGATCCATATATATTATCC 
GATCTAAACGAACCACCAAAAGCAGTATCTGAGATTTCATCACCAAGAAGTTCCAAGAAC 
AACTGTGATCAACAGAGCGAGATCACAACAACAACTACCACAAGTACTACATCAGGAGAG 
AAATCAACGGCTCTCAAGAAACCGGACAAGCTTATTCCATGTCCTAGATGTGAAAGCGCA 
AACACCAAATTCTGTTATTACAACAACTACAACGTGAACCAGCCACGTTACTTCTGCAGG 
AACTGTCAGAGGTATTGGACAGCTGGTGGATCTATGAGGAACGTTCCTGTTGGCTCAGGT 
CGTCGCAAGAACAAAGGATGGCCTTCTTCAAACCATTACTTGCAAGTCACTTCTGAGGAT 
TGTGATAATAATAACTCGGGGACGATCCTTAGTTTCGGTTCTTCGGAGTCTTCGGTTACA
GAGACTGGTAAGCATCAGTCAGGTGATACAGCAAAGATAAGTGCTGATTCAGTTTCTCAA 
GAAAATAAAAGCTACCAAGGGTTTCTTCCTCCGCAAGTAATGTTACCTAATAATTCTTCT 
CCTTGGCCTTACCAATGGAGTCCAACGGGTCCTAACGCTAGTTTCTACCCTGTCCCCTTC 
TACTGGGGATGCACGGTTCCGATATACCCTACCTCAGAGACTTCATCATGTTTAGGAAAA 
CGGTCAAGAGATCAAACTGAAGGAAGAATCAATGATACTAATACAACAATAACTACTACA 
AGAGCAAGATTGGTCTCAGAATCTCTTAGAATGAATATCGAAGCTAGTAAGAGCGCTGTG 
TGGTCTAAGTTACCGACAAAACCCGAGAAAAAAACGCAAGGATTCAGTTTGTTCAATGGA 
TTTGACACAAAGGGAAACAGCAACAGAAGTAGCTTGGTCTCCGAAACTTCTCACAGTCTA 
CAAGCAAACCCTGCAGCGATGTCTAGAGCTATGAACTTCAGGGAGAGCATGCAACAATAA 
>G1903 Amino Acid Sequence (domain in AA coordinates: 134-180) 

MSKSRDTEIKLFGRTITSLLDVNCYDPSSLSPVHDVSSDPSKEDSSSSSSSCSPTIGPIR 

VPVKKSEQESNKFKDPYILSDLNEPPKAVSEISSPRSSKNNCDQQSEITTTTTTSTTSGE
KSTALKKPDKLIPCPRCESANTKFCYYNNYNVNQPRYFCRNCQRYWTAGGSMRNVPVGSG

RRKNKGWPSSNHYLQVTSEDCDNNNSGTILSFGSSESSVTETGKHQSGDTAKISADSVSQ 
ENKSYQGFLPPQVMLPNNSSPWPYQWSPTGPNASFYPVPFYWGCTVPIYPTSETSSCLGK
RSRDQTEGRINDTNTTITTTRARLVSESLRMNIEASKSAVWSKLPTKPEKKTQGFSLFNG 
FDTKGNSNRSSLVSETSHSLQANPAAMSRAMNFRESMQQ* 
>G371 (1..582) 

ATGGAGATTGAGAAGGATGAGGACGACACAACATTGGTTGATTCTGGAGGAGACTTCGAC 

TGC^QVTATGTTTGGATCAGGTTCGAGACCCGGTCGTGACTTTATGTGGCCACCTGTTT 

TGTTGGCCCTGCATTCACAAGTGGACTTATGCGTCCAACAATTCAAGACAACGAGTCGAT 

CAATACGATCATAAGAGGGAACCACCAAAATGTCCGGTATGCAAATCTGATGTCTCCGAG 

GCTACGCTTGTCCCGATCTACGGACGAGGACAGAAAGCTCCCCAGTCCGGTTCAAATGTA 

CCGAGCAGACCAACTGGTCCGGTTTATGACTTAAGAGGAGTTGGTCAACGTTTAGGAGAA 

GGGGAGAGTCAACGTTACATGTATAGAATGCCTGATCCGGTGATGGGTGTGGTATGCGAA 

ATGGTATACCGGAGACTATTTGGAGAGTCTTCGAGCAACATGGCACCTTACCGCGATATG 

AATGTCCGGTCTAGGCGACGGGCAATGCAGGCTGAGGAGTCATTAAGCAGAGTCTACTTG 

TTTCTACTTTGCTTCATGTTTATGTGTCTATTTCTCTTCTAA 

>G371 Amino Acid Sequence (domain in aa coordinates: 21-74) 

MEIEKDEDDTTLVDSGGDFDCNICLDQVRDPVVTLCGHLFCWPCIHKWTYASNNSRQRVD

QYDHKREPPKCPVCKSDVSEATLVPIYGRGQKAPQSGSNVPSRPTGPVYDLRGVGQRLGE 

GESQRYMYRMPDPVMGVVCEMVYRRLFGESSSNMAPYRDMNVRSRRRAMQAEESLSRVYL
FLLCFMFMCLFLF* 

>G597 (255..1310)

AAAATTCTCCTGTAAAATTTAATATTATAAAAGTGGTTTCTTTTTC^TTT 
AATTTTCATCTTTAATCTTAAATTCTGGTAACCTTAATGCGCGATCCGCTTCT 
TTTGTGAGAGAGAAGAGATCTAAAAAAATCCACAATTTTGTTCAAATCTTGGAGTTAAAT 
GCTGAATTTTAGGCCTTGTTGCTTAGATTTATGGCTTAAAGTTTCAAACTTTTCATTGGA
TATGTGAGAAGAAAATGTCAGGATCTGAGACGGGTTTAATGGCGGCGACCAGAGAATCAA 
TGCAATTTACAATGGCTCTCCACCAGCAGCAGCAACACAGTCAAGCTCAACCTCAGCAGT
CTCAGAACAGGCCATTGTCATTCGGTGGAGACGACGGAACTGCTCTTTACAAGCAGCCGA 
TGAGATCAGTATCACCACCGCAGCAGTACCAACCCAACTCAGCTGGTGAGAATTCTGTCT 
TGAACATGAACTTGCCCGGAGGTGAGTCTGGAGGCATGACTGGAACTGGAAGTGAGCCAG 
TGAAAAAGAGGAGAGGTAGACCGAGGAAATATGGGCCTGATAGTGGTGAAATGTCACTTG
GTTTGAATCCTGGAGCTCCTTCTTTCACTGTCAGCCAACCTAGTAGCGGCGGCGATGGAG 
GAGAGAAGAAGAGAGGAAGACCTCCTGGTTCTTCTAGCAAAAGGCTCAAGCTTCAAGCTT 
TAGGCTCGACTGGAATCGGATTTACGCCTCATGTACTTACCGTGCTGGCTGGAGAGGATG 
TATCATCCAAGATAATGGCGTTT^ACTCATAATGGACCCCGTGCTGTGTGTGTCTTGTCTG 
CAAATGGAGCCATCTCCAATGTGACTCTCCGCCAGTCTGCCACATCCGGTGGAACTGTTA 
CATATGAGGGGAGATTTGAGATTCTGTCTTTATCGGGATCTTTCCATTTGCTGGAGAACA 
ATGGTCAAAGAAGCAGGACGGGAGGTCTAAGCGTGTCATTATCAAGTCCGGATGGTAATG 
TCCTCGGTGGCAGTGTAGCTGGTCTTCTTATAGCAGCATCACCTGTTCAGATTGTTGTTG 
GGAGTTTCTTACCAGACGGAGAAAAAGAACCAAAACAGCATGTGGGACAAATGGGACTGT 
CGTCACCCGTATTACCGCGTGTGGCCCCAACGCAGGTGCTGATGACTCCAAGTAGCCCAC 
AATCTCGAGGCACAATGAGTGAGTCATCTTGTGGAGGAGGACATGGAAGCCCTATTCATC 
AGAGCACTGGAGGACCTTACAATAACACCATTAACATGCCCTGGAAGTAGCCAAGTGATC 
TGTGTCGGCTTAAAACCAACAACTTCCCGTTATTAGAGTGATTTATTTCTACATTTGGTT 
TAGACTTTCTAGTTCTGATGGTTATTTCTACAGTTGGTTTAGACTTTCTAGTTCTGTTCA 
GACAAAAGGAGTTTGATAAATTGACCGACCTATTTTGTGTGTTTGAGGTACTTTCAGAAC 
CATAGGTGTTGAGAAATTAGAATGTTCTGTTTAAAAAA 

>G597 Amino Acid Sequence (domain in AA coordinates: 97-104,137-144) 

MSGSETGLMAATRESMQFTMALHQQQQHSQAQPQQSQNRPLSFGGDDGTALYKQPMRSVS 

PPQQYQPNSAGENSVLNMNLPGGESGGMTGTGSEPVKKRRGRPRKYGPDSGEMSLGLNPG

APSFTVSQPSSGGDGGEKKRGRPPGSSSKRLKLQALGSTGIGFTPHVLTVLAGEDVSSKI 

MALTHNGPRAVCVLSANGAISNVTLRQSATSGGTVTYEGRFEILSLSGSFHLLENNGQRS

RTGGLSVSLSSPDGNVLGGSVAGLLIAASPVQIVVGSFLPDGEKEPKQHVGQMGLSSPVL

PRVAPTQVLMTPSSPQSRGTMSESSCGGGHGSPIHQSTGGPYNNTINMPWK*

>G1009 (28..1704)

AAAAAAAAAAAAAACCTATTCCCAAAGATGAAGAACAATAACAACAAATCTTCTTCTTCT 

TCTAGCTATGATTCTTCTTTGTCTCCTTCTTCTTCATCCTCCTCCCACCAGAACTGGCTC 

TCTTTCTCTCTCTCCAACAATAACAACAACTTCAATTCTTCCTCAAACCCTAATCTCACT 

TCCTCCACATCAGATCATCATCATCCTCACCCTTCTCACCTCTCTCTCTTTCAAGCTTTC 

TCCACTTCTCCAGTCGAACGGCAAGATGGGTCACCGGGAGTTTCACCCAGCGATGCCACG 

GCGGTTCTTTCCGTATACCCCGGCGGTCCTAAACTTGAGAACTTCCTCGGCGGAGGAGCC 

TCAACGACGACAACAAGACCAATGCAACAAGTGCAATCTCTTGGCGGCGTTGTCTTCTCT 

TCCGACCTACAGCCACCGCTTCATCCTCCGTCCGCCGCCGAGATCTACGACTCTGAGCTC 

AAGTCAATAGCCGCTAGCTTCCTAGGAAACTACTCCGGTGGACACTCGTCGGAGGTCTCT 

AGCGTACATAAACAACAACCGAATCCTCTAGCTGTCTCAGAGGCTTCGCCTACTCCGAAG 

AAGAACGTAGAGAGTTTTGGACAACGTACCTCGATTTATAGAGGAGTCACAAGACATAGA 

TGGACTGGAAGATACGAAGCTCATCTATGGGATAATAGTTGCCGAAGAGAAGGCCAAAGC 

AGAAAAGGAAGACAAGTTTATTTAGGTGGTTATGATAAGGAAGATAAAGCAGCTAGAGCT 

TACGACCTTGCAGCTCTTAAGTATTGGGGTCCTACAACTACGACTAATTTCCCGATATCA 

AATTACGAATCTGAACTTGAAGAAATGAAACACATGACTCGACAAGAGTTCGTTGCTTCT 

TTAAGACGGAAAAGCAGTGGATTCTCTAGGGGTGCCTCCATGTACAGAGGCGTCACTAGA 

CATCATCAGCATGGTCGATGGCAGGCACGAATTGGAAGAGTTGCAGGCAACAAAGACCTT 

TATCTTGGCACATTTAGCACTCAAGAGGAAGCTGCAGAAGCTTATGATATAGCAGCGATC 

AAATTCCGCGGTCTAAATGCAGTCACCAATTTCGACATCAGTCGATATGATGTCAAATCA 

ATTGCTAGCTGTAATCTCCCTGTGGGTGGACTAATGCCTAAACCTTCTCCAGCAACCGCA 

GCGGCTGACAAAACCGTTGATCTTTCTCCATCCGACTCTCCATCTCTAACCACACCGTCC 

CTCACGTTCAATGTGGCAACACCGGTCAATGACCATGGAGGAACTTTTTACCAC^ 

ATACCAATCAAACCAGACCCGGCTGATCATTATTGGTCCAACATCTTTGGATTCCAGGCA 

AACCCGAAAGCAGAAATGCGACCATTAGCAAACTTTGGGTCGGATCTTCATAACCCTTCT 

CCTGGTTATGCTATAATGCCGGTAATGCAGGAAGGTGAAAACAACTTTGGTGGTAGTTTT 

GTTGGGTCTGATGGGTATAACAATCATTCCGCTGCATCGAACCCGGTCTCAGCAATTCCG 

CTGTCCTCGACAACTACAATGAGTAACGGTAACGAAGGGTATGGTGGAAACATAAACTGG 

ATTAATAACAACATTTCAAGTTCTTACCAAACTG 
ACACCGGTTTTTGGGTTGGAATGAGTATTCACATCTTAGTGAGAACTAAAATAAATATGT

AGGAAAAAAATAAGGCTCTGTTTGAAGAAATCAGATATTTTCTTCTTAGATTATTTAAGT 

AGTTTAAAAAAAATATTTTTTAAGTGTTTCACTTTTACGTTTGTCTGCTGACCACGAATT 

TTGCTGGATCTGACAGTACTAACTCTTTGTTTAATGACCTTATGGGTTCCTTTTTTACTT 

TCCAGAACTTTTATTTACTTTTTTCTTCATTTTTCTTCATTTTTTTTGTTGTGGGACAAT 

ATGAATGATTGAAGATGGAAACTGCTTGCATGTGAATAAACGAAAATCAAACNATCTTCG 
GTAACTTAAAAA 

>G1009 Amino Acid Sequence (domain in aa coordinates: 201-277, 303-371) 
MKNNNNKSSSSSSYDSSLSPSSSSSSHQNWLSFSLSNNNNNFNSSSNPNLTSSTSDHHHP

HPSHLSLFQAFSTSPVERQDGSPGVSPSDATAVLSVYPGGPKLENFLGGGASTTTTRPMQ 

QVQSLGGVVFSSDLQPPLHPPSAAEIYDSELKSIAASFLGNYSGGHSSEVSSVHKQQPNP

LAVSEASPTPKKNVESFGQRTSIYRGVTRHRWTGRYEAHLWDNSCRREGQSRKGRQVYLG

GYDKEDKAARAYDLAALKYWGPTTTTNFPISNYESELEEMKHMTRQEFVASLRRKSSGFS

RGASMYRGVTRHHQHGRWQARIGRVAGNKDLYLGTFSTQEEAAEAYDIAAIKFRGLNAVT 

NFDISRYDVKSIASCNLPVGGLMPKPSPATAAADKTVDLSPSDSPSLTTPSLTFNVATPV

NDHGGTFYHTGIPIKPDPADHYWSNIFGFQANPKAEMRPLANFGSDLHNPSPGYAIMPVM 

QEGENNFGGSFVGSDGYNNHSAASNPVSAIPLSSTTTMSNGNEGYGGNINWINNNISSSY

QTAKSNLSVLHTPVFGLE* 

>G170 (1..1107) 

ATGGGGATGAAGAAGGTGAAGCTATCTTTGATAGCTAATGAAAGATCAAGGAAAACATCC 
TTCATAAAGAGGAAAGACGGGATTTTTAAGAAACTCCACGAGTTGTCAACTCTGTGTGGT 
GTCCAAGCTTGTGCTCTCATCTACAGTCCATTCATACCGGTTCCAGAGTCATGGCCGTCA 
AGGGAAGGTGCTAAAAAGGTGGCTTCAAGGTTTCTGGAGATGCCGCCGACAGCCCGAACC 
AAGAAGATGATGGATCAAGAGACTTACCTTATGGAGAGGATTACCAAAGCAAAAGAGCAA 
CTAAAGAACCTGGCTGCTGAGAACCGAGAGTTACAGGTTAGACGATTTATGTTTGATTGT 
GTTGAAGGCAAAATGTCCCAGTATCATTATGATGCAAAAGACCTTCAAGATTTGCAATCT
TGTATAAATCTATATCTCGATCAGCTTAACGGAAGGATCGAGTCCATTAAAGAAAATGGT 
GAGTCGTTGTTGTCTTCCGTCTCTCCTTTTCCTACTAGAATTGGTGTTGACGAAATTGGT 
GATGAGTCATTTTCCGACTCTCCTATTCATGCTACAACTGGGGTTGTAGATACTCTTAAT 
GCTACCAATCCTCATGTTCTTACGGGCGATATGACTCCTTTTCTTGATGCGGACGCAACT 
GCGGTAACTGCTTCCAGTAGATTTTTTGATCATATTCCATATGAAAATATGAATATGAGT 
CAAAATCTGCATGAACCGTTTCAACACCTTGTTC^ 

AATCAGAATATGAATCAGGTTCAATACCAGGCTCCTAATAATCTGTTTAATCAGATTCAA 
CGAGAATTCTACAACATAAATTTGAATCTGAATTTGAATCTGAATTCGAATCAGTATCTG 
AATCAACAACAATCATTCATGAATCCGATGGTGGAACAACATATGAATCATGTTGGAGGG 
CGTGAAAGCATTCCTTTCGTGGACGGAAACTGCTACAACTACCATCAACTACCATCCAAT 

CAACTACCAGCCGTTGATCATGCTTCCACCAGTTACATGCCTTCCACCACCGGTGTCTAT 
GATCCTTACATCAACAATAATCTCTAA 

>G170 Amino Acid Sequence (domain in aa coordinates: 2-57) 

MGMKKVKLSLIANERSRKTSFIKRKDGIFKKLHELSTLCGVQACALIYSPFIPVPESWPS 
REGAKKVASRFLEMPPTARTKKMMDQETYLMERITKAKEQLKNLAAENRELQVRRFMFDC

VEGKMSQYHYDAKDLQDLQSCINLYLDQLNGRIESIKENGESLLSSVSPFPTRIGVDEIG 
DESFSDSPIHATTGVVDTLNATNPHVLTGDMTPFLDADATAVTASSRFFDHIPYENMNMS
QNLHEPFQHLVPTNVOTFFQNQN^ 

NQQQSFMNPMVEQHMNHVGGRESIPFVDGNCYNYHQLPSNQLPAVDHASTSYMPSTTGVY

DPYINNNL* 

>G1768 (185..1426)

CTTCCTTTTGCTTCAGCTGCGAGCTTTGGTTGGATCTCTCACTTGCAAAACCAAATC 

TATCGACTTCCACCGAAAGATCACTTCTTAACCTACACAAGGTGTTTGTTATGAAGATCA 

GATAAATAAAAGGTCATTTGAGGATAATGGTTGATGTTCAAAGATTCTTACTTGCTTATT 

TGTGATGGACAATGTAAGAGGTTCAATAATGTTGCAGCCACTGCCAGAGATAGCTGAGAG 

TATCGATGATGCTATCTGCCATGAACTCTCCATGTGGCCTGATGATGCTAAAGATTTGTT 

ATTGATAGTGGAGGCAATATCAAGGGGAGACTTGAAGTTGGTACTTGTTGCTTGTGCAAA 

AGCTGTTTCTGAGAATAATCTTCTAATGGCACGATGGTGTATGGGTGAGTTGCGCGGTAT 

GGTTTCGATTTCTGGTGAGCCAATCCAGAGATTGGGAGCTTATATGTTAGAAGGGCTTGT 

TGCTAGGCTTGCTGCTTCTGGTAGTTCGATATATAAGTCTCTCCAGTCCAGAGAACCAGA

GAGTTATGAATTTTTATCTTATGTGTATGTTCTGC7VTGAGGTTTGTCCATATTO 
TGGATACATGTCAGCGAATGGTGCGATTGCAGAAGCAATGAAGGATGAAGAGAGGATTCA
CATTATTGACTTCCAAATTGGACAAGGGAGCCAGTGGATAGCACTTATCCAGGCTTTTGC 
AGCTAGGCCTGGTGGGGCTCCAAATATTCGAATTACCGGAGTTGGTGATGGATCTGTCTT 
GGTTACAGTCAAGAAGAGACTAGAGAAACTTGCAAAGAAGTTTGATGTTCCATTCAGGTT 
CAATGCGGTTTCAAGGCCAAGTTGTGAAGTTGAAGTGGAAAATCTTGATGTCCGAGATGG 
CGAAGCCCTTGGAGTGAACTTTGCTTACATGCTGCATCATTTGCCAGATGAGAGTGTAAG 
CATGGAAAACCACAGGGACCGGTTGCTGAGGATGGTGAAGAGTCTATCACCTAAAGTAGT 
CACTCTTGTGGAACAAGAATGCAACACGAACACTTCCCCTTTCCTTCCTAGGTTCCTTGA 
GACATTAAGTTATTACACGGCAATGTTCGAATCTATCGATGTTATGCTTCCGAGAAATCA 
CAAGGAAAGGATCAATATCGAGCAGCACTGCATGGCAAGGGATGTCGTCAACATCATAGC 
TTGTGAAGGAGCCGAGAGGATCGAAAGACACGAGCTTCTCGGGAAATGGAAGTCAAGGTT 
TTCCATGGCGGGTTTTGAGCCATACCCCTTGAGCTCAATCATTTCAGCCACCATTAGAGC 
CCTCTTGAGAGATTACAGCAACGGGTATGCGATTGAAGAAAGAGATGGTGCTCTGTACCT 
TGGTTGGATGGACCGAATCTTGGTCTCATCTTGTGCATGGAAGTGAAAGAATAAACGTCT 
CCAAGAATGTAATGCAAAAGACAGAACTGGAAGTAATAGATAGTTTTGTCTCATAACCAT 
TAATAAGGTTGAATCAAATCATATACATCCCCATGCTACAACTATTACACAGGCTCCATC 
AACAAAGAAGGGCTCTTGTTGTGTTACCTTCTCTTCCTGTAACTCTTATTTGAACCAAAT 
GGAAGTGGTTACAT 

>G1768 Amino Acid Sequence (domain in AA coordinates: 54-413) 
MDNVRGSIMLQPLPEIAESIDDAICHELSMWPDDAKDLLLIVEAISRGDLKLVLVACAKA
VSENNLLMARWCMGELRGMVSISGEPIQRLGAYMLEGLVARLAASGSSIYKSLQSREPES
YEFLSYVYVLHEVCPYFKFGYMSANGAIAEAMKDEERIHIIDFQIGQGSQWIALIQAFAA 
RPGGAPNIRITGVGDGSVLVTVKKRLEKLAKKFDVPFRFNAVSRPSCEVEVENLDVRDGE
ALGVNFAYMLHHLPDESVSMENHRDRLLRMVKSLSPKVVTLVEQECNTNTSPFLPRFLET
LSYYTAMFESIDVMLPRNHKERINIEQHCMARDVVNIIACEGAERIERHELLGKWKSRFS 
MAGFEPYPLSSIISATIRALLRDYSNGYAIEERDGALYLGWMDRILVSSCAWK* 
>G185 (77..988)

ATGCAAAAATAAACATAGTAACAATACTTTAAACTATTTACACCACTTTAATCTTATTCT 
CCACTCTTTGAACGTAATGGAGAAGAACCATAGTAGTGGAGAGTGGGAGAAGATGAAGAA 
CGAGATCAACGAGCTAATGATAGAAGGAAGAGACTATGCACACCAGTTTGGATCAGCTTC 
ATCTCAAGAAACACGTGAACATTTAGCCAAAAAGATTCTTCAATCTTACCACAAGTCTCT 
CACCATCATGAACTACTCCGGCGAACTTGACCAAGTTTCTCAGGGTGGAGGAAGCCCCAA 
GAGCGATGATTCCGATCAAGAACCACTTGTCATCAAGAGTTCGAAGAAGTCAATGCCAAG 
GTGGAGTTCAAAAGTCAGAATTGCCCCTGGAGCTGGTGTTGATAGAACGCTGGACGATGG 
ATT(^GTTGGAGAAAGTACGGC(^GAAGGATATTCTCGGAGCCAAATTTCCAAGAGGATA 
CTATAGATGCACGTATAGAAAGTCTCAAGGATGTGAAGCCACTAAACAAGTCCAAAGATC 
TGATGAAAATCAGATGCTCCTTGAGATCAGTTACCGAGGAATACATTCTTGCTCTCAAGC 
TGCAAATGTCGGTACAACAATGCCGATACAAAACCTCGAACCGAACCAGACCCAAGAACA 
CGGAAATCTTGACATGGTAAAGGAAAGTGTAGA(2AACTACAATCACCAAGCACATTTGCA 
T(^CAACCTTCACTATCCATTGTCATCTACCCCAAATCTAGAGAATAACAATGCCTATAT 
GCTTCAAATGCGAGATCAAAAC^TCGAATATTTTGGATCTACGAGCTTCTCTAGTGATCT 
AGGAACTAGTATCAACTACAATTTTCCAGCATCTGGCTCGGCTTCTCACTCAGCATCAAA 
CTCTCCGTCCACCGTCCCTTTGGAATCCCCGTTTGAAAGCTATGATCCAAATCATCCATA 
TGGAGGATTTGGTGGGTTCTATTCTTAGTTATCTACTTAAGGGAGGGACGGAACTTTTTA 
CATGACCTCTTGATTAAAGAGAGAGTTTTCATAATAGCTAATCAATTTCCTATTCAAATA 
TCCGAGTTTTTTTTCTAATCATGTTTATCAATTGTCTTATTACAGAAGGCTTATTTTCAG 
GTCTATGTTGAAATAAATGGATTTGTACTCGTAGGTATGATCCTTGTTATCTAAAAAAAA 
AAAAA

>G185 Amino Acid Sequence (domain in AA coordinates: 113-172) 
MEKNHSSGEWEKMKNEINELMIEGRDYAHQFGSASSQETREHLAKKILQSYHKSLTIMNY
SGELDQVSQGGGSPKSDDSDQEPLVIKSSKKSMPRWSSKVRIAPGAGVDRTLDDGFSWRK 
YGQKDILGAKFPRGYYRCTYRKSQGCEATKQVQRSDENQMLLEISYRGIHSCSQAANVGT
TMPIQNLEPNQTQEHGNLDIWKESVD^^ 

QNIEYFGSTSFSSDLGTSINYNFPASGSASHSASNSPSTVPLESPFESYDPNHPYGGFGG 
FYS* 

>G1931 (5..592)

ATCAATGGAAGGGGTTGACAACACAAATCCTATGTTAACCCTAGAAGAAGGCGAAAACAA 
CAATCCTTTTTCTTCCTTAGATGACAAAACATTAATGATGATGGCTCCTTCGTTAATCTT

TTCGGGCGATGTAGGTCCATCTTCTTCTTCTTGTACTCCAGCAGGTTATCATCTATCTGC 

TCAGCTGGAGAACTTTCGAGGAGGTGGAGGAGAGATGGGAGGATTAGTGAGTAATAATAG 

CAATAATAGTGATCATAATAAGAATTGCAACAAAGGAAAAGGGAAGAGAACTTTGGCAAT 

GCAGAGGATAGCTTTTCATACAAGGAGTGATGATGATGTTCTTGATGATGGTTATCGTTG 

GCGAAAGTACGGTCAGAAATCTGTCAAGAACAATGCTCATCCCAGGAGCTATTATAGATG 

TACATACCACACATGCAACGTGAAGAAACAAGTGCAAAGACTGGCAAAAGATCCAAACGT 

TGTCGTAACAACCTACGAAGGTGTTCATAATCATCCTTGTGAGAAGCTCATGGAGACTCT 

TAGCCCTCTCCTTAGGCAACTTCAGTTCCTCTCAAGAGTTTCTGATCTGTAATTATTGAA 

TGTTAATTAGTGGTGTAATACATTAATTATGCTTTAATCTCTCCATTGACCCTCAATC 

>G1931 Amino Acid Sequence (domain in AA coordinates: 114-170) 

MEGVDNTNPMLTLEEGENNNPFSSLDDKTLMMMAPSLIFSGDVGPSSSSCTPAGYHLSAQ 
LENFRGGGGEMGGLVSNNSNNSDHNKNCNKGKGKRTLAMQRIAFHTRSDDDVLDDGYRWR 

KYGQKSVKNNAHPRSYYRCTYHTCNVKKQVQRLAKDPNVVVTTYEGVHNHPCEKLMETLS

PLLRQLQFLSRVSDL* 

>G2543 (1..2169) 

ATGAGTTTCGTCGTCGGCGTCGGCGGAAGTGGTAGTGGAAGCGGCGGAGACGGTGGTGGT 

AGTCATCATCACGACGGCTCTGAAACTGATAGGAAGAAGAAACGTTACCATCGTCACACC 

GCTCAACAGATTCAACGCCTTGAATCGAGTTTCAAGGAGTGTCCTCATCCAGATGAGAAA 

CAGAGGAACCAGCTTAGCAGAGAATTGGGTTTGGCTCCAAGACAAATCAAGTTCTGGTTT 

CAGAACAGAAGAACTCAGCTTAAGGCTCAACATGAGAGAGCAGATAATAGTGCACTAAAG 

GCAGAGAATGATAAAATTCGTTGCGAAAACATTGCTATTAGAGAAGCTCTCAAGCATGCT 

ATATGTCCTAACTGTGGAGGTCCTCCTGTTAGTGAAGATCCTTACTTTGATGAACAAAAG 

CTTCGGATTGAAAATGCACACCTTAGAGAAGAGCTTGAT^AGAATGTCTACCATTGCATCA 

AAGTACATGGGAAGACCGATATCGCAACTCTCTACGCTACATCCAATGCACATCTCACCG 

TTGGATTTGTCAATGACTAGTTTAACTGGTTGTGGACCTTTTGGTCATGGTCCTTCACTC 

GATTTTGATCTTCTTCCAGGAAGTTCTATGGCTGTTGGTCCTAATAATAATCTGCAATCT 

CAGCCTAACTTGGCTATATCAGACATGGATAAGCCTATTATGACCGGCATTGCTTTGACT 

GCAATGGAAGAATTGCTCAGGCTTCTTCAGACAAATGAACCTCTATGGACAAGAACAGAT 

GGCTGCAGAGACATTCTCAATCTTGGTAGCTATGAGAATGTTTTCCCAAGATCAAGTAAC 

CGAGGGAAGAACCAGAACTTTCGAGTCGAAGCATCAAGGTCTTCTGGTATTGTCTTCATG 

AATGCTATGGCACTTGTCGACATGTTCATGGATTGTGTCAAGTGGACAGAACTCTTTCCC 

TCTATCATTGCAGCTTCTAAAACACTTGCAGTGATTTCTTCAGGAATGGGAGGTACCCAT 

GAGGGTGCATTGCATTTGTTGTATGAAGAAATGGAAGTGCTTTCGCCTTTAGTAGCAACA 

CGCGAATTCTGCGAGCTACGCTATTGTCAACAGACTGAACAAGGAAGCTGGATAGTTGTA 

AACGTCTCATATGATCTTCCTCAGTTTGTTTCTCACTCTCAGTCCTATAGATTTCCATCT 

GGATGCTTGATTCAGGATATGCCCAATGGATATTCCAAGGTTACTTGGGTTGAACATATT 

GAAACTGAAGAAAAAGAACTGGTTCATGAGCTATACAGAGAGATTATTCACAGAGGGATT 

GCTTTTGGGGCTGATCGTTGGGTTACCACTCTCCAGAGAATGTGTGAAAGATTTGCTTCT 

CTATCGGTACCAGCGTCTTCATCTCGTGATCTCGGTGGAGTGATTCTATCACCGGAAGGG 

AAGAGAAGCATGATGAGACTTGCTCAGAGGATGATCAGCAACTACTGTTTAAGTGTCAGC 

AGATCCAACAACACACGCTCAACCGTTGTTTCGGAACTGAACGAAGTTGGAATCCGTGTG 

ACTGCACATAAGAGCCCTGAACCAAACGGCACAGTCCTATGTGCAGCCACCACTTTCTGG 

CTTCCCAATTCTCCTCAAAATGTCTTCAATTTCCTCAAAGACGAAAGAACCCGTCCTCAG 

TGGGATGTTCTTTCAAACGGAAACGCAGTGCAAGAAGTTGCTCACATCTCAAACGGATCA 

CATCCTGGAAACTGCATATCGGTTCTACGTGGATCCAATGCAACACATAGCAACAACATG

CTTATTCTGCAAGAAAGCTCAACAGACTCATCAGGAGCATTTGTGGTCTACAGTCCAGTG 

GATTTAGCAGCATTGAACATCGCAATGAGCGGTGAAGATCCTTCTTATATTCCTCTCTTG 

TCCTCAGGTTTCACAATCTCACCAGATGGAAATGGCTCAAACTCTGAACAAGGAGGAGCC 

TCGACGAGCTCAGGACGGGCATCAGCTAGCGGTTCGTTGATAACGGTTGGGTTTCAGATA 

ATGGTAAGCAATTTACCGACGGCAAAACTGAATATGGAGTCGGTGGAAACGGTTAATAAC 

CTGATAGGAACAACTGTACATCAAATTAAAACCGCCTTGAGCGGTCCTACAGCTTCAACT 
ACAGCTTGA 

>G2543 Amino Acid Sequence (domain in AA coordinates: 31-91) 
MSFVVGVGGSGSGSGGDGGGSHHHDGSETDRKKKRYHRHTAQQIQRLESSFKECPHPDEK
QRNQLSRELGLAPRQIKFWFQNRRTQLKAQHERADNSALKAENDKIRCENIAIREALKHA
ICPNCGGPPVSEDPYFDEQKLRIENAHLREELERMSTIASKYMGRPISQLSTLHPMHISP
LDLSMTSLTGCGPFGHGPSLDFDLLPGSSMAVGPNNNLQSQPNLAISDMDKPIMTGIALT
AMEELLRLLQTNEPLWTRTDGCRDILNLGSYENVFPRSSNRGKNQNFRVEASRSSGIVFM 
NAMALVDMFMDCVKWTELFPSIIAASKTLAVISSGMGGTHEGALHLLYEEMEVLSPLVAT 
REFCELRYCQQTEQGSWIVVNVSYDLPQFVSHSQSYRFPSGCLIQDMPNGYSKVTWVEHI
ETEEKELVHELYREIIHRGIAFGADRWVTTLQRMCERFASLSVPASSSRDLGGVILSPEG 
KKSMMRLAQRMISNYCLSVSRSNNTRSTVVSELNEVGIRVTAHKSPEPNGTVLCAATTFW
LPNSPQNVFNFLKDERTRPQWDVLSNGNAVQEVAHISNGSHPGNCISVLRGSNATHSNNM
LILQESSTDSSGAFVVYSPVDLAALNIAMSGEDPSYIPLLSSGFTISPDGNGSNSEQGGA
STSSGRASASGSLITVGFQIMVSNLPTAKLNMESVETVNNLIGTTVHQIKTALSGPTAST 
TA* 

>G264 (30..1430)

CTTGTACCAGTTTCTGATTAGATTCAACAATGAACGGCGCATTAGGTAACTCCTCCGCCT 

CCGTTAGCGGCGGAGAAGGAGCCGGAGGACCAGCGCCTTTCTTGGTGAAAACCTACGAGA 

TGGTCGACGATTCATCAACGGACCAGATCGTATCGTGGAGCGCTAACAACAACAGCTTCA 

TCGTTTGGAATCATGCCGAATTTTCACGCCTCCTTCTTCCAACCTACTTCAAACACAATA 

ACTTCTCTTCCTTCATTCGTCAGCTCAATACCTATGGGTTTAGGAAGATTGATCCAGAGA 

GGTGGGAGTTTTTGAATGATGATTTTATTAAGGATCAGAAGCATCTTCTCAAGAATATAC 

ATAGAAGGAAACCTATACACAGCCACAGTCATCCACCTGCTTCGTCGACTGATCAAGAAA 

GAGCAGTGTTGCAAGAGCAAATGGACAAGCTTTCACGTGAGAAAGCTGCAATTGAAGCTA 

AGCTTTTAAAGTTCAAACAACAGAAGGTTGTAGCAAAGCATCAGTTTGAAGAAATGACTG 

AGCATGTTGATGATATGGAGAATAGGCAGAAGAAGCTGCTGAATTTTTTGGAAACTGCGA 

TTCGGAATCCTACTTTTGTTAAGAATTTTGGTAAGAAAGTCGAGCAGTTGGATATTTCAG 

CTTACAACAAAAAGCGAAGGCTCCCTGAAGTTGAGCAATCAAAGCCACCTTCAGAAGATT 

CTCATCTGGATAATAGTAGTGGTAGCTCGAGACGCGAGTCTGGAAACATTTTTCATCAAA 

ATTTCTCTAATAAATTGCGACTAGAGCTTTCTCCAGCTGATTCAGATATGAACATGGTTT 

CACACAGTATACAAAGTTCCAATGAAGAAGGTGCGAGTCCCAAAGGGATACTGTCAGGAG 

GTGATCCAAATACTACACTAACAAAAAGAGAAGGCCTACCATTTGCACCTGAAGCTCTAG 

AGCTTGCGGATACCGGGACATGCCCGAGGAGATTACTGTTAAATGATAATACAAGGGTGG 

AGACCTTGCAGCAGAGGCTAACTTCTTCAGAGGAGACTGATGGTAGCTTTTCATGTCATT 

TAAATCTAACCCTGGCTTCTGCTCCGTTACCGGACAAAACAGCTTCACAGATAGCTAAGA 

CGACTCTTAAAAGTCAGGAGTTAAACTTTAACTCAATAGAAACAAGTGCAAGTGAGAAAA 

ATCGGGGTAGACAAGAGATTGCAGTTGGAGGTAGCCAAGCAAATGCAGCTCCTCCAGCAA 

GAGTGAATGATGTATTCTGGGAACAGTTCCTAACAGAAAGGCCAGGGTCTTCAGATAATG 

AGGAGGCAAGTTCGACTTATAGAGGTAACCCATACGAAGAGCAAGAGGAGAAAAGAAACG 

GGAGTATGATGTTACGTAATACAAAGAATATCGAGCAGCTGACCTTATAAACTATTTGGA 

CGGTTACATCAACGAGAGTACGAACTGAGGTTTTGGTAAGAAGTATGGGTGAGTAAGTAA 

TGAAACATTGGACTGAAAAAGCGTAAGTAGCTTTGTTGTAAACACTTGCGTCTCTGTCTA 

CACAAGTAATTTGACTGTAAATGTAAGTGTACAGGATTTAAATTGAATAAGCA 

>G264 Amino Acid Sequence (domain in AA coordinates: 24-114) 

MNGALGNSSASVSGGEGAGGPAPFLVKTYEMVDDSSTDQIVSWSANNNSFIVWNHAEFSR

LLLPTYFKHNNFSSFIRQLNTYGFRKIDPERWEFLNDDFIKDQKHLLKNIHRRKPIHSHS 
HPPASSTDQERAVLQEQMDKLSREKAAIEAKLLKFKQQKVVAKHQFEEMTEHVDDMENRQ

KKLLNFLETAIRNPTFVKNFGKKVEQLDISAYNKKRRLPEVEQSKPPSEDSHLDNSSGSS

RRESGNIFHQNFSNKLRLELSPADSDMNMVSHSIQSSNEEGASPKGILSGGDPNTTLTKR 

EGLPFAPEALELADTGTCPRRLLLNDNTRVETLQQRLTSSEETDGSFSCHLNLTLASAPL 

PDKTASQIAKTTLKSQELNFNSIETSASEKNRGRQEIAVGGSQANAAPPARVNDVFWEQF 

LTERPGSSDNEEASSTYRGNPYEEQEEKRNGSMMLRNTKNIEQLTL*

>G32 (101..736)

AACACACATTCCCTCTCTTCCTTCAACTAGAAAAAAGATAGATATATCGGACATTTATTG 
ATCTGTGTATGCATAAAGGTATAGTATCATTATTAGAAAGATGAACACAACATCATCAAA 
GAGCAAGAAGAAGCAAGACGATCAGGTTGGTACAAGGTTTCTTGGGGTGAGAAGAAGGCC 
TTGGGGAAGATACGCAGCTGAGATTAGAGACCCAACTACGAAGGAGCGTCACTGGCTTGG 
CACTTTCGATACGGCGGAAGAAGCTGCCTTGGCCTACGATAGAGCTGCTCGGTCCATGCG 
TGGCACACGTGCCAGAACCAACTTTGTTTACTCAGACATGCCTCCTTCCTCATCCGTCAC 
CTCCATTGTTTCTCCTGACGATCCTCCTCCTCCTCCACCTCCTCCTGCTCCTCCTAGCAA 
TGATCCTGTCGATTACATGATGATGTTTAACCAATACTCATCCACTGACTCGCCAATGCT 
TCAGCCTCATTGTGATCAAGTGGACAGTTACATGTTTGGTGGCTCTCAATCTTCGAATTC
TTATTGCTATTCTAATGACAGTAGTAATGAGCTGCCTCCTCTCCCGAGCGACTTGTCGAA
TTCGTGTTATAGCCAACCACAGTGGACCTGGACCGGTGACGACTACTCGTCTGAGTACGT 
ACATAGTCCAATGTTCAGCAGAATGCCTCCGGTTTCTGACTCTTTCCCTCAAGGTTTCAA 
CTACTTTGGCTCCTAATTCTTTCTCATCGTCCATATTTAATACCTTCCTCATTTGTACCT 
TTTCCTTCTTCTTCTTTTTTGGGTTTATCTATGTTTCGCCGTCCTTGATCTCTGCCTATG 
TGATCAAAGTGACTGTTTGTCATTAGTTTTTCAATAACAAGTTATCATTTGTATCTTGAA 
AAAAAAAAAAA 

>G32 Amino Acid Sequence (domain in aa coordinates: 17-84) 

MNTTSSKSKKKQDDQVGTRFLGVRRRPWGRYAAEIRDPTTKERHWLGTFDTAEEAALAYD 

RAARSMRGTRARTNFVYSDMPPSSSVTSIVSPDDPPPPPPPPAPPSNDPVDYMMMFNQYS 

STDSPMLQPHCDQVDSYMFGGSQSSNSYCYSNDSSNELPPLPSDLSNSCYSQPQWTWTGD 

DYSSEYVHSPMFSRMPPVSDSFPQGFNYFGS* 

>G436 (1..2157) 

ATGGATTTTACTCGCGATGACAACTCAAGTGATGAACGGGAAAATGATGTAGACGCCAAC 
ACCAACAACCGTCACGAGAAGAAGGGTTACCATCGCCACACTAATGT^ACAAATTCATAGG 
CTTGAAACGTATTTCAAGGAATGTCCTCATCCAGACGAATTTCAGCGACGTCTGTTGGGT 
GAAGAACTGAATCTGAAACCAAAACAAATCAAATTTTGGTTTCAAAACAAAAGAACTCAA 
GCTAAGAGTCACAATGAAAAAGCAGACAATGCAGCGCTTAGGGCAGAAAATATTAAGATT 
AGACGTGAGAACGAATCAATGGAAGATGCACTGAATAATGTGGTTTGCCCTCCATGTGGT 
GGTCGTGGTCCTGGGAGAGAAGACCAACTTCGACATCTCCAAAAACTCCGTGCACAAAAC 
GCTTATCTCAAAGATGAGTATGAAAGAGTCTCAAACTACCTAAAACAGTACGGAGGTCAC 
TCAATGCATAACGTCGAGGCCACACCCTATCTCCATGGTCCATCAAACCATGCATCAACG 
TCCAAGAACCGTCCAGCATTGTACGGAACCTCTTCTAACCGTCTCCCCGAGCCTTCAAGC 
ATATTTAGAGGACCATACACTCGTGGAAACATGAACACCACCGCACCGCCTCAGCCGCGA 
AAGCCGCTGGAAATGCAGAATTTCCAACCACTATCTCAACTGGAGAAAATTGCAATGTTG 
GAAGCAGCGGAAAAAGCGGTGTCAGAGGTTTTGAGCCTCATTCAAATGGATGATACAATG 
TGGAAAAAGTCGTCTATTGATGATAGGCTCGTCATTGATCCAGGGCTCTATGAGAAATAT 
TTTACTAAGACTAACACAAATGGTCGTCCTGAGTCTTCTAAAGATGTCGTGGTGGTTCAA
ATGGATGCTGGAAACTTGATCGACATCTTCTTAACTGCGGAGAAATGGGCGAGGCTTTTT 
CCAACAATTGTGAACGAAGCTAAAACGATTCACGTCTTGGATTCCGTTGACCATCGAGGA 
AAAACTTTCTCAAGAGTGATTTATGAGCAACTGCACATACTGTCACCATTGGTGCCACCG 
AGGGAATTTATGATCCTAAGGACTTGCCAACAAATTGAAGACAATGTCTGGATGATTGCT 
GATGTGTCGTGTCATCTCCCAAACATTGAGTTTGATCTTTCGTTTCCCATTTGCACCAAA 
CGTCCCTCAGGTGTGCTCATTCAAGCCTTGCCCCACGGCTTCTCTAAGGTGACGTGGATA 
GAGCATGTGGTAGTGAATGATAATAGAGTGCGGCCACATAAGCTTTACAGAGACCTCTTA 
TACGGCGGCTTTGGCTACGGAGCTCGACGTTGGACCGTTACTCTTGAGAGGACGTGTGAG 
AGGCTGATTTTCTCCACCTCCGTCCCTGCCTTGCCCAACAATGACAATCCCGGAGTTGTG 
CAAACAATACGAGGCAGAAATAGCGTAATGCATTTGGGAGAAAGAATGTTGAGGAACTTT 
GCATGGATGATGAAAATGGTTAACAAACTCGACTTCTCGCCACAGTCTGAAACTAACAAC 
AGCGGAATTAGGATTGGGGTGCGGATAAACAATGAGGCGGGTCAACCGCCCGGTCTCATT 
GTCTGTGCTGGTTCATCTTTATCCCTCCCTCTCCCTCCTGTCCAAGTGTACGATTTCCTT 
AAGAATCTGGAGGTTCGTCACCAGTGGGACGTTCTGTGCCATGGGAATCCAGCGACTGAG 
GCTGCTCGTTTCGTCACCGGATCAAACCCAAGGAACACTGTGTCTTTTCTCGAGCCTTCA 
ATTAGGGATATTAATACTAAGCTAATGATACTCCAAGATAGCTTCAAAGATGCATTGGGA 
GGAATGGTGGCCTACGCTCCAATGGATCTAAACACCGCCTGCGCTGCCATTTCAGGCGAT 
ATCGATCCTACCACCATTCCAATCCTCCCTTCCGGTTTTATGATCTCCCGTGACGGCCGT 
CCXTCCGAGGGCGAAGCCGAGGGTGGCAGCTATACACTCCTCACCGTGGCTTTCCAGATC 
CTTGTCTCCGGTCCQAGTTACTCTCCTGATACCAACCTGGAAGTTTCTGCCACCACAGTC 
AATACCTTGATTAGCTCCACCGTTCAAAGGATCAAAGCCATGCTCAAGTGCGAATGA 
>G436 Amino Acid Sequence (domain in AA coordinates: 22-85) 
MDFTRDDNSSDERENDVDANTNNIUffi 

EELNLKPKQIKFWFQNKRTQAKSHNEKADNAALRAENIKIRRENESMEDALNNVVCPPCG

GRGPGREDQLRHLQKLRAQNAYLKDEYERVSNYLKQYGGHSMHNVEATPYLHGPSNHAST

SKNRPALYGTSSNRLPEPSSIFRGPYTRGNMNTTAPPQPRKPLEMQNFQPLSQLEKIAML

EAAEKAVSEVLSLIQMDDTMWKKSSIDDRLVIDPGLYEKYFTKTNTNGRPESSKDVVVVQ
MDAGNLIDIFLTAEKWARLFPTIVNEAKTIHVLDSVDHRGKTFSRVIYEQLHILSPLVPP

REFMILRTCQQIEDNVWMIADVSCHLPNIEFDLSFPICTKRPSGVLIQALPHGFSKVTWI
EHVVVNDNRVRPHKLYRDLLYGGFGYGARRWTVTLERTCERLIFSTSVPALPNNDNPGVV
QTIRGRNSVMHLGERMLRNFAWMMKMVNKLDFSPQSETNNSGIRIGVRINNEAGQPPGLI
VCAGSSLSLPLPPVQVYDFLKNLEVRHQWDVLCHGNPATEAARFVTGSNPRNTVSFLEPS 
IRDINTKLMILQDSFKDALGGMVAYAPMDLNTACAAISGDIDPTTIPILPSGFMISRDGR 
PSEGEAEGGSYTLLTVAFQILVSGPSYSPDTNLEVSATTVNTLISSTVQRIKAMLKCE*
>G556 (50..1144)

CTTTTTTGAAGCCCTTTTGACACAAAAGACCAGAACAAGTTGAAGAAATATGAATACAAC 

CTCGACACATTTTGTTCCACCGAGAAGGTTTGAAGTTTACGAGCCTCTCAACCAAATCGG 

TATGTGGGAAGAAAGTTTCAAGAACAATGGAGACATGTATACGCCTGGCTCTATCATAAT 

CCCGACTAACGAAAAACCAGACAGCTTGTCAGAGGATACTTCTCATGGGACAGAAGGAAC 

TCCTCACAAGTTTGACCAAGAGGCTTCCACATCTAGACATCCTGATAAGATACAGAGAAG 

GCTAGCACAGAATCGAGAGGCAGCTAGGAAAAGTCGTTTGCGCAAGAAAGCTTATGTTCA 

GCAGCTAGAGACTAGCCGGTTAAAGCTAATTCATTTAGAGCAAGAACTCGATCGTGCTAG 

ACAACAGGGTTTCTATGTGGGGAACGGAGTAGATACCAATGCTCTTAGTTTCTCAGATAA 

CATGAGCTCAGGGATTGTTGCATTTGAGATGGAATATGGACATTGGGTGGAAGAACAGAA 

CAGGCAAATATGTGAACTAAGAACGGTTTTACATGGACAAGTTAGTGATATAGAGCTTCG 

TTCTCTAGTCGAGAATGCCATGAAACATTACTTTCAACTCTTCCGAATGAAGTCAGCCGC 

TGCAAAAATCGATGTTTTCTATGTCATGTCCGGAATGTGGAAAACTTCAGCAGAGCGGTT 

TTTCTTGTGGATAGGCGGATTTAGACCCTCAGAGCTTCTCAAGGTTCTGTTACCGCATTT 

TGATCCTTTGACGGATCAACAACTTTTGGATGTATGTAATCTGAGGCAATCATGTCAACA

ATCAGAAGATGCGTTATCCCAAGGTATGGAGAAACTGCAACATACATTAGCAGAGAGTGT 

AGCAGCCGGGAAACTTGGTGAAGGAAGTTATATTCCTCAAATGACTTGTGCTATGGAGAG 

ATTGGAGGCTTTGGTCAGCTTTGTAAATCAAGCTGATCATCTGAGACATGAGACATTGCA 

ACAGATGCATCGGATCTTAACCACGCGACAAGCGGCTAGAGGTTTGTTAGCATTAGGGGA 

GTATTTCCAAAGGCTTCGAGCTTTGAGTTCGAGTTGGGCGGCTAGGCAACGTGAACCAAC 

GTAATTAAGGTGTTTAGATGTCAAGAAAGGTTTGAGACCTTAACAATCAAGAATGGAGTT 

TGCTGGTGAGTGGATTTTTGGGTCAAGAACAAGAGCAATAACACAAGCTGCTGTGTGATG 

ATGAATCTTGTCTTGCGGCTAAAGGAAATGTTTGAGGAAAGTTGTACATATGATCAGCAA 

CGTAAAGTTTATAGCTTTTTAGAAACCAACTTTTCGATGGTTGTTCTTTTTTTTTTGTAT 

GTAATATTATAGATAAGCTTGTGGTATATATGATTTTAATGTGACATTACGAACTTGATT 

TATAACCATGGTAAAAT 

>G556 Amino Acid Sequence (domain in AA coordinates: 83-143) 
MNTTSTHFVPPRRFEVYEPLNQIGMWEESFKNNGDMYTPGSIIIPTNEKPDSLSEDTSHG
TEGTPHKFDQEASTSRHPDKIQRRLAQNREAARKSRLRKKAYVQQLETSRLKLIHLEQEL 
DRARQQGFYVGNGVDTNALSFSDNMSSGIVAFEMEYGHWVEEQNRQICELRTVLHGQVSD 
IELRSLVENAMKHYFQLFRMKSAAAKIDVFYVMSGMWKTSAERFFLWIGGFRPSELLKVL
LPHFDPLTDQQLLDVCNLRQSCQQSEDALSQGMEKLQHTLAESVAAGKLGEGSYIPQMTC
AMERLEALVSFVNQADHLRHETLQQMHRILTTRQAARGLLALGEYFQRLRALSSSWAARQ
REPT* 

>G1420 (39..1238)

AAAGTATCATCTCATAGATTCCATCTTTTCTCTATTACATGGAGAAGAAAAAAGAAGAGG 
ATCATCATCATCAACAACAACAACAACAACAAAAGGAGATCAAGAACACAGAGACAAAGA 
TCGAGCAAGAACAAGAACAAGAACAAAAACAAGAAATCTCTCAAGCATCATCATCATCAA 
ACATGGCGAATCTAGTTACGTCATCAGATCATCATCCGTTGGAGCTAGCTGGAAATCTCT 
CAAGCATCTTCGATACTTCATCTTTACCTTTTCCTTATTCTTATTTCGAAGATCACTCTT 
CTAATAATCCTAATTCTTTCCTAGACTTGCTCCGACAAGATCATCAGTTTGCTTCTTCGT
CTAATTCCTCTTCTTTTTCATTCGATGCCTTTCCTCTCCCCAATAACAACAACAACACCT 
CTTTTTTTACGGATTTGCCCTTACCTCAAGCTGAGTCATCAGAAGTCGTGAACACAACAC 
CGACTTCTCCAAACTCAACCTC^GTCTCATCTTCCTCCAACGAAGCTGCAAATGATAACA 
ACAGTGGTAAAGAAGTTACTGTTAAAGATCAAGAAGAAGGAGATCAACAACAAGAGCAAA 
AGGGTACTAAGCCACAGTTGAAGGCAAAGAAGAAGAATCAAAAGAAAGCTAGAGAAGCTA
GGTTTGCGTTTCTGACGAAGAGCGATATTGATAATCTTGACGACGGTTATAGGTGGAGAA 
AATACGGCCAAAAAGCTGTCAAAAACAGTCCTTATCCCAGAAGCTATTACCGTTGCACCA 
CAGTGGGTTGCGGAGTGAAGAAGAGAGTGGAGAGATCCTCCGATGATCCTTCGATCGTCA 
TGACAACCTACGAAGGTCAGCATACCCATCCTTTCCCCATGACGCCACGTGGACACATCG 
GAATGCTCACGTCACCAATCCTAGACCACGGTGCAACCACCGCGTCATCATCATCATTCT 
CCATCCCTCAGCCACGTTACTTGCTGACTCAACATCACCAGCCCTACAACATGTACAACA 
ACAACTCTCTAAGTATGATCAATAGAAGATCATCCGATGGCACTTTCGTAAATCCAGGTC

CATCATCATCATTCCCCGGCTTTGGTTATGATATGTCTCAAGCTTCTACTTCAACTTCTT 

CTTCCATTAGAGATCATGGATTGCTTCAAGATATTCTTCCTTCGCAGATCAGATCCGATA 

CTATTAACACTCAAACCAATGAAGAGAATAAGAAATGAAGAAGTTTTTTTTCCCGGGGCA 

ATTGTTTTTTTCTTTAGGCCGGATCCGGTAGGTAGGTTTCATGAGC 

>G1420 Amino Acid Sequence (domain in AA coordinates: 221-280) 

MEKKKEEDHHHQQQQQQQKEIKNTETKIEQEQEQEQKQEISQASSSSNMANLVTSSDHHP

LELAGNLSSIFDTSSLPFPYSYFEDHSSNNPNSFLDLLRQDHQFASSSNSSSFSFDAFPL 

PNNNNNTSFFTDLPLPQAESSEVVNTTPTSPNSTSVSSSSNEAANDNNSGKEVTVKDQEE

GDQQQEQKGTKPQLKAKKKNQKKAREARFAFLTKSDIDNLDDGYRWRKYGQKAVKNSPYP

RSYYRCTTVGCGVKKRVERSSDDPSIVMTTYEGQHTHPFPMTPRGHIGMLTSPILDHGAT 

TASSSSFSIPQPRYLLTQHHQPYNMYNNNSLSMINRRSSDGTFVNPGPSSSFPGFGYDMS 

QASTSTSSSIRDHGLLQDILPSQIRSDTINTQTNEENKK* 

>G1412 (115..1008)

CCCACGCGTCCGCCCACGCGTCCGAAACAAAAACATATAATTTGGGTTTTTAGAGTTCGA 

AACTTGAAATCTTTTTTTTTTTGGTTGCTGAGGAATCGAAGTAGAAGAGTATAAATGGGT 

GTTAGAGAGAAAGATCCGTTAGCCCAGTTGAGTTTGCCACCAGGTTTTAGATTTTATCCG 

ACAGATGAAGAGCTTCTTGTTCAGTATCTATGTCGGAAAGTTGCAGGCTATCATTTCTCT 

CTCCAGGTCATCGGAGACATCGATCTCTACAAGTTCGATCCTTGGGATTTGCCAAGTAAG 

GCTTTGTTTGGAGAGAAGGAATGGTATTTCTTTAGCCCAAGAGATCGGAAATATCCGAAC 

GGGTCAAGACCCAATAGAGTAGCCGGGTCGGGTTATTGGAAAGCAACGGGTACTGACAAA 

ATTATCACGGCGGATGGTCGTCGTGTCGGGATTAAAAAAGCTCTGGTCTTTTACGCCGGA 

AAAGCTCCCAAAGGCACTAAAACCAACTGGATTATGCACGAGTATCGCTTAATAGAACAT 

TCTCGTAGCCATGGAAGCTCCAAGTTGGATGATTGGGTGTTGTGTCGAATTTACAAGAAA 

ACATCTGGATCTCAGAGACAAGCTGTTACTCCTGTTCAAGCTTGTCGTGAAGAGCATAGC 

ACGAATGGGTCGTCATCGTCTTCTTCATCACAGCTTGACGACGTTCTTGATTCGTTCCCG 

GAGATAAAAGACCAGTCTTTTAATCTTCCTCGGATGAATTCGCTCAGGACGATTCTTAAC 

GGGAACTTTGATTGGGCTAGCTTGGCAGGTCTTAATCCAATTCCAGAGCTAGCTCCGACC 

AATGGATTACCGAGTTACGGTGGTTACGATGCGTTTCGAGCGGCGGAAGGTGAGGCGGAG 

AGTGGGCATGTGAATCGGCAGCAGAACTCGAGCGGGTTGACTCAGAGTTTCGGGTACAGC 

TCX3AGTGGGTTTGGTGTTTCGGGTCAAACATTCGAGTTTAGGCAATGAGAGAGATGTGAA 

GTTACTGATGGGTGAAAAAAGTAAAAAAAAAACTTGGAGATAGTAGAGTGGCAATTGATG 

TAAATAATAGGGATTTATATGGGGCTTTTACCGATTCGGTGAGGCTTAGGATTCCCCAAA 

GGAAAAAGGCTCGACTGGGGACTAGTTTGATCCAACTTGACGGCCCCCAAATGTGTAATG 

TTTCTCAACGGAGAGAAAAATAAATGGTTACCAATATTTTTCCAAAAAAAAAAAAAAAAA 

>G1412 Amino Acid Sequence (domain in AA coordinates: 17-159) 

MGVREKDPLAQLSLPPGFRFYPTDEELLVQYLCRKVAGYHFSLQVIGDIDLYKFDPWDLP 

SKALFGEKEWYFFSPRDRKYPNGSRPNRVAGSGYWKATGTDKIITADGRRVGIKKALVFY 

AGKAPKGTKTNWIMHEYRLIEHSRSHGSSKLDDWVLCRIYKKTSGSQRQAVTPVQACREE

HSTNGSSSSSSSQLDDVLDSFPEIKDQSFNLPRMNSLRTILNGNFDWASLAGLNPIPELA

PTNGLPSYGGYDAFRAAEGEAESGHVNRQQNSSGLTQSFGYSSSGFGVSGQTFEFRQ* 

>G738 (1..885) 

ATGGACCATCATCAGTATCATCATCATGATCAATACCAACATCAGATGATGACTAGTACT 

AACAATAATTCCTATAACACCATCGTCACAACACAACCACCACCAACAACAACAACAATG 

GATTCAACAACAGCAACAACTATGATAATGGATGACGAGAAGAAGTTGATGACGACAATG 

AGCACTAGGCCGC^GAACCAAGAAACTGTCCAAGATGCAACTCAAGCAACACCAAGTTT 

TGTTATTACAACAACTACAGCTTAGCACAGCCTAGGTACTTGTGTAAGTCTTGTCGGAGA 

TATTGGACTGAAGGTGGCTCTCTCCGTAACGTCCCCGTAGGCGGAGGTTCTAGAAAGAAC 

AAGAAGCTTCCATTTCCTAATTCCTCTACTTCTTCTTCCACCAAGAACCTCCCGGATCTC 

AACCCTCCTTTCGTCTTCACATCATCAGCTTCATCATCTUUVCCCTAGC^GACGCATCAA 

AACAATAATGACCTCT^GCCTATCCTTCTCCTCCCCTATGCAAGACAAGCGAGCTCAAGGG 

CATTACGGTCATTTCAGTGAGCAAGTTGTGACAGGAGGGC^GAACTGTCTTTTCC^GCT 

CCTATGGGAATGATTCAGTTTCGTCAAGAGTATGATCATGAGCACCCCAAAAAGAATCTT 

GGGTTTTCATTAGACAGGAACGAGGAAGAGATTGGTAATCATGATAACTTCGTTGTTA^ 

GAGGAAGGAAGTAAGATGATGTATCCTTATGGAGATCATGAAGACCGTCAACAACATCAC 

CATGTGAGACACGATGATGGTAATAAGAAGAGAGAAGGTGGTTCAAGCAATGAGCTATGG 

AGCGGAATCATCCTAGGTGGTGATAGTGGTGGACCAACATGGTGA 
>G738 Amino Acid Sequence (domain in aa coordinates: 351-393)

MDHHQYHHHDQYQHQMMTSTNNNSYNTIVTTQPPPTTTTMDSTTATTMIMDDEKKLMTTM

STRPQEPRNCPRCNSSNTKFCYYNNYSLAQPRYLCKSCRRYWTEGGSLRNVPVGGGSRKN 

KKLPFPNSSTSSSTKNLPDLNPPFVFTSSASSSNPSKTHQNNNDLSLSFSSPMQDKRAQG 

HYGHFSEQVVTGGQNCLFQAPMGMIQFRQEYDHEHPKKNLGFSLDRNEEEIGNHDNFVVN

EEGSKMMYPYGDHEDRQQHHHVRHDDGNKKREGGSSNELWSGIILGGDSGGPTW* 

>G2426 (1..1038) 

ATGGGCAGATCGCCATGTTGTGATAAGGCCGGGTTGAAGAAAGGGCCTTGGACTCCAGAA 

GAGGATCAGAAACTTTTGGCTTATATTGAAGAACATGGCCATGGAAGCTGGCGTTCTTTG 

CCTGAGAAAGCCGGTCTCCAAAGGTGTGGAAAGAGTTGCAGACTCAGATGGACTAACTAC 

CTAAGACCTGACATCAAGAGAGGCAAATTCACTGTACAAGAAGAACAAACCATCATTCAA 

CTCCACGCTCTCCTCGGAAACAGGTGGTCAGCGATTGCAACTCATTTACCAAAGAGGACA 

GACAACGAGATCAAGAACTACTGGAACACACACTTGAAGAAACGTCTGATCAAAATGGGG

ATAGATCCAGTGACTCACAAGCACAAAAACGAGACTCTTTCGTCTTCCACAGGACAATCA 

AAGAACGCAGCCACGCTTAGTCATATGGCTCAATGGGAGAGTGCAAGACTCGACGCTGAA 

GCAAGGCTAGCTAGAGAATCAAAGCTTCTCCATTTACAGCATTACCAAAACAATAACAAC 

CTTAACAAATCAGCAGCTCCTCAACAACATTGCTTCACTCAAAAAACATCAACAAACTGG 

ACTAAACCAAACCAAGGAAACGGAGACCAACAGCTTGAATCTCCGACATCGACGGTGACA 

TTCTCTGAGAATCTTCTGATGCCTTTAGGAATCCCTACGGATAGCAGCAGAAATAGAAAC 

AATAACAACAATGAGTCCTCGGCGATGATTGAATTGGCCGTATCTTCGTCAACCTCCTCC 

GATGTGAGTCTGGTCAAAGAACATGAACACGACTGGATTAGGCAGATCAACTGTGGTAGT 

GGAGGAATAGGAGAAGGATTCACGAGTCTATTGATCGGTGATTCGGTCGGCCGGGGTTTA 

CCCACCGGGAAAAACGAAGCGACGGCGGGCGTGGGGAATGAGAGTGAGTATAACTACTAT 

GAGGATAACAAGAATTACTGGAATAGCATTCTCAACTTGGTTGATTCTTCACCGTCCGAT 

TCCGCGACGATGTTCTGA 

>G2426 Amino Acid Sequence (conserved domain in AA coordinates: 14-114)
MGRSPCCDKAGLKKGPWTPEEDQKLLAYIEEHGHGSWRSLPEKAGLQRCGKSCRLRWTNY
LRPDIKRGKFTVQEEQTIIQLHALLGNRWSAIATHLPKRTDNEIKNYWNTHLKKRLIKMG
IDPVTHKHKNETLSSSTGQSKNAATLSHMAQWESARLDAEARLARESKLLHLQHYQNNNN
LNKSAAPQQHCFTQKTSTNWTKPNQGNGDQQLESPTSTVTFSENLLMPLGIPTDSSRNRN
NNNNESSAMIELAVSSSTSSDVSLVKEHEHDWIRQINCGSGGIGEGFTSLLIGDSVGRGL
PTGKNEATAGVGNESEYNYYEDNKNYWNSILNLVDSSPSDSATMF*
>G1524 (1..825) 

ATGGGGAGAACTAAGGAGCAGGCAACATTAACTCGGTATCCACCCTGTCCTAGGAATCCT 
GCTAAATTCAATGATATAAACAAAGCACTCCAGGAAAAAGGATATGGTAAGGCTCTGAAA 
^GAAAACCTTGGACGGGTGTGACATGCCCTGTCTGTCTTGAGGTTCCTCACAACTCGGTC 
GTCCTCCTTTGTTCATCTTACCACAAAGGATGCCGTCCGTACATGTGTGCCACGGGAAAC 
CGTTTCTCAAATTGTCTAGAGCAGTACAAAAAGGCATATGCCAAGGATGAGAAAAGTGAC 
AAACCGCCAGAGCTATTGTGCCCGCTTTGTAGGGGTCAGGTGAAAGGCTGGACCGTTGTG 
GAAAAGGAACGTAAGTATCTGAATTCTAAGAAAAGGTCATGCATGAACGACGAGTGTTTG 
TTTTATGGAAGCTATAGACAGCTCAAGAAGCATGTTAAGGAGAACCATCCGAGAGCCAAG 
CCAAGAGCCATAGACCCTGTGCTGGAGGCGAAATGGAAGAAGCTTGAGGTTGAGAGGGAG 
AGGAGTGATGTAATCAGCACAGTCATGTCGTCAACACCTGGGGCTATGGTATTTGGAGAC 
TATGTGATTGAGCCATACAATGGTTATGATCATCAAGATGACAGTGACGATTACAGTGAT 
TCGTCGGATGACGAAATGGAAGGTGGGGTATTCGAGCTTGGAGCATTCGACCTGGGCCGT 
CTTCAACCGCGTTCGGCTGCCATCTCAAGCCGGGGAATTCGCGGTATGATCATAAGGAAC 
CGGTGGGCTCGAAGCAGAGGTGCGAGCAGAAGGCGACAAACATAA 

>G1524 Amino Acid Sequence (conserved domain in AA coordinates: 49-110)
MGRTKEQATLTRYPPCPRNPAKFNDINKALQEKGYG 

VLLCSSYHKGCRPYMCATGNRFSNCLEQYKKAYAKDEKSDKPPELLCPLCRGQVKGWTVV
EKERKYLNSKKRSCMNDECLFYGSYRQLKKHVKENHPRAKPRAIDPVLEAKWKKLEVERE
RSDVISTVMSSTPGAMVFGDYVIEPYNGYDHQDDSDDYSDSSDDEMEGGVFELGAFDLGR
LQPRSAAISSRGIRGMIIRNRWARSRGASRRRQT*
>G1243 (1..3174) 

ATGGCGAGAAATTCGAATTCCGATGAGGCTTTCTCGTCAGAGGAGGAAGAAGAGCGGGTT 
AAGGATAATGAAGAAGAAGATGAGGAGGAGCTCGAGGCTGTTGCTCGTTCTTCTGGCTCC 
GACGATGACGAAGTAGCCGCCGCCGACGAATCACCAGTCTCCGACGGAGAGGCTGCTCCC 
GTAGAAGATGATTACGAGGACGAAGAAGATGAGGAAAAAGCTGAAATCAGCAAACGTGAG

AAAGCCAGACTTAAAGAGATGCAGAAGTTGAAGAAGCAGAAGATTCAAGAGATGCTGGAG 

TCGCAGAATGCTTCCATTGACGCGGATATGAACAATAAGGGAAAAGGGAGACTGAAGTAT 

CTTCTGCAGCAAACTGAGTTATTTGCCCACTTTGCTAAAAGTGATGGATCTTCTTCTCAG 

AAGAAGGCAAAAGGAAGGGGACGTCATGCTTCCAAAATAACTGAAGAGGAGGAAGACGAA 

GAGTATCTAAAGGAAGAAGAGGATGGCTTAACTGGATCTGGAAACACACGGTTACTCACA 

CAGCCCTCTTGTATTCAAGGGAAGATGAGAGATTACCAATTAGCTGGTTTGAACTGGCTC 

ATTCGTCTTTATGAGAATGGCATAAATGGAATTCTTGCTGATGAAATGGGTCTGGGGAAG 

ACGCTTCAAACGATTTCTTTGTTGGCATATCTTCATGAATACAGGGGAATCAATGGTCCC 

CATATGGTGGTTGCTCCAAAATCAACACTTGGTAATTGGATGAACGAAATTCGCCGGTTT 

TGTCCTGTCCTACGTGCTGTGAAGTTCCTTGGTAATCCTGAGGAGAGGAGACATATTCGA 

GAAGACCTGCTAGTTGCTGGGAAATTTGATATTTGTGTCACAAGCTTTGAGATGGCCATC 

AAAGAGAAGACAGCACTTCGTCGGTTTAGCTGGCGTTATATTATCATTGATGAAGCGCAT 

CGAATCAAGAACGAGAATTCACTCCTTTCTAAAACCATGAGACTTTTTAGCACCAATTAT 

CGGCTTCTTATCACGGGGACCCCCCTTCAGAATAATCTCCATGAACTGTGGGCTCTTCTA 

AATTTTCTTCTGCCTGAGATTTTTAGTTCAGCAGAGACTTTTGATGAATGGTTTCAAATT 

TCTGGTGAGAATGACCAGCAAGAAGTTGTGCAACAACTGCACAAGGTTCTTCGACCATTT 

CTTCTTCGAAGACTAAAGTCAGATGTTGAGAAAGGTTTGCCACCGAAGAAGGAGACCATA 

CTTAA^GTTGGTATGTCTCAGATGCAAAAGCAATACTACAAGGCTTTACTGCAGAAGGAT 

CTTGAAGCGGTTAATGCTGGTGGAGAACGCAAACGTCTGCTAAACATTGCAATGCAACTG 

CGTAAATGCTGCAATCACCCCTATCTCTTCCAGGGTGCAGAACCTGGTCCCCCATATACC 

ACAGGAGATCACCTTATAACAAATGCTGGTAAGATGGTTCTCTTGGATAAATTGCTTCCT 

AAGTTGAAAGAACGTGATTCAAGGGTGCTGATATTTTCTCAGATGACAAGACTTTTGGAT 

ATTCTTGAGGACTATTTAATGTATCGTGGTTACTTGTATTGCCGTATTGATGGAAACACT 

GGTGGTGACGAACGAGATGCCTCCATAGAAGCCTACAACAAGCCAGGAAGTGAGAAATTT 

GTTTTCTTGTTATCTACTAGAGCTGGAGGGCTTGGTATCAATCTTGCTACTGCAGATGTT 

GTGATCCTTTACGATAGTGATTGGAACCCACAAGTCGACTTGCAAGCTCAGGATCGTGCC 

CATAGGATTGGTCAAAAAAAAGAAGTTCAAGTGTTTCGATTCTGCACTGAGTCTGCTATT 

GAGGAGAAAGTGATTGAAAGAGCTTACAAGAAGTTAGCACTTGATGCTCTGGTTATTCAA 

CAAGGGAGATTGGCAGAACAGAAAAGTAAGTCTGTCAATAAGGATGAGTTGCTTCAAATG 

GTAAGATATGGTGCTGAGATGGTGTTCAGTTCTAAAGATAGCACAATCACAGACGAGGAT 

ATTGATAGAATCATTGCCAAAGGAGAAGAGGCAACAGCTGAACTTGATGCTAAGATGAAG 

AAATTCACAGAAGATGCTATACAGTTTAAAATGGATGACAGTGCTGACTTCTATGATTTT 

GATGATGACAATAAGGATGAAAACAAGCTCGATTTTAAAAAGATTGTAAGCGACAATTGG 

AATGATCCCCCCAAGCGGGAGAGAAAGCGCAACTACTCTGAATCTGAGTACTTTAAGCAA 

ACATTGCGGCAAGGTGCTCCAGCTAAACCTAAAGAGCCTAGAATTCCGCGCATGCCCCAG 

TTGCACGATTTCCAGTTCTTTAACATTCAGAGATTGACCGAGTTGTATGAAAAGGAAGTA 

CGTTATCTCATGCAAACACATCAGAAAAATCAGTTGAAAGACACAATTGATGTTGAAGAA 

CCAGAAGGTGGGGATCCCTTAACTACTGAAGAAGTAGAAGAAAAGGAGGGATTATTGGAG 

GAGGGTTTCTCAACATGGAGCAGAAGAGATTTTAATACTTTCCTCAGGGCTTGTGAGAAG 

TATGGCCGCAACGACATAAAAAGCATTGCCTCTGAGATGGAAGGGAAAACAGAGGAAGAA 

GTTGAAAGATATGCCAAAGTATTTAAAGAGCGGTACAAGGAGCTGAACGACTATGATAGA 

ATCATTAAGAACATTGAGAGGGGAGAGGCAAGGATCTCTAGGAAAGACGAAATCATGAAG 

GCCATAGGGAAGAAACTGGATCGCTACAGAAACCCTTGGCTGGAACTGAAGATTCAATAT 

GGTCAGAACAAAGGCAAGCTGTACAATGAAGAGTGTGACCGTTTCATGATCTGCATGATT 

CACAAACTTGGTTATGGGAATTGGGATGAGCTAAAGGCAGCATTTAGGACATCGTCTGTG 

TTCAGGTTTGACTGGTTTGTGAAATCCCGCACGAGTCAGGAACTTGCAAGAAGATGCGAC 

ACTCTGATTCGACTGATCGAGAAAGAGAACCAGGAGTTTGATGAAAGAGAGAGGCAAGCC 

CGCAAAGAGAAGAAGCTCGCGAAGAGTGCAACACCATCAAAGCGACCTTTAGGAAGACAA 

GCAAGTGAGAGTCCTTCATCGACGAAGAAGCGGAAGCACCTGTCGATGAGATGA 

>G1243 Amino Acid Sequence (domain in AA coordinates: 216-609) 

MARNSNSDEAFSSEEEEERVKDNEEEDEEELEAVARSSGSDDDEVAAADESPVSDGEAAP

VEDDYEDEEDEEKAEISKREKARLKEMQKLKKQKIQEMLESQNASIDADMNNKGKGRLKY

LLQQTELFAHFAKSDGSSSQKKAKGRGRHASKITEEEEDEEYLKEEEDGLTGSGNTRLLT

QPSCIQGKMRDYQLAGLNWLIRLYENGINGILADEMGLGKTLQTISLLAYLHEYRGINGP

HMVVAPKSTLGNWMNEIRRFCPVLRAVKFLGNPEERRHIREDLLVAGKFDICVTSFEMAI

KEKTALRRFSWRYIIIDEAHRIKNENSLLSKTMRLFSTNYRLLITGTPLQNNLHELWALL
NFLLPEIFSSAETFDEWFQISGENDQQEVVQQLHKVLRPFLLRRLKSDVEKGLPPKKETI

LKVGMSQMQKQYYKALLQKDLEAVNAGGERKRLLNIAMQLRKCCNHPYLFQGAEPGPPYT

TGDHLITNAGKMVLLDKLLPKLKERDSRVLIFSQMTRLLDILEDYLMYRGYLYCRIDGNT

GGDERDASIEAYNKPGSEKFVFLLSTRAGGLGINLATADVVILYDSDWNPQVDLQAQDRA

HRIGQKKEVQVFRFCTESAIEEKVIERAYKKLALDALVIQQGRLAEQKSKSVNKDELLQM

VRYGAEMVFSSKDSTITDEDIDRIIAKGEEATAELDAKMKKFTEDAIQFKMDDSADFYDF

DDDNKDENKLDFKKIVSDNWNDPPKRERKRNYSESEYFKQTLRQGAPAKPKEPRIPRMPQ 

LHDFQFFNIQRLTELYEKEVRYLMQTHQKNQLKDTIDVEEPEGGDPLTTEEVEEKEGLLE 

EGFSTWSRRDFNTFLRACEKYGRNDIKSIASEMEGKTEEEVERYAKVFKERYKELNDYDR 

IIKNIERGEARISRKDEIMKAIGKKLDRYRNPWLELKIQYGQNKGKLYNEECDRFMICMI 

HKLGYGNWDELKAAFRTSSVFRFDWFVKSRTSQELARRCDTLIRLIEKENQEFDERERQA

RKEKKLAKSATPSKRPLGRQASESPSSTKKRKHLSMR* 

>G631 (190..1461)

CTTCTTCTTCTTCTTCTTCTTCTTCTTCCTCCTCTCTCGTCGGATCTCTCTGATTTAGTG 

ATTTTTCAAATTTCAAGTTTTCTTCACCTTTAATTTTGTGTCTCGTTGATCTCTCTTTGG 

ACATTCTGCTTTGGATTCTGGAGGCTTCTCATTAGATCTCTATTAGTGGGTTTAGGTCAA 

GTTCTTGAAATGGATAAGGAGAAATCTCCTGCACCACCACCTAGTGGAGGTCTTCCTCCA 

CCATCGGGTCGTTACTCTGCGTTTTCACCTAATGGAAGTAGCTTTGCAATGAAAGCTGAA 

TCATCTTTTCCTCCTTTGACTCCAAGTGGAAGCAATAGCTCAGATGCTAACCGATTCAGC 

CATGATATTAGCCGAATGCCGGATAATCCACCTAAGAACCTAGGCCATCGCCGAGCTCAT 

TCAGAGATTCTTACTCTTCCTGATGACTTAAGCTTTGATAGTGATCTTGGTGTGGTTGGT 

GCTGCTGATGGACCTTCTTTCTCTGATGATACTGACGAGGACTTACTCTATATGTATCTT 

GATATGGAAAAATTCAATTCTTCTGCTACATCGACTTCTCAAATGGGTGAGCCATCAGAA 

CCGACTTGGAGGAATGAATTAGCCTCGACTTCTAACCTTCAGAGTACACCCGGTAGCTCT 

AGTGAAAGACCGAGAATTAGACACCAACACAGCCAATCGATGGATGGTTCAACAACTATC 

AAGCCTGAGATGCTTATGTCAGGGAATGAAGATGTGTCTGGAGTTGACTCTAAGAAAGCC 

ATCTCTGCTGCTAAACTTTCTGAGCTTGCTCTCATTGATCCAAAACGCGCCAAGAGGATA 

TGGGCAAACAGGCAGTCTGCTGCGAGGTCAAAAGAAAGGAAGATGAGATACATTGCAGAG 

CTCGAGAGAAAAGTACAGACTTTACAAACAGAGGCCACATCTCTCTCAGCCCAGTTGACT 

CTCTTACAGAGAGATACAAATGGCCTGGGTGTTGAAAACAATGAGCTTAAACTGCGAGTA 

CAAACTATGGAGCAACAGGTCCACCTACAGGATGCTTTAAATGATGCACTAAAGGAGGAA 

GTCCAGCATCTTAAGGTATTGACGGGGCAAGGTCCATCAAATGGTACATCAATGAACTAC 

GGTTCTTTTGGATCAAACCAGCAATTCTATCCCAATAATCAGTCGATGCACACTATCTTA 

GCCGCACAACAGTTACAGCAGCTCCAGATCCAGTCACAGAAACAGCAACAACAACAACAG 

CAACACCAGCAACAACAACAGCAGCAGCAGCAGCAATTTCACTTTCAACAGCAGCAACTG

TACCAGCTTCAGCAGCAGCAACGGCTTCAACAACAGGAACAACAAAGCGGGGCTTCAGAG

CTAAGAAGACCCATGCCTTCTCCTGGTCAGAAAGAGAGTGTGACATCGCCTGATCGTGAA 

ACTCCCTTGACAAAAGACTGAGTCTAGACTGTGCTAATGTCCAATTTAGTAAGTTACTCT 

TGGAAAATCTTCTTTTTC^TCGCAGGCTCATGGATTTGGGATTTACTGCATTATAGAGTT 

AAAAACAAGACAGCTTAGAAGTTGCGGATTTAGAAGTTGTTAGTGAAGCTTTTGTTCTCG 

TCTGTTGGTAGTTTACAATCTTCTCTTTGTATGATCCTAAG 

>G631 Amino Acid Sequence (domain in AA coordinates: TBD) 

MDKEKSPAPPPSGGLPPPSGRYSAFSPNGSSFAMKAESSFPPLTPSGSNSSDANRFSHDI 

SRMPDNPPKNLGHRRAHSEILTLPDDLSFDSDLGVVGAADGPSFSDDTDEDLLYMYLDME

KFNSSATSTSQMGEPSEPTWRNELASTSNLQSTPGSSSERPRIRHQHSQSMDGSTTIKPE

MLMSGNEDVSGVDSKKAISAAKLSELALIDPKRAKRIWANRQSAARSKERKMRYIAELER

KVQTLQTEATSLSAQLTLLQRDTNGLGVENNELKLRVQTMEQQVHLQDALNDALKEEVQH
LKVLTGQGPSNGTSMNYGSFGSNQQFYPNNQSMHTILAAQQLQQLQIQSQKQQQQQQQHQ 
QQQQQQQQQFHFQQQQLYQLQQQQRLQQQEQQSGASELRRPMPSPGQKESVTSPDRETPL 
TKD* 

>G1909 (1..828) 

ATGGGTGGATCGATGGCGGAGAGAGCAAGGCAGGCCAACATTCCTCCACTAGCGGGACCC 
CTAAAGTGTCCTCGATGCGACTCCAGCAACACTAAGTTCTGTTACTACAACAACTATAAC
CTCACTCAGCCTCGTCACTTCTGCAAAGGTTGCCGTCGCTACTGGACACAAGGGGGCGCC 
CTGAGAAACGTCCCTGTAGGTGGAGGCTGCCGGAGGAATAACAAGAAGGGCAAAAATGGA 
AATTTAAAATCTTCTTCTTCTTCGTCCAAACAGTCTTCCTCGGTCAACGCTCAAAGTCCT 
AGCTCAGGACAGCTAAGGACAAATCATCAGTTCCCTTTTTCACCAACTCTTTACAATCTC 
ACTCAACTCGGAGGTATTGGTTTGAACTTAGCCGCTACTAATGGCAACAACCAAGCTCAC
CAGATCGGTTCCAGTTTGATGATGAGCGATCTAGGGTTTCTCCATGGACGAAATACTTCA 
ACTCCGATGACGGGAAACATTCATGAAAACAACAACAATAATAACAATGAAAACAACCTA 
ATGGCATCCGTTGGATCTTTGAGCCCGTTTGCTCTCTTCGATCCAACGACGGGGCTATAC 
GCTTTCCAGAACGACGGTAATATCGGGAACAACGTTGGGATATCTGGTTCTTCTACTTCC 
ATGGTTGATTCTAGGGTTTATCAGACGCCTCCGGTGAAGATGGAAGAACAACCTAATTTG 
GCTAACTTGTCTAGACCGGTCTCCGGTTTGACGTCTCCTGGGAATCAAACAAATCAGTAC 
TTTTGGCCTGGTTCGGATTTCTCGGGTCCTTCTAATGATCTCTTGTGA 

>G1909 Amino Acid Sequence (conserved domain in AA coordinates: 23-51)

MGGSMAERARQANIPPLAGPLKCPRCDSSNTKFCYYNNYNLTQPRHFCKGCRRYWTQGGA 

LRNVPVGGGCRRNNKKGKNGNLKSSSSSSKQSSSVNAQSPSSGQLRTNHQFPFSPTLYNL 

TQLGGIGLNLAATNGNNQAHQIGSSLMMSDLGFLHGRNTSTPMTGNIHENNNNNNNENNL

MASVGSLSPFALFDPTTGLYAFQNDGNIGNNVGISGSSTSMVDSRVYQTPPVKMEEQPNL 

ANLSRPVSGLTSPGNQTNQYFWPGSDFSGPSNDLL* 

>G1663 (64..630)

TTCTCTCTGTGAATCCTTGTTCATCGTCACTGAAATTAGTTTACAAAATCGACGAATTCG 

GAGATGATTTTTCAGAATGTGTGCAGAAATGAGTCCAACTTCAACGCTATAGCTTCCGAA 

TCGCGTTCCCAAACGCAGTTCGGTGTTTCGAAATCCTCCTCGAGCGGCGGCGGATGTATC 

TCCGCCAGGACTAAAGACCGTCACACGAAGGTTAACGGACGAAGCCGTCGAGTTACGATG 

CCGGCTCTCGCCGCCGCTAGGATTTTCCAGTTAACGCGTGAGCTCGGTCACAAAACTGAA 

GGAGAAACCATCGAATGGCTTCTTAGTCAAGCTGAACCGTCGATTATTGCCGCCACTGGC 

TACGGGACTAAGCTCATTTCGAATTGGGTTGATGTTGCGGCGGACGATTCCTCGTCGTCG 

TCGTCGATGACGTCGCCGCAAACGCAAACGCAAACGCCACAATCGCCGAGTTGTAGGTTG 

GATCTTTGTCAGCCAATCGGAATTCAGTATCCGGTGAATGGTTACAGTCATATGCCGTTC 

ACAGCGATGCTTTTAGAGCCGATGACCACGACGGCGGAATCTGAGGTTGAGATCGCGGAG 

GAGGAGGAACGTAGACGCCGTCACCATTAGTAAAATTAGGCTTTTGATTTAGAGTGTTAA 

AATTAGGATTTTAAAAGTTTAGGAGGTAACAGATAAGGATAATT 

>G1663 Amino Acid Sequence (domain in AA coordinates: TBD) 

MIFQNVCRNESNFNAIASESRSQTQFGVSKSSSSGGGCISARTKDRHTKVNGRSRRVTMP 
ALAAARIFQLTRELGHKTEGETIEWLLSQAEPSIIAATGYGTKLISNWVDVAADDSSSSS

SMTSPQTQTQTPQSPSCRLDLCQPIGIQYPVNGYSHMPFTAMLLEPMTTTAESEVEIAEE 
EERRRRHH* 

>G1231 (103..870)

CAAACCCAAATTCTCTCAGCGCCGGTCAAATACTTGTCTCTCTCTCTCTCTCTCTTTCAC 
TCTTGTCTTGTCTCCTTCGAAGCTGTTTGTTCTGTAAGAAAGATGGAAGCAGGTGGCGCG 
TACAATCCACGCACTGTTGAAGAGGTGTTTAGGGATTTTAAGGGTCGTAGAGCTGGCATG 
ATTAAGGCTTTAACCACTGATGTTCAGGAGTTTTTCCGACTTTGTGATCCCGAAAAGGAG 
AACCTTTGCCTTTACGGACATCCAAATGAGCACTGGGAAGTGAATTTGCCAGCTGAAGAG 
GTTCCTCCTGAGCTCCCAGAGCCTGTCTTGGGTATCAATTTTGCCAGAGACGGGATGGCG 
GAAAAGGATTGGTTGTCCCTTGTTGCTGTCCACAGTGATGCTTGGCTTCTTGCTGTTGCT 
TTCTTTTTTGGAGCCAGGTTTGGATTTGACAAAGCTGATAGGAAGAGGCTTTTCAATATG 
GTGAATGACCTCCCAACAATCTTTGAGGTTGTAGCTGGCACTGCTAAGAAACAAGGAAAA 
GATAAGTCCTCTGTTTCCAACAACAGCAGCAACAGATCCAAATCAAGCTCCAAGCGAGGA 
TCTGAATCCCGTGCCAAGTTCTCAAAGCCGGAGCCCAAAGATGATGAGGAGGAGGAAGAG 
GAAGGTGTGGAAGAGGAGGATGAGGATGAGCAAGGTGAAACACAGTGTGGAGCATGTGGT 
GAGAGCTATGCAGCTGATGAGTTCTGGATTTGCTGTGACCTCTGTGAGATGTGGTTTCAT 
GGAAAGTGTGTTAAGATAACACCAGCAAGAGCTGAGCACATCAAGCAATACAAGTGCCCT 
TCTTGCAGCAACAAAAGGGCTCGTTCCTAAATTTGTTGACCGCTCGCTTCTGTGTATCTA 
CCTTTGCATATGATGATGAACAGCTTAACTGTTTGGTTTAGATCAGATTTGTCATATGGA 
TTTGGTAATTTAGGAAGACATTTTAGTTTTTTCATTGTTACATTTTGGCGATTGAAGGGA 
TAACTCTTTGTTTAGGGGTAATGATCTTTTGCTCTGTTTTATGTTTGTTTATTAACATTC 
TTCAAACTCAATCAAAAGTATTTTGGTTAGTCTTAAAA 

>G1231 Amino Acid Sequence (domain in AA coordinates: TBD) 

MEAGGAYNPRTVEEVFRDFKGRRAGMIKALTTDVQEFFRLCDPEKENLCLYGHPNEHWEV

NLPAEEVPPELPEPVLGINFARDGMAEKDWLSLVAVHSDAWLLAVAFFFGARFGFDKADR 

KRLFNMVNDLPTIFEVVAGTAKKQGKDKSSVSNNSSNRSKSSSKRGSESRAKFSKPEPKD

DEEEEEEGVEEEDEDEQGETQCGACGESYAADEFWICCDLCEMWFHGKCVKITPARAEHI
KQYKCPSCSNKRARS*
>G227 (21..983)

GTACCGTCGACGATCCGGCGATGTCAAACCCGACCCGTAAGAATATGGAGAGGATTAAAG 
GTCCATGGAGTCCAGAAGAAGATGATCTGTTGCAGAGGCTTGTTCAGAAACATGGTCCGA 
GGAACTGGTCTTTGATTAGCAAATCAATCCCTGGACGTTCCGGCAAATCTTGTCGTCTCC 
GGTGGTGTAACCAGCTATCTCCGGAGGTAGAGCACCGTGCTTTTTCGCAGGAAGAAGACG 
AGACGATTATTCGAGCTCACGCTCGGTTTGGTAACAAGTGGGCTACGATCTCTCGTCTTC 
TCAATGGACGAACCGATAACGCTATCAAGAATCATTGGAACTCGACGCTGAAGCGAAAAT 
GCAGCGTCGAAGGGCAAAGTTGTGATTTTGGTGGTAATGGAGGGTATGATGGTAATTTAG 
GAGAAGAGCAACCGTTGAAACGTACGGCGAGTGGTGGTGGTGGTGTCTCGACTGGCTTGT 
ATATGAGTCCCGGAAGTCCATCGGGATCTGACGTCAGCGAGCAATCTAGTGGTGGTGCAC 
ACGTGTTTAAACCAACGGTTAGATCTGAGGTTACAGCGTCATCGTCTGGTGAAGATCCTC 
CAACTTATCTTAGTTTGTCTCTTCCTTGGACTGACGAGACGGTTCGAGTCAACGAGCCGG 
TTCAACTTAACCAGAATACGGTTATGGACGGTGGTTATACGGCGGAGCTGTTTCCGGTTA 
GAAAGGAAGAGCAAGTGGAAGTAGAAGAAGAAGAAGCGAAGGGGATATCTGGTGGATTCG 
GTGGTGAGTTCATGACGGTGGTTCAGGAGATGATAAGGACGGAGGTGAGGAGTTACATGG 
CGGATTTACAGCGAGGAAACGTCGGTGGTAGTAGTTCTGGCGGCGGAGGTGGCGGTTCGT 
GTATGCCACAAAGTGTAAACAGCCGTCGTGTTGGGTTTAGAGAGTTTATAGTGAACCAAA 
TCGGAATTGGGAAGATGGAGTAGGCGGCC 

>G227 Amino Acid Sequence (domain in AA coordinates: 13-112) 
MSNPTRKNMERIKGPWSPEEDDLLQRLVQKHGPRNWSLISKSIPGRSGKSCRLRWCNQLS 
PEVEHRAFSQEEDETIIRAHARFGNKWATISRLLNGRTDNAIKNHWNSTLKRKCSVEGQS
CDFGGNGGYDGNLGEEQPLKRTASGGGGVSTGLYMSPGSPSGSDVSEQSSGGAHVFKPTV
RSEVTASSSGEDPPTYLSLSLPWTDETVRVNEPVQLNQNTVMDGGYTAELFPVRKEEQVE
VEEEEAKGISGGFGGEFMTVVQEMIRTEVRSYMADLQRGNVGGSSSGGGGGGSCMPQSVN
SRRVGFREFIVNQIGIGKME*
>G1842 (219..809)

ACTATTACATGCCTCTTCCTCGCTTCAAAACGGCACCGTTTCCACTTGTTATTATTTTTC 
TCTCTATCGTCTAACAAAAAAAAAAACTGACTTGGGATTTTTTTTCATTTGTCTAGCCCA 

aaagaagaagatagaaacgaagaaaaaaagcaaacacattttgggtccccggtggttagg 
atcaaattagggcacaaaccttatcggagaaagaagccatgggaagaagaaaagtcgaga 
tcaagcgaatcgagaacaaaagcagtcgacaagtcactttctccaaacgacgcaaaggtc 
tcatcgaaaaagctcgacaactttcaattctctgtgaatcttccatcgctgttgtcgccg 
tctccggttccggaaaactctacgactctgcctccggtgacaacatgtcaaagatcattg 
atcgttatgaaatacatcatgctgatgaacttaaagccttagatcttgcagaaaaaattc 
ggaattatcttccacacaaggagttactagaaatagtccaaagcaagcttgaagaatcaa 
atgtcgataatgtaagtgtagattctctaatatctatggaggaacagctcgagactgctc 
tgtcagtaattagagctaagaagacagaactaatgatggaggatatgaagtcacttcaag
aaagggagaagttgctgatagaagagaaccagattctggctagccaggtggggaagaaga 
cgtttctggttatagaaggtgacagaggaatgtcacgggaaaatggctccggcaacaaag 
taccggagactctttcgctgctcaagtaatcac(^tcatcaacggctgagctttcaccat 
aaacttactcacagcctgattcagaagcttttacaaaattgtaaattataaaaagctgca 
tmtaatctcaacctttttatct^ 

gaagctcttttcttttatgcgaaagaattgtaaaactaagataaagctaccgatctttgt 
tgtaccttagtagacaaatatcagagttcttgtgcttgt 

>G1842 Amino Acid Sequence (domain in AA coordinates: 2-57) 

MGRRKVEIKRIENKSSRQVTFSKRRKGLIEKARQLSILCESSIAVVAVSGSGKLYDSASG

DNMSKIIDRYEIHHADELKALDLAEKIRNYLPHKELLEIVQSKLEESNVDNVSVDSLISM

EEQLETALSVIRAKKTELMMEDMKSLQEREKLLIEENQILASQVGKKTFLVIEGDRGMSR

ENGSGNKVPETLSLLK* 

>G1505 (1..681) 

ATGGATGATATAGCGGAACTTGAATGGTTATCAAATTTCGTAGATGATTCTTCTTTCACG 
CCGTATTCTGCTCCGACGAATAAACCGGTTTGGTTAACCGGAAATCGGAGACATCTTGTA 
CAACCGGTTAAAGAGGAGACCTGCTTCAAATCCCAACATCCGGCCGTCAAAACCAGACCC 
AAACGAGCCAGAACCGGAGTCAGAGTCTGGTCTCATGGTTCGCAGTCGTTAACCGACTCA 
TCTTCAAGCTCTACAACATCTTCGTCGTCCTCTCCTCGTCCTTCAAGCCCTCTATGGCTC 
GCCAGCGGTCAGTTTCTTGATGAGCCAATGACTAAAACAG^^ 
AAAAACGCTGGTCAGACGCAAACGCAAACGCAGACGCAGACGCGGCAGTGTGGTCATTGT
GGAGTTCAGAAAACGCCGCAGTGGAGAGCAGGACCATTAGGAGCGAAGACGTTGTGTAAT 
GCGTGTGGTGTGCGTTACAAATCGGGTCGGTTACTACCCGAATATAGACCCGCTTGTAGC 
CCAACATTTTCGAGTGAGCTTCACTCAAACCACCACAGTA7UVGTCATTGAGATGCGTAGG 
AAGAAAGAGACTTCTGACGGTGCTGAAGAAACCGGTTTGAACCAGCCGGTTCAGACGGTT 
CAGGTTGTCTCGAGTTTTTGA 

>G1505 Amino Acid Sequence (domain in AA coordinates: TBD) 

MDDIAELEWLSNFVDDSSFTPYSAPTNKPVWLTGNRRHLVQPVKEETCFKSQHPAVKTRP 

KRARTGVRVWSHGSQSLTDSSSSSTTSSSSSPRPSSPLWLASGQFLDEPMTKTQKKKKVW 

KNAGQTQTQTQTQTRQCGHCGVQKTPQWRAGPLGAKTLCNACGVRYKSGRLLPEYRPACS

PTFSSELHSNHHSKVIEMRRKKETSDGAEETGLNQPVQTVQVVSSF*

>G657 (1..2331) 

ATGAAGCGTGAGATGAAAGCACCTACTACTCCACTAGAGAGTCTCCAAGGTGACCTCAAA 

GGAAAACAAGGGAGGACATCTGGCCCTGCTAGACGATCTACCAAAGGACAATGGACACCT 

GAAGAGGACGAAGTCTTGTGTAAAGCTGTTGAGCGTTTTCAAGGAAAGAACTGGAAGAAG 

ATAGCTGAATGTTTTAAGGATCGGACTGATGTTCAGTGTCTTCATAGATGGCAAAAGGTC 

TTGAACCCAGAGCTTGTGAAAGGACCGTGGTCAAAAGAGGAGGATAACACAATAATTGAC 

CTGGTTGAAAAATATGGGCCAAAGAAATGGTCTACTATATCTCAGCATTTACCTGGGCGC 

ATAGGAAAGCAATGTAGGGAAAGGTGGCATAACCATCTTAACCCTGGGATTAATAAAAAT 

GCATGGACTCAGGAAGAGGAACTGACTCTTATTCGTGCGCATCAAATTTATGGGAATAAA 

TGGGCAGAGCTTATGAAATTTTTGCCAGGAAGGTCAGATAATTCGATAAAAAATCATTGG 

AACAGCTCAGTTAAGAAGAAGTTGGATTCCTACTATGCATCAGGTCTTTTAGATCAGTGT 

CAAAGCTCGCCATTAATTGCCCTTCAGAACAAATCTATCGCTTCATCTTCCTCGTGGATG 

CACAGCAATGGAGATGAAGGTAGTTCAAGGCCAGGGGTTGATGCTGAGGAATCAGAATGC 

AGCCAAGCTTCAACTGTTTTCTCACAATCAACCAACGATTTACAAGATGAAGTTCAACGT 

GGAAATGAGGAATATTACATGCCTGAATTTCATTCAGGAACGGAGCAGCAAATCTCAAAC 

GCTGCATCTCATGCAGAACCGTACTACCCTTCCTTTAAAGATGTCAAAATTGTTGTCCCC 

GAAATTTCTTGTGAAACAGAATGTTCGAAGAAGTTTCAGAATCTTAATTGTTCTCACGAG 

CTAAGAACTACCACAGCTACGGAGGATCAATTGCCGGGTGTATCTAATGATGCTAAACAG 

GACCGTGGTCTAGAGTTATTGACCCATAACATGGACAACGGTGGAAAAAACCAAGCACTT 

CAACAAGATTTTCAAAGTTCAGTAAGATTAAGTGATCAACCTTT1TTGTCAAACTCG 

ACAGATCCAGAAGCTCAAACTTTGATCACGGATGAGGAGTGTTGTAGGGTTCTTTTTCCA 

GATAACATGAAAGATAGCAGTACATCTTCTGGTGAGCAAGGTCGGAATATGGTTGACCCT 

CAAAACGGCAAAGGATCTCTTTGTTCTCAGGCTGCAGAAACCCATGCTCATGAAACTGGA 

AAAGTTCCAGCTTTACCGTGGCATCCTTCAAGTTCTGAGGGCCTGGCGGGTCATAATTGT 

GTCCCTTTGTTGGATTCAGACTTGAAGGACTCACTTTTACCCCGTAATGATTCCAACGCT 

CCTATACAAGGTTGTCGCCTTTTTGGAGCTACCGAATTAGAATGTAAGACTGATACAAAT 

GACGGTTTCATCGATACTTACGGACATGTAACTTCCCATGGCAATGATGATAATGGTGGT 

TTCCCAGAACAACAGGGGCTGTCATATATTCCCAAGGATTCTTTGAAGCTAGTACCTTTG 

AATAGTTTTTCTTCTCCTTCTAGAGTGAACAAGATTTATTTTCCTATTGACGATAAGCCG 

GCTGAAAAAGACAAAGGAGCTCTTTGTTATGAACCTCC^CGTTTTCCAA^TGCAGATATT 

CCTTTCTTCAGCTGTGATCTTGTACCATCAAATAGTGACTTACGGCAAGAGTACAGTCCC 

TTTGGTATCCGTCAGTTGATGATTTCTTCAATGAATTGTACAACTCCGTTAAGGTTATGG 

GATTCACCGTGTCACGATAGGAGCCCTGATGTCATGCTTAATGATACTGCCAAAAGTTTT 

AGTGGTGCACCATCCATCTTAAAGAAGCGGCATCGAGACTTGCTTTCACCTGTGCTTGAT 

AGAAGAAAAGACAAAAAGCTTAAAAGGGCTGCGACTTCCTCCTTGGCTAATGATTTTTCG 

CGCTTAGATGTAATGCTTGATGAAGGAGATGATTGCATGACCTCTCGTCCGTCAGAGTCT 

CCTCAAGATAAAAATATATGTGCCTCCCCTTCCATAGCCAGAGATAACAGAAATTGTGCA 

TCAGCTCGK3TTATATCAAGAAATGATTCCGATAGATGAGGAACCAAAGGAAACCTTAGAA 

TCAGGTGGAGTGACTTCTATGCAAAATGAAAATGGATGTAATGACGGTGGTGCTTCAGCT 

AAAAATGTAAGTCCGTCTTTGTCCTTGCATATTATCTGGTATCAGTTATAA 

>G657 Amino Acid Sequence (domain in AA coordinates: TBD) 

MKREMKAPTTPLESLQGDLKGKQGRTSGPARRSTKGQWTPEEDEVLCKAVERFQGKNWKK 

IAECFKDRTDVQCLHRWQKVLNPELVKGPWSKEEDNTIIDLVEKYGPKKWSTISQHLPGR

IGKQCRERWHNHLNPGINKNAWTQEEELTLIRAHQIYGNKWAELMKFLPGRSDNSIKNHW

NSSVKKKLDSYYASGLLDQCQSSPLIALQNKSIASSSSWMHSNGDEGSSRPGVDAEESEC 
SQASTVFSQSTNDLQDEVQRGNEEYYMPEFHSGTEQQISNAASHAEPYYPSFKDVKIVVP 
EISCETECSKKFQNLNCSHELRTTTATEDQLPGVSNDAKQDRGLELLTHNMDNGGKNQAL

QQDFQSSVRLSDQPFLSNSDTDPEAQTLITDEECCRVLFPDNMKDSSTSSGEQGRNMVDP 

QNGKGSLCSQAAETHAHETGKVPALPWHPSSSEGLAGHNCVPLLDSDLKDSLLPRNDSNA 

PIQGCRLFGATELECKTDTNDGFIDTYGHVTSHGNDDNGGFPEQQGLSYIPKDSLKLVPL 

NSFSSPSRVNKIYFPIDDKPAEKDKGALCYEPPRFPSADIPFFSCDLVPSNSDLRQEYSP 

FGIRQLMISSMNCTTPLRLWDSPCHDRSPDVMLNDTAKSFSGAPSILKKRHRDLLSPVLD 

RRKDKKLKRAATSSLANDFSRLDVMLDEGDDCMTSRPSESPEDKNICASPSIARDNRNCA 

SARLYQEMIPIDEEPKETLESGGVTSMQNENGCNDGGASAKNVSPSLSLHIIWYQL* 

>G1959 (141..1028)

CGTCGACTGTCCATAAATCCGGAGCCTGACCCGACGTTTGACCCGGATCCGAAACTCCCA 
CAATCTCCATACCACCCAAATTCATCTCCCCTAAAGCTTTCTCTCACTTTCCCGGGAAAA 
TCGGCGACCAAAATTGGAAAATGTACTCAGCGATTCGCTCGCTTCCACTCGATGGTGGAC 
ACGTTGGTGGTGACTACCATGGACCTCTTGACGGAACCAATCTTCCCGGTGACGCTTGTT 
TGGTTTTAACGACTGACCCTAAACCTCGTCTCCGGTGGACAACTGAGCTTCATGAGAGAT 
TCGTTGACGCCGTTACTCAGCTCGGTGGTCCTGACAAAGCGACTCCCAAAACTATTATGA 
GAACAATGGGAGTGAAGGGTCTCACTCTCTACCACCTCAAATCACATCTTCAGAAATTCC 
GCCTAGGGAGGCAAGCTGGCAAAGAATCAACTGAGAACTCTAAAGATGCTTCTTGTGTAG 
GGGAGAGTCAGGACACAGGTTCATCTTCGACATCATCAATGAGAATGGCGCAGCAGGAGC 
AGAACGAGGGTTACCAAGTCACCGAAGCTCTACGTGCTCAGATGGAAGTCCAAAGAAGAC 
TACACGATCAATTGGAGGTGCAACGGAGGCTCCAGCTGAGGATAGAGGCACAAGGAAAAT 
ACCTGCAATCGATTCTTGAAAAAGCTTGCAAGGCCTTTGACGAGCAAGCTGCTACTTTTG 
CTGGACTTGAGGCTGCTAGGGAAGAGCTATCAGAGCTAGCCATCAAAGTCTCCAATAGCT 
CTCAAGGAACATCAGTCCCGTACTTCGATGCAACAAAGATGATGATGATGCCATCGTTGT 
CAGAGCTTGCAGTAGCAATAGACAACAAAAACAACATCACAACCAACTGTTCAGTAGAAA
GCTCTCTGACTTCCATCACACATGGGAGCTCTATATCTGCTGCATCAATGAAGAAGCGTC 
AACGTGGAGACAATTTGGGCGTAGGGTATGAATCAGGCTGGATTATGCCTAGTAGCACCA 
TTGGATAAAGTTTAGGAGAGGGAAAAAGTTCATTATGGGAAAGGTAGAGATAAGATTTAA 
CTGTTCTTTACTTGCTTTGAGGGGCCTGCGGCCGCT 

>G1959 Amino Acid Sequence (conserved domain in AA coordinates : 46-^7) 
MYSAIRSLPLDGGHVGGDYHGPLDGTNLPGDACLVLTTDPKPRLRWTTELHERFVDAVTQ
LGGPDKATPKTIMRTMGVKGLTLYHLKSHLQKFRLGRQAGKESTENSKDASCVGESQDTG
SSSTSSMRMAQQEQNEGYQVTEALRAQMEVQRRLHDQLEVQRRLQLRIEAQGKYLQSILE
KACKAFDEQAATFAGLEAAREELSELAIKVSNSSQGTSVPYFDATKMMMMPSLSELAVAI
DNKNNITTNCSVESSLTSITHGSSISAASMKKRQRGDNLGVGYESGWIMPSSTIG* 

>G2180 (1..1440)

ATGGCTCCTGTCTCGTTACCTCCAGGTTTCCGATTCCATCCAACAGACGAGGAACTAATT 

ACTTACTATCTAAAAAGAAAGATCAACGGTCTAGAAATCGAACTTGAAGTTATCGCTGAA 

GTTGATCTTTACAAGTGTGAGCCATGGGACTTACCAGGGAAGTCCTTGCTTCCGAGCAAA 

GACCAAGAATGGTACTTCTTCAGCCCACGAGACCGGAAGTATCCCAACGGCTCAAGGACA 

AACCGGGCAACTAAAGGCGGTTATTGGAAGGCTACAGGTAAAGACCGCCGAGTTAGTTGG 

AGAGACCGAGCCATAGGAACCAAGAAGACATTGGTTTACTACCGTGGGCGCGCGCCACAT 

GGCATAAGAACTGGTTGGGTCATGCACGAATATCGACTTGATGAAACAGAATGTGAGCCT 

TCTGCATACGGCATGCAGGACGCATATGCACTTTGTCGTGTGTTCAAAAAGATTGTTATT 

GAAGCTAAGCCAAGAGATCAACATCGGTCATATGTCCACGCGATGTCGAATGTGAGTGGT 

AATTGCTCATCGAGTTTTGACACTTGTTCGGATCTCGAAATCAGTTCAACTACTCATCAA 

GTTCAAAACACATTCCAACCGCGATTTGGCAACGAGCGATTTAACTCCAACGCAATCAGC 

AACGAGGATTGGTCACAATACTACGGTTCTTCTTATAGACCGTTCCCTACTCCATATAAG 

GTTAACACAGAGAT€GAATGTTCAATGTTACAACACAATATATATCTACCACCGTTGCGT 

GTAGAGAACTCTGCGTTTAGTGATTCCGATTTCTTCACGAGTATGACTCACAACAACGAC 

CATGGCGTTTTCGATGACTTTACTTTTGCTGCAAGTAACTCCAACCACAATAATAGCGTT 

GGTGATCAAGTGATCCACGTTGGCAATTATGATGAACAATTAATAACATCTAACCGTCAT 

ATGAACCAGACTGGTTATATAAAAGAGCAGAAGATCAGATCGAGTTTGGATAATACTGAC 

GAAGATCCAGGATTTCATGGTAACAATACCAATGACAACATAGATATCGATGATTTTCTC 

TCGTTTGATATATATAACGAGGACAACGTGAATCAAATAGAAGATAATGAAGACGTGAAT 

ACAAATGAAACCCTTGATTCATCGGGATTCGAGGTGGTTGAAGAAGAAACTAGATTTAAC 

AACCAAATGCTCATCTCGACATATCAAACGACAAAGATTCTATATCACCAAGTCGTACCT 

TGTCACACGTTGAAAGTTCACGTCAATCCTATTAGTCACAATGTGGAAGAGAGAACATTG 
TTCATTGAAGAGGACAAAGATTCTTGGTTACAAAGAGCTGAGAAGATCACGAAGACAAAA

CTAACACTTTTTAGTTTAATGGCTCAGCAATACTACAAATGTCTTGCTATTTTTTTCTGA 

>G2180 Amino Acid Sequence (conserved domain in AA coordinates: 7-156)

MAPVSLPPGFRFHPTDEELITYYLKRKINGLEIELEVIAEVDLYKCEPWDLPGKSLLPSK 

DQEWYFFSPRDRKYPNGSRTNRATKGGYWKATGKDRRVSWRDRAIGTKKTLVYYRGRAPH 

GIRTGWVMHEYRLDETECEPSAYGMQDAYALCRVFKKIVIEAKPRDQHRSYVHAMSNVSG 

NCSSSFDTCSDLEISSTTHQVQNTFQPRFGNERFNSNAISNEDWSQYYGSSYRPFPTPYK 

VNTEIECSMLQHNIYLPPLRVENSAFSDSDFFTSMTHNNDHGVFDDFTFAASNSNHNNSV

GDQVIHVGNYDEQLITSNRHMNQTGYIKEQKIRSSLDNTDEDPGFHGNNTNDNIDIDDFL

SFDIYNEDNVNQIEDNEDVNTNETLDSSGFEVVEEETRFNNQMLISTYQTTKILYHQVVP

CHTLKVHVNPISHNVEERTLFIEEDKDSWLQRAEKITKTKLTLFSLMAQQYYKCLAIFF*

>G1817 (1..1308) 

ATGAAGGACGCAGAGAAGCGAGAGGTGATTGCATCATCATCATTACAAAGAAAGAGAAAC 
AGAGGAAGAAGACTAAGGAAAAGAAGAAGAAGAAACGAGAAGCGAGTACTAATGGTTCCA 
TCATCATTACCAAACGACGTGCTAGAGGAGATCTTTTTAAGATTTCCGGTTAAAGCCCTA 
ATCCGACTCAAGTCTCTCTCGAAACAATGGAGATCGACGATCGAATCTCGCAGTTTTGAA 
GAGAGACACTTGACGATCGCTAAGAAAGCCTTCGTGGATCATCCCAAGGTCATGCTCGTA 
GGAGAAGAAGATCCCATAAGAGGAACCGGGATTCGTCCAGACACTGACATTGGTTTTAGG 
TTATTCTGCTTGGAATCGGCTTCTCTTCTATCCTTTACTCGTCTCAATTTCCCTCAAGGG
TTCTTCAACTGGATCTACATATCTGAAAGCTGTGATGGCCTTTTCTGCATCCATTCCCCA 
AAATCACATTCCGTATATGTAGTGAATCCGGCTACACGGTGGCTCCGCCTACTTCCTCCG 
GCAGGGTTTCAGATTTTGATCCACAAGTTTAACCCCACTGAACGTGAGTGGAATGTAGTG 
ATGAAATCAATCTTTCATCTAGCATTCGTGAAGGCCACCGATTACAAATTAGTGTGGTTG 
TACAATTGTGATAAGTACATTGTTGATGCGTCGAGTCCAAACGTGGGAGTCACAAAGTGC 
GAGATTTTTGACTTTAGGAAAAATGCTTGGAGGTACTTGGCTTGCACTCCAAGTCATCAG 
ATATTCTATTACCAAAAGCCAGCATCTGCAAACGGGTCGGTTTATTGGTTTACAGAACCA 
TATAATGAAAGAATCGAAGTAGTGGCTTTTGATATTCAGACCGAAACATTCCGGTTGCTG 
CCTAAGATTAATCCGGCTATTGCTGGTTCAGATCCTCACCATATTGACATGTGCACTCTG 
GATAATAGTTTGTGTATGTCGAAAAGGGAGAAAGATACTATGATCCAAGATATTTGGAGG 
TTGAAACCATCAGAAGACACATGGGAAAAGATTTTTAGCATAGACTTGGTTTCCTGTCCT 
TCTTCTCGGACTGAGAAGCGTGATCAATTTGATTGGAGCAAGAAGGATAGGGTTGAGCCA 
GCCACACCCGTCGCGGTTTGTAAGAATAAGAAGATCCTTCTCTCACATCGCTATTCCCGA 
GGTTTGGTAAAGTACGATCCCCTAACAAAATCTATCGATTTTTTTTCCGGACATCCTACC 
GCTTACAGAAAAGTTATTTATTTTCAAAGTTTGATATCTCATCTATAA 

>G1817 Amino Acid Sequence (conserved domain in AA coordinates: 47-331)
MKDAEKREVIASSSLQRKRNRGRRLRKRRRRNEKRVLMVPSSLPNDVLEEIFLRFPVKAL

IRLKSLSKQWRSTIESRSFEERHLTIAKKAFVDHPKVMLVGEEDPIRGTGIRPDTDIGFR

LFCLESASLLSFTRLNFPQGFFNWIYISESCDGLFCIHSPKSHSVYVVNPATRWLRLLPP

AGFQILIHKFNPTEREWNVVMKSIFHLAFVKATDYKLVWLYNCDKYIVDASSPNVGVTKC

EIFDFRKNAWRYLACTPSHQIFYYQKPASANGSVYWFTEPYNERIEVVAFDIQTETFRLL

PKINPAIAGSDPHHIDMCTLDNSLCMSKREKDTMIQDIWRLKPSEDTWEKIFSIDLVSCP

SSRTEKRDQFDWSKKDRVEPATPVAVCKNKKILLSHRYSRGLVKYDPLTKSIDFFSGHPT
AYRKVIYFQSLISHL* 

>G1649 (61..1311)

ATTCACAAAAACCGGAAAAAAAAAAAGACAAGTAAAGAAAGCTTTGTTCAGTTTACTTCA 
ATGGAAGCAAAACCCTTAGCATCATCAT(^TCTGAACCAAACATGATTTCTCCATCATCA 
AACATTAAACCAAAATTAAAAGATGAAGATTATATGGAGCTGGTGTGTGAAAATGGGCAG 
ATTCTTGCAAAGAT^CGAAGACCAAAGAACAACGGTTCTTTTCAAAAGCAACGTAGGCAA 
TCTCTCCTGGATTTGTATGAGACCGAGTACAGCGAGGGTTTCAAGAAAAACATCAAGATT 
CTTGGAGACACACAAGTTGTTCCGGTGAGTCAGTCTAAGCCACAACAAGATAAAGAAACC 
AATGAACAAATGAACAAGAATAAGAAGAAGCTAAAGTCCTCCAAAATCGAATTTGAGAGA 
AATGTTTCGAAAAGCAAC^AATGTGTTGAATCATCAACATTAATTGATGTTTCTGCTAA^ 
GGTCCAAAGAATGTTGAAGTTACTACAGCTCCTCCTGATGAGCAATCTGCAGCTGTTGGT 
AGATCCACGGAATTGTATTTTGCTTCTTCATCGAAGTTTTCTCGAGGAACTTCGAGAGAT 
CTAAGTTGTTGTTCTTTAAAGAGGAAGTATGGAGATATTGAAGAAGAAGAATCAACCTAT 
TT7UVGTAATAATTCAGATGATGAATCAGATGATGCGAAGACACAAGTTCATGCGAGAACA 
AGAAAGCCGGTGACTAAAAGAAAACGAAGCACAGAAGTCCATAAGTTATATGAAAGAAAA 
CGAAGAGATGAATTCAACAAGAAAATGCGTGCTTTGCAGGACCTACTACCAAATTGTTAC
AAGGATGATAAGGCTTCATTGTTGGATGAGGCTATCAAATATATGCGGACCCTTCAACTT 
CAAGTTCAGATGATGAGTATGGGAAATGGATTAATAAGACCACCTACGATGTTGCCAATG 
GGTCATTACTCTCCCATGGGTCTAGGAATGCATATGGGTGCAGCAGCAACACCAACATCA 
ATACCGCAATTCCTGCCTATGAATGTTCAAGCAACCGGTTTTCCGGGGATGAACAATGCA 
CCACCACAAATGCTAAGCTTTCTTAATCACCCAAGTGGACTAATTCCAAACACTCCTATC 
TTTTCTCCATTGGAAAATTGCTCTCAGCCATTCGTGGTGCCTTCGTGTGTTTCTCAGACT 
CAGGCTACTTCTTTTACTCAATTCCCAAAGTCTGCGTCCGCCTCAAACTTAGAAGATGCA 
ATGCAATATAGAGGAAGCAACGGTTTTAGTTATTATCGCTCGCCAAACTAATGATTTGTA 
GAAAGTTGATGTTTTCTCCAACTAACTAACTTTAAGCAAAAAAAAATGATCGTCTACTCT 
GTGTTGTTAGTCTATGGGCTTTTGGGCCTTGATTCTTGGAACGATTTGAACTTAATTCCA 
ACTATTTTCAAAGTGGATGTACAAAGTAAAA 

>G1649 Amino Acid Sequence (conserved domain in AA coordinates: 225-295)

MEAKPLASSSSEPNMISPSSNIKPKLKDEDYMELVCENGQILAKIRRPKNNGSFQKQRRQ

SLLDLYETEYSEGFKKNIKILGDTQVVPVSQSKPQQDKETNEQMNKNKKKLKSSKIEFER

NVSKSNKCVESSTLIDVSAKGPKNVEVTTAPPDEQSAAVGRSTELYFASSSKFSRGTSRD

LSCCSLKRKYGDIEEEESTYLSNNSDDESDDAKTQVHARTRKPVTKRKRSTEVHKLYERK

RRDEFNKKMRALQDLLPNCYKDDKASLLDEAIKYMRTLQLQVQMMSMGNGLIRPPTMLPM 

GHYSPMGLGMHMGAAATPTSIPQFLPMNVQATGFPGMNNAPPQMLSFLNHPSGLIPNTPI 

FSPLENCSQPFVVPSCVSQTQATSFTQFPKSASASNLEDAMQYRGSNGFSYYRSPN*

>G2131 (69..1010)

GTCTCTCATTTTCATAATTCCATTTTCAGGATTGTCTCTCAATCTTTTATTCTTCTCATT 
CACCGGTAATGGCAAAAGTCTCTGGGAGGAGCAAGAAAACAATCGTTGACGATGAAATCA 
GCGATAAAACAGCGTCTGCGTCTGAGTCTGCGTCCATTGCCTTAACATCCAAACGCAAAC 
GTAAGTCGCCGCCTCGAAACGCTCCTCTTCAACGCAGCTCCCCTTACAGAGGCGTCACAA 
GGCATAGATGGACTGGGAGATACGAAGCGCATTTGTGGGATAAGAACAGCTGGAACGATA 
CACAGACCAAGAAAGGACGTCAAGTTTATCTAGGGGCTTACGACGAAGAAGAAGCAGCAG 
CACGTGCCTACGACTTAGCAGCATTGAAGTACTGGGGACGAGACACACTCTTGAACTTCC 
CTTTGCCGAGTTATGACGAAGACGTCAAAGAAATGGAAGGCCAATCCAAGGAAGAGTATA 
TTGGATCATTGAGAAGAAAAAGTAGTGGATTTTCTCGCGGTGTATCAAAATACAGAGGCG 
TTGCAAGGCATCACCATAATGGGAGATGGGAAGCTAGAATTGGAAGGGTGTTTGGTAATA 
AATATCTATATCTTGGAACATACGCCACGCAAGAAGAAGCAGCAATCGCCTACGACATCG 
CGGCAATAGAGTACCGTGGACTTAACGCCGTTACCAATTTCGACGTCAGCCGTTATCTAA 
ACCCTAACGCCGCCGCGGATAAAGCCGATTCCGATTCTAAGCCCATTCGAAGCCCTAGTC 
GCGAGCCCGAATCGTCGGATGATAACAAATCTCCGAAATCAGAGGAAGTAATCGAACCAT 
CTACATCGCCGGAAGTGATTCCAACTCGCCGGAGCTTCCCCGACGATATCCAGACGTATT 
TTGGGTGTCAAGATTCCGGCAAGTTAGCGACTGAGGAAGACGTAATATTCGATTGTTTCA 
ATTCTTATATAAATCCTGGCTTCTATAACGAGTTTGATTATGGACCTTAATCGTATTTTC 
TACAAGTTTTGTTTTGATTATCTACACAATACATCAATATATTCT 

>G2131 Amino Acid Sequence (conserved domain in AA coordinates: 50-186, 112-183)

MAKVSGRSKKTIVDDEISDKTASASESASIALTSKRKRKSPPRNAPLQRSSPYRGVTRHR 

WTGRYEAHLWDKNSWNDTQTKKGRQVYLGAYDEEEAAARAYDLAALKYWGRDTLLNFPLP

SYDEDVKEMEGQSKEEYIGSLRRKSSGFSRGVSKYRGVARHHHNGRWEARIGRVFGNKYL
YLGTYATQEEAAIAYDIAAIEYRGLNAVTNFDVSRYLNPNAAADKADSDSKPIRSPSREP
ESSDDNKSPKSEEVIEPSTSPEVIPTRRSFPDDIQTYFGCQDSGKLATEEDVIFDCFNSY 
INPGFYNEFDYGP* 
>G215 (1..1110) 

ATGACTCGTCGGTGTTCGCATTGTAGCAACAATGGGCACAATTCACGCACGTGTCCAACG 
CGTGGGTCTGGTTCCTCCTCCGCCGTGAAGTTATTTGGTGTGAGGTTAACGGATGGCTCG 
ATTATTAAAAAGAGTGCGAGTATGGGTAATCTCTCGGCATTGGCTGTTGCGGCGGCGGCG 
GCAACGCACCACCGTTTATCTCCGTCGTCTCCTCTGGCGACGTCAAATCTTAATGATTCG 
CCGTTATCGGATCATGCCCGATACTCTAATTTGCATCATAATGAAGGGTATTTATCTGAT 
GATCCTGCTCATGGTTCTGGGTCTAGTCACCGTCGTGGTGAGAGGAAGAGAGGTGTTCCT 
TGGACTGAAGAGGAACATAGACTATTCTTAGTCGGTCTTCAGAAACTCGGGAAAGGAGAT 
TGGCGCGGTATTTCGAGAAACTATGTAACGTCAAGAACTCCTACACAAGTGGCTAGTCAT 
GCTCAAAAGTATTTTATTCGACATACTAGTTCAAGCCGCAGGAAAAGACGGTCTAGCCTC 
TTCGACATGGTTACAGATGAGATGGTAACCGATTCATCGCCAACACAGGAAGAGCAGACC 
TTAAACGGTTCCTCTCCAAGCAAGGAACCTGAAAAGAAAAGCTACCTTCCTTCACTTGAG
CTCTCACTCAATAATACCACAGAAGCTGAAGAGGTCGTAGCCACGGCGCCACGACAGGAA 
AAATCTCAAGAAGCTATAGAACCATCAAATGGTGTTTCACCAATGCTAGTCCCGGGTGGC 
TTCTTTCCTCCTTGTTTTCCAGTGACTTACACGATTTGGCTCCCTGCGTCACTTCACGGA 
ACAGAACATGCCTTAAACGCTGAGACTTCTTCTCAGCAGCATCAGGTCCTAAAACCAAAA 
CCTGGATTTGCTAAAGAACGTGTGAACATGGACGAGTTGGTCGGTATGTCTCAGCTTAGC 
ATAGGAATGGCGACAAGACACGAAACCGAAACTTCCCCTTCCCCGCTATCTTTGAGACTA 
GAGCCCTCAAGGCCATCAGCGTTTCACTCGAATGGCTCGGTTAATGGTGCAGATTTGAGT 
AAAGGCAACAGCGCGATTCAGGCTATCTAA 

>G215 Amino Acid Sequence (domain in AA coordinates: TBD) 

MTRRCSHCSNNGHNSRTCPTRGSGSSSAVKLFGVRLTDGSIIKKSASMGNLSALAVAAAA 

ATHHRLSPSSPLATSNLNDSPLSDHARYSNLHHNEGYLSDDPAHGSGSSHRRGERKRGVP 

WTEEEHRLFLVGLQKLGKGDWRGISRNYVTSRTPTQVASHAQKYFIRHTSSSRRKRRSSL

FDMVTDEMVTDSSPTQEEQTLNGSSPSKEPEKKSYLPSLELSLNNTTEAEEVVATAPRQE 

KSQEAIEPSNGVSPMLVPGGFFPPCFPVTYTIWLPASLHGTEHALNAETSSQQHQVLKPK 

PGFAKERVNMDELVGMSQLSIGMATRHETETSPSPLSLRLEPSRPSAFHSNGSVNGADLS

KGNSAIQAI* 

>G1508 (1..420)

ATGCTAGATCACAGTGAAAAGGTCTTATTGGTTGATTCAGAAACCATGAAAACAAGAGCT 

GAAGATATGATCGAACAGAACAACACTAGTGTTAACGACAAGAAGAAGACTTGTGCTGAT 

TGTGGAACCAGTAAAACTCCTCTTTGGCGTGGTGGTCCTGTTGGTCCAAAGTCGTTGTGT 

AACGCGTGTGGGATCAGAAACAGAAAGAAGAGAAGAGGAGGAACAGAAGATAATAAGAAA 

TTAAAGAAATCGAGTTCTGGCGGCGGAAACCGTAAATTTGGTGAATCGTTAAAACAGAGT 

TTGATGGATTTGGGGATAAGGAAGAGATCAACGGTGGAGAAGCAACGACAGAAGCTTGGT 

GAAGAAGAACAAGCCGCTGTGTTACTCATGGCTCTTTCTTATGGCTCTGTTTACGCTTAG

>G1508 Amino Acid Sequence (domain in AA coordinates: 38-63) 

MLDHSEKVLLVDSETMKTRAEDMIEQNNTSVNDKKKTCADCGTSKTPLWRGGPVGPKSLC

NACGIRNRKKRRGGTEDNKKLKKSSSGGGNRKFGESLKQSLMDLGIRKRSTVEKQRQKLG 

EEEQAAVLLMALSYGSVYA* 

>G2110 (36..1622)

GAGAGCTAATAAAAAATTTATCAAAGAAGACTAATATGGAGAAGGACGATTTCTTGAGGA 
GTGGTCATGGAAGAGAAGAAAGCCATGATGAGATGAGAAAACTTGATTCATCTCACGATG 
ATTCTCATCAAGAACACGACCATATTATAAGATCCAAGTTGGACTCAACTAAAGTCGAAA 
TGGATGAGGCTAAAGAGGAAAATCGAAGACTAAAGTCATCATTGAGTAAAATCAAGAAAG 
ATTTTGACATCCTTCAAACACAATACAACCAATTAATGGCCAAACATAACGAACCAACCA 
AGTTCCAATCAAAAGGGCATCATCAAGACAAAGGCGAAGATGAAGACAGAGAAAAAGTTA 
ACGAACGTGAAGAACTTGTCTCGTTGAGCCTAGGCAGACGGTTAAATTCAGAGGTTCCAA 
GTGGTTCGAATAAAGAAGAAAAAAATAAAGATGTTGAAGAAGCGGAAGGTGACAGAAATT 
ATGATGATAATGAAAAAAGCAGTATTCAAGGGTTGAGTATGGGGATTGAATACAAGGCTT 
TGAGTAATCCTAATGAGAAGTTAGAGATTGATCATAATCAAGAAACCATGTCGTTGGAGA 
TTAGTAACAATAATAAGATCAGATCACAAAATAGTTTTGGGTTTAAGAATGATGGAGATG 
ATCATGAAGATGAAGATGAGATTTTGCCTCAAAACCTTGTTAAGAAAACTAGGGTTTCGG 
TGAGATCAAGATGTGAGACACCAACGATGAACGACGGATGTCAATGGAGGAAATATGGCC 
AGAAAATAGCTAAAGGCAATCCATGTCCCCGAGCTTACTATCGTTGCACCATTGCAGCTT 
CTTGTCCAGTAAGAAAACAGGTGCAAAGATGTTC^GAAGATATGTCTATACTTATCTC^ 
CGTACGAAGGAACACATAACCATCCACTTCCCATGTCAGCAACTGCCATGGCCTCTGCCA 
CTTCCGCTGCCGCCTCCATGCTTCTCTCCGGCGCCTCCTCCTCCTCATCCGCCGCAGCTG 

ATCTTC^TGGCCTTAACTTCTCTCTTTCCGGCAACAACATCACTCCAAAACCTAAAACTC 
ATTTCCTCCAATCCCCI^^ 

CCTCCTCGTCGCAGCAACCGTTCTTATCAATGCTCAATAGATTCAGCTCTCCTCCAAGTA 

ATGTCTCACGATCTAATAGTTATCCTTCAACCAATCTCAACTTTTCAAACAACACCAA^ 

CATTGATGAATTGGGGTGGTGGTGGTAATCCCAGTGATCAATACCGTGCAGCTTACGGCA 

ACATTAACACCCATCAGCAATCACCTTACCACAAAATCATTCAAACCCGAACCG^ 

CATCTTTCGATCCGTTTGGAAGATCATCTTCATCACATTCTCCACAAATAAATCTTGATC 

ATATCGGAATCAAGAACATCATCAGTC^CCAAGTGCCATCTTTACCGGCTGAAACAATCA 

AGGCAATCACGACAGATCCAAGTTTCCAATCGGCTTTGGCGACAGCT 

TGGGCGGCGATTTAAAGATTGATCACAATGTGACTAGAAATGAAGCTGAGAAGAGCCCTT
AAAGAGAATTGTTATATATATGTTCTTATATACTCAGTACATTGGTAAATGGGTTTAGAC
TTTCACTAGTTTCCTAGTTCATCTATATATTGGTTGTTTAATCACAAGTTTATTTTGTTG 
TTGGAGTTTATGGAACTAATGTGTACATATGAAACTTTAGAACGAATAAATAAAACTTGG 
AATTCCTTTTTAAAAAAAAAAAAAAAAA 

>G2110 Amino Acid Sequence (conserved domain in AA coordinates: 239-298)

MEKDDFLRSGHGREESHDEMRKLDSSHDDSHQEHDHIIRSKLDSTKVEMDEAKEENRRLK

SSLSKIKKDFDILQTQYNQLMAKHNEPTKFQSKGHHQDKGEDEDREKVNEREELVSLSLG 

RRLNSEVPSGSNKEEKNKDVEEAEGDRNYDDNEKSSIQGLSMGIEYKALSNPNEKLEIDH 

NQETMSLEISNNNKIRSQNSFGFKNDGDDHEDEDEILPQNLVKKTRVSVRSRCETPTMND

GCQWRKYGQKIAKGNPCPRAYYRCTIAASCPVRKQVQRCSEDMSILISTYEGTHNHPLPM 

SATAMASATSAAASMLLSGASSSSSAAADLHGLNFSLSGNNITPKPKTHFLQSPSSSGHP 

TVTLDLTTSSSSQQPFLSMLNRFSSPPSNVSRSNSYPSTNLNFSNNTNTLMNWGGGGNPS

DQYRAAYGNINTHQQSPYHKIIQTRTAGSSFDPFGRSSSSHSPQINLDHIGIKNIISHQV

PSLPAETIKAITTDPSFQSALATALSSIMGGDLKIDHNVTRNEAEKSP*

>G2442 (71..997)

TCGACCAATTTAGACCATTCCAAATTCGTCGTCCTTTTCTCTGTGTAGTCTAATTATATA 

TTACAAGTAGATGAATTGGTTACCTGAAGCTGAAGCTGAGGAGCACTTGAAAGGTATTCT

CTCTGGTGATTTCTTTGATGGTCTCACCAATCACCTTGATTGCCCACTTGAAGACATCGA 

TTCCACCAATGGTGAGGGAGATTGGGTCGCCAGGTTTCAAGACCTTGAGCCTCCTCCCTT 

GGATATGTTCCCTGCTTTGCCTTCTGACCTCACCTCTTGTCCCAAGGGCGCCGCTCGTGT 

GCGGATTCCCAACAACATGATTCCTGCTTTGAAGCAGTCCTGTTCTTCTGAAGCCTTGTC

CGGCATTAATAGCACTCCCCACCAATCTTCAGCTCCTCCTGATATCAAAGTTTCATATCT 

ATTTCAGTCTCTAACTCCAGTGTCAGTTCTCGAGAACAGTTATGGTTCTCTCTCCACCCA 

AAACTCCGGATCTCAGAGATTGGCTTTCCCTGTGAAAGGCATGAGAAGCAAGCGCAGACG 

CCCCACAACAGTGAGACTTAGCTACCTTTTCCCCTTTGAACCCAGAAAGTCAACTCCGGG 

TGAATCAGTAACCGAGGGTTACTATTCTTCTGAGCAACATGCCAAGAAGAAGCGCAAGAT 

TCATCTGATCACCCACACCGAGTCTTCCACTTTGGAGTCAAGTAAGTCGGATGGGATAGT 

CCGGATATGCACTCATTGTGAGACAATCACGACCCCACAGTGGAGGCAAGGACCCAGTGG 

ACCCAAGACCCTCTGCAACGCTTGCGGAGTCCGGTTCAAATCTGGTCGCCTAGTTCCAGA 

ATACCGGCCAGCCTCAAGCCCGACCTTCATCCCATCTGTGCATTCAAACTCACACAGGAA 

GATCATTGAGATGAGAAAGAAGGACGACGAGTTTGATACCAGCATGATTCGCAGTGATAT 

CCAGAAGGTAAAGCAGGGGAGGAAGAAAATGGTATAAAAGTA 

>G2442 Amino Acid Sequence (domain in aa coordinates: 220-246) 

MNWLPEAEAEEHLKGILSGDFFDGLTNHLDCPLEDIDSTNGEGDWVARFQDLEPPPLDMF

PALPSDLTSCPKGAARVRIPNNMIPALKQSCSSEALSGINSTPHQSSAPPDIKVSYLFQS

LTPVSVLENSYGSLSTQNSGSQRLAFPVKGMRSKRRRPTTVRLSYLFPFEPRKSTPGESV

TEGYYSSEQHAKKKRKIHLITHTESSTLESSKSDGIVRICTHCETITTPQWRQGPSGPKT 

LCNACGVRFKSGRLVPEYRPASSPTFIPSVHSNSHRKIIEMRKKDDEFDTSMIRSDIQKV 

KQGRKKMV* 

>G1051 (66..1031)

CCTGTAAATTCAGATTTGCTTTCTTTGGTAATCTTTTGGATCAAGATCCATCTATTTTTT 

CTTCAATGGCACAACTCCCTCCTAAAATCCCCAACATGACACAACATTGGCCTGATTTCT 

CTTCCCAAAAGCTCTCTCCTTTCTCTACCCCAACCGCAACCGCTGTCGCCACCGCTACAA 

CCACCGTACAAAACCCCTCATGGGTCGACGAATTCCTCGACTTCTGAGCGTCTCGCCGTG 

GCAACCACCGTCGTTCCATCAGCGACTCTATCGCATTCCTCGAAGCTCCAACAGTCAGCA 

TCGAAGACCACCAATTCGACAGGTTCGATGACGAACAGTTCATGTCGATGTTCACCGACG 

ACGACAACCTTCATAGCAATCCTTCCCATATCAACAACAAAAATAACAATC 

CGGGATCTTCCTCGAACACATCCACGCCGTCCAATAGCTTCAACGACGATAACAAAGAAT 

TACCACCGTCCGATCATAACATGAACAATAATATCAACAACAACTATAACGATGAAGTCC 

AAAGCCAATGCAAGATGGAGCCAGAAGATGGTACGGCGTCGAATAACAATTCCGGTGATA 

GCTCCGGCAACCGGATTCTCGATCCCAAAAGGGTTAAGAGAATATTAGCAAATCGGCAAT 

CAGCACAGAGATCAAGGGTGAGGAAACTGCAATACATATCAGAGCTCGAACGTAGCGTCA 

CTTCGTTGCAGGCGGAAGTGTCAGTGTTATCGCCAAGAGTTGCATTCTTGGATCATCAAC 

GTTTGCTTCTTAACGTTGACAACAGCGCTCTCAAGCAACGAATCGCTGCT^ 

ACAAGCTTTTCAAAGACGCACATCAAGAAGCATTGAAGAGAGAAATAGAGAGACTTCGAC 

AAGTGTATAATCAACAAAGCCTCACGAATGTGGAAAATGCAAATCATTTATCGGCGACCG 

GAGCCGGTGCTACTCCGGCCGTCGACATCAAGTCGTCCGTTGAAACAGAGCAGCTCCTCA
ATGTCTCATAAATTAACCATCATGCATCATCATCAACATTTCTCTCTTTTAGCTTCTTGG
CAAAAGTTCTTGACTATAAAATCTCTTTCGGGTAAGAAATTCAGGAGATATACATTTTTT 
ATTCTAATCACATTGTTTTTAAGTTGTGATGAATTCAGTTTGATGTATCTTATTTATTTT 

AAAGAACTAGTTGAATTTTTTTTTTTTTTTT 

>G1051 Amino Acid Sequence (domain in AA coordinates 189-250) 
MAQLPPKIPNMTQHWPDFSSQKLSPFSTPTATAVATATTTVQNPSWVDEFLDFSASRRGN 

HRRSISDSIAFLEAPTVSIEDHQFDRFDDEQFMSMFTDDDNLHSNPSHINNKNNNVGPTG 
SSSNTSTPSNSFNDDNKELPPSDHNMNNNINNNYNDEVQSQCKMEPEDGTASNNNSGDSS

GNRILDPKRVKRILANRQSAQRSRVRKLQYISELERSVTSLQAEVSVLSPRVAFLDHQRL 

LLNVDNSALKQRIAALSQDKLFKDAHQEALKREIERLRQVYNQQSLTNVENANHLSATGA 
GATPAVDIKSSVETEQLLNVS*

>G1052 (138..1127)

TGATCATCTAAAACTTTCAATTTCTCTCTTGATCCTCACTTGAATTTTTTGTTGTTTCTC 
TCAAATCTTTGATCCTTTCCTTTGTTTTTCATTTGACCTCTTACAAAAAAATCTGGTGTG 
CCATTAAATCTTTATTAATGGCACAACTTCCTCCGAAAATCCCAACCATGACGACGCCAA 
ATTGGCCTGACTTCTCCTCCCAGAAACTCCCTTCCATAGCCGCAACGGCGGCAGCCGCAG 
CAACCGCTGGACCTCAACAACAAAACCCTTCATGGATGGATGAGTTTCTCGACTTCTCAG 
CGACTCGCCGTGGGACTCACCGTCGTTCTATAAGCGACTCCATTGCTTTCCTTGAACCAC 
CTTCCTCCGGCGTCGGAAACCACCACTTCGATAGGTTTGACGACGAGCAATTCATGTCCA 
TGTTCAACGACGACGTACACAACAATAACCACAATCATCATCATCATCACAGCATCAACG 
GCAATGTGGGTCCCACGCGTTCATCCTCCAACACCTCCACGCCGTCCGATCATAATAGCC 
TTAGCGACGACGACAACAACAAAGAAGCACCACCGTCCGATCATGATCATCACATGGACA 
ATAATGTAGCCAATCAAAACAACGCCGCCGGTAACAATTACAACGAATCAGACGAGGTCC 
AAAGCCAGTGCAAGACGGAGCCACAAGATGGTCCGTCGGCGAATCAAAACTCCGGTGGAA 
GCTCCGGTAATCGTATTCACGACCCTAAAAGGGTAAAAAGAATTTTAGCAAATAGGCAAT 
CAGCACAGAGATCAAGGGTGAGGAAATTGCAATACATATCAGAGCTTGAAAGGAGCGTTA 
CTTCATTGCAGACTGAAGTGTCAGTGTTATCGCCAAGAGTTGCGTTTTTGGATCATCAGC 
GATTGCTTCTCAACGTCGACAATAGTGCTATCAAGCAACGAATCGCAGCTTTAGCACAAG 
ATAAGATTTTCAAAGACGCTCATCAAGAAGCATTGAAGAGAGAAATAGAGAGACTTCGAC 
AAGTATATCATCAAGAAAGCCTCAAGAAGATGGAGAATAATGTCTCCGATCAATCTCCGG 
CCGATATCAAACCGTCCGTTGAGAAGGAACAGCTCCTCAATGTCTAAAGCTGTTCGTTCA 
CTAAGATCTTTCTTTTCATGGCGAAAAGATTCTTGACTATAAAACCTCTTTGTGTCAAGA 
AATTAATTTATCAAAGAAGATGGCCTTTTTTATTTGATCTAATCACATTTTTTTAAGTTG 
TGATGAATTTGCTTTTGATGTATCTGTTTTTTTTTTTTTTTTTT 

>G1052 Amino Acid Sequence (domain in AA coordinates 201-261) 
MAQLPPKIPTMTTPNWPDFSSQKLPSIAATAAAAATAGPQQQNPSWMDEFLDFSATRRGT 
HRRSISDSIAFLEPPSSGVGNHHFDRFDDEQFMSMFNDDVHNNNHNHHHHHSINGNVGPT
RSSSNTSTPSDHNSLSDDDNNKEAPPSDHDHHMDNNVANQNNAAGNNYNESDEVQSQCKT
EPQDGPSANQNSGGSSGNRIHDPKRVKRILANRQSAQRSRVRKLQYISELERSVTSLQTE
VSVLSPRVAFLDHQRLLLNVDNSAIKQRIAALAQDKIFKDAHQEALKREIERLRQVYHQQ
SLKKMENNVSDQSPADIKPSVEKEQLLNV* 
>G1079 (1..1995) 

ATGGGTTGTGCTGCTTCAAGAATTGATAATGAAGAAAAGGTTTTAGTGTGTAGGCAGAG 

AAGAGGCTAATGAAAAAGTTATTAGGGTTCAGGGGAGAATTTGCAGATGCACAGTTGGCT 

TATCTTAGAGCTTTGAGGAACACTGGTGTTACTCTTAGGCAATTCACTGAGTCTGAGACC 

TTGGAGCTTGAAAACACTAGTTATGGTTTAAGTTTGCCTTTGCCTCCTTCGCCTC 

ACATTGCCTCCTTCACCTCCACCACCTCCTCCATTTAGCCCGGATTTGAGAAATCCTGAG

ACTAGTCATGACTTGGCTGATGAGGAGGAAGAGGGTGAAAATGATGGTGGTAATGATGGA 

AGTGGTGCAGCTCCTCCGCCTCCATTGCCGAATTCTTGGAACATTTGGAACCCTTTTGAG

TC^CTTGAGCTGCATAGTCATCCAAATGGTGAGAATGTAGTTACACAAGTTC 

AAGAAACAACAAATTCAGCAAGCTGAAGAGGAAGATTGGGCGGAGACGAAGTCTCAATTT 
GAGGAAGAAGATGAGCAACAAGAAGCAGGAGGTACTTGCCTTGATTTGAGTGTTCATCAA

ATAGAGGCTGTTAGTGGCTGTAACATGAAGAAGCCACGTCGTCTGAAGTTTAAGCTGGGA 
GAAGTTATGGACGGTAACTCATCTATGA(^GCTGCTCCGGTAAAGATCTTGAG 
CATGTGACTGATTGTAGAATCAGGAGGACCTTAGAAGGAATCATCAGAGAGTTGGATGAT 
TATTTTCTTAAAGCATCGGGTTGCGAGAAGGAGATAGCTGTGATAGTAGACATCAACAGT
AGGGATACTGTTGATCCTTTCAGGTACCAGGAAACAAGAAGGAAGAGAAGCAGCTCGGCA
AAGGTATTCAGTGCATTGTCATGGAGTTGGTCTTCAAAGTCTCTTCAGTTGGGCAAAGAT 
GCTACAACAAGCGGGACTGTTGAACCCTGTAGGCCTGGAGCTCACTGCAGCACACTTGAG 
AAGCTATACACAGCTGAGAAGAAACTTTACCAGCTAGTCAGAAACAAAGAGATTGCCAAA 
GTGGAGCATGAGAGGAAGTCTGCATTACTGCAAAAGCAAGATGGGGAAACCTATGATTTG 
AGCAAAATGGAGAAAGCACGCTTGTCTTTGGAGAGTTTGGAAACCGAGATACAGCGTCTA 
GAAGATTCCATAACTACAACACGCTCATGTTTGCTTAACTTGATCAATGATGAGCTGTAT 
CCGCAGCTAGTTGCTTTAACTTCAGGGCTAGCACAGATGTGGAAAACAATGCTCAAGTGT 
CATCAAGTTCAAATTCATATATCCCAGCAACTGAACCATCTTCCGGATTACCCGAGTATA 
GATCTCAGTTCGGAATACAAACGCCAGGCGGTTAATGAACTAGAGACCGAGGTTACTTGC 
TGGTACAATAGCTTTTGCAAGTTAGTAAATTCCCAGCGAGAATACGTGAAAACACTCTGT 
ACGTGGATCCAACTTACTGATCGCCTCTCTAACGAAGACAACCAAAGAAGTAGCTTGCCT 
GTTGCTGCTCGTAAGCTCTGCAAAGAGTGGCAGCTTGAATACAACCTGCGTAGGAAATGC 
AATAAACTTGAGAGGAGGCTTGAGAAAGAGCTAATTTCACTGGCTGAGATTGAAAGAAGG 
CTCGAGGGGATTTTAGCAATGGAAGAGGAGGAAGTAAGCTCAACGAGTTTGGGCTCTAAG 
CATCCGTTGTCAATCAAACAAGCCAAGATCGAAGCCTTGAGAAAACGAGTGGATATTGAG 
AAAACTAAGTACTTAAACTCGGTCGAGGTTAGTAAGAGAATGACACTAGACAACCTCAAA 
TCAAGCCTTCCCAATGTCTTTCAGATGTTGACTGCTCTAGCTAATGTCTTTGCCAATGGG 
TTTGAATCCGTTAATGGCCAAACCGGTACAGATGTTTCCGACACATCCCAACATTCCGAT 
GAATCTCAACCCTAA 

>G1079 Amino Acid Sequence (conserved domain in AA coordinates: 1-50)

MGCAASRIDNEEKVLVCRQRKRLMKKLLGFRGEFADAQLAYLRALRNTGVTLRQFTESET

LELENTSYGLSLPLPPSPPPTLPPSPPPPPPFSPDLRNPETSHDLADEEEEGENDGGNDG

SGAAPPPPLPNSWNIWNPFESLELHSHPNGDNVVTQV^LKKKQQIQQAEEEDWAETKSQF

EEEDEQQEAGGTCLDLSVHQIEAVSGCNMKKPRRLKFKLGEVMDGNSSMTSCSGKDLEKT

HVTDCRIRRTLEGIIRELDDYFLKASGCEKEIAVIVDINSRDTVDPFRYQETRRKRSSSA 

KVFSALSWSWSSKSLQLGKDATTSGTVEPCRPGAHCSTLEKLYTAEKKLYQLVRNKEIAK

VEHERKSALLQKQDGETYDLSKMEKARLSLESLETEIQRLEDSITTTRSCLLNLINDELY 
PQLVALTSGLAQMWKTMLKCHQVQIHISQQLNHLPDYPSIDLSSEYKRQAVNELETEVTC
WYNSFCKLVNSQREYVKTLCTWIQLTDRLSNEDNQRSSLPVAARKLCKEWQLEYNLRRKC
NKLERRLEKELISLAEIERRLEGILAMEEEEVSSTSLGSKHPLSIKQAKIEALRKRVDIE
KTKYLNSVEVSKRMTLDNLKSSLPNVFQMLTALANVFANGFESVNGQTGTDVSDTSQHSD
ESQP* 

>G1335 (56..667)

TTTTTTTTTAAAAGATTTAGAGAGAAAAGTGAGTTATTAAGAGATTCCAATCAAAATGAG 
CGGAGACAACGGCGGTGGTGAGAGGCGCAAAGGCTCCGTCAAGTGGTTTGATACCCAGAA 
GGGTTTCGGCTTCATCACTCCTGACGACGGTGGCGACGATCTCTTCGTTCACCAGTCCTC 
CATCAGATCTGAGGGTTTCCGTAGCCTCGCTGCCGAAGAAGCCGTAGAGTTCGAGGTTGA 
GATCGACAACAACAACCGTCCCAAGGCCATCGATGTTTCTGGACCCGACGGCGCTCCCGT 
CCAAGGAAACAGCGGTGGTGGTTCATCTGGCGGACGCGGCGGTTTCGGTGGAGGAAGAGG 
AGGTGGACGCGGATCTGGAGGTGGATACGGCGGTGGCGGTGGTGGATACGGAGGAAGAGG 
AGGTGGTGGTCGAGGAGGCAGCGACTGCTACAAGTGTGGTGAGCCCGGTCACATGGCGAG 
AGACTGTTCTGAAGGCGGTGGAGGTTACGGAGGAGGCGGCGGTGGCTACGGAGGTGGAGG 
CGGATACGGCGGAGGAGGTGGTGGTTACGGAGGTGGTGGCCGTGGAGGTGGTGGCGGCGG 
GGGAAGCTGCTACAGCTGTGGCGAGTCGGGACATTTCGCCAGGGATTGCACCAGCGGTGG 
ACGTTAAAACCAACGCCGGTTACGCGGTGGAGAAGAGTGAGTTGGTTATCTCACAAGTGA 
TCGGTTCTTTCTCCCGCCGCCTTCTATCTCTCTATTATCCACTTTTTGCTTATTATGATG 
GATCTCTATCTTTOTTAGTTGGTTTTTTCTTGATGGTTTCGGATTAGGACTCTTCTTTTG 
GTTTTGCTACTTATGGTTGGTTTTATTTATGGTACTTGTGATATGGGTGAAATGCTCTAC 
TTGTTGCTCTGTTTCAAGTGTTCATAATATGCGAACAAATATTCTGGGTTTTGTTTCAAA 
AAAAA 

>G1335 Amino Acid Sequence (domain in AA coordinates: 24-43, 131-144, 185-203) 

MSGDNGGGERRKGSVKWFDTQKGFGFITPDDGGDDLFVHQSSIRSEGFRSLAAEEAVEFE 

VEIDNNNRPKAIDVSGPDGAPVQGNSGGGSSGGRGGFGGGRGGGRGSGGGYGGGGGGYGG 

RGGGGRGGSDCYKCGEPGHMARDCSEGGGGYGGGGGGYGGGGGYGGGGGGYGGGGRGGGG 

GGGSCYSCGESGHFARDCTSGGR* 

>G157 (31..621)
GGGCATAACCCTTATCGGAGATTTGTyVGCCATGGGAAGAAGAAAAATCGAGATCAAGCGA

ATCGAGAACAAAAGCAGTCGACAAGTCACTTTCTCCAAACGACGCAATGGTCTCATCGAC 

AAAGCTCGACAACTTTCGATTCTCTGTGAATCCTCCGTCGCTGTTGTCGTCGTATCTGCC 

TCCGGAAAACTCTATGACTCTTCCTCCGGTGACGACATTTCCAAGATCATTGATCGTTAT 

GAAATACAACATGCTGATGAACTTAGAGCCTTAGATCTTGAAGAAAAAATTCAGAATTAT 

CTTCCACACAAGGAGTTACTAGAAACAGTCCAAAGCAAGCTTGAAGAACCAAATGTCGAT 

AATGTAAGTGTAGATTCTCTAATTTCTCTGGAGGAACAACTTGAGACTGCTCTGTCCGTA 

AGTAGAGCTAGGAAGGCAGAACTGATGATGGAGTATATCGAGTCCCTTAAAGAAAAGGAG 

AAATTGCTGAGAGAAGAGAACCAGGTTCTGGCTAGCCAGATGGGAAAGAATACGTTGCTG 

GCAACAGATGATGAGAGAGGAATGTTTCCGGGAAGTAGCTCCGGCAACAAAATACCGGAG 

ACTCTCCCGCTGCTCAATTAGCCACCATCATCAACGGCTGAGTTTTCACCTTAAACTCAA 

AGCCTGATTCATAATTAAGAGAATAAATTTGTATATTATAAAAAGCTGTGTAATCTCAAA 

CCTTTTATCTTCCTCTAGTGTGGAATTTAAGGTCAAAAAGAAAACGAGAAAGTATGGATC 

AGTGTTGTACCTCCTTCGGAGACAAGATCAGAGTTTGTGTGTTTGTGTCTGAATGTACGG 

ATTGGATTTTTAAAGTTGTGCTTTCTTTCTTCAAAAAAAAAAA 

>G157 Amino Acid Sequence (domain in AA coordinates: 2-57) 

MGRRKIEIKRIENKSSRQVTFSKRRNGLIDKARQLSILCESSVAVVVVSASGKLYDSSSG

DDISKIIDRYEIQHADELRALDLEEKIQNYLPHKELLETVQSKLEEPNVDNVSVDSLISL 

EEQLETALSVSRARKAELMMEYIESLKEKEKLLREENQVLASQMGKNTLLATDDERGMFP 

GSSSGNKIPETLPLLN* 

>G1895 (1..954) 

ATGAATAACCAATCTGTTACTGACAATACAAGTCTTAAGCTGTCATCTAATCTTAACAAC 

GAGTCAAAAGAAACATCTGAGAACAGTGATGACCAACACAGCGAGATCACAACAATTACA 

TCGGAAGAAGAGAAAACAACTGAACTGAAGAAACCAGACAAGATTCTTCCATGTCCGAGA 

TGCAACAGCGCAGACACCAAATTCTGTTACTACAACAACTACAACGTTAACCAGCCACGT 

CACTTCTGTAGAAAATGCCAGAGGTATTGGACCGCTGGTGGATCCATGAGGATCGTCCCG 

GTTGGCTCAGGCCGTCGCAAGAACAAGGGATGGGTTTCTTCAGACCAGTACCTGCACATC 

ACTTCCGAGGATACTGACAATTACAATAGCTCCTCAACAAAGATTCTAAGCTTCGAGTCT 

TCGGACTCTTTGGTAACTGAGAGGCCTAAGCATCAATCAAACGAAGTGAAGATAAACGCT 

GAACCTGTTTCACAAGAACCCAACAACTTCCAAGGGTTACTTCCTCCCCAAGCATCCCCT 

GTTTCGCCTCCTTGGCCTTACCAATACCCTCCAAACCCTAGTTTCTACCACATGCCCGTC 

TACTGGGGCTGCGCGATACCGGTTTGGTCTACCCTCGACACTTCTACATGTCTTGGGAAA 

AGGACAAGAGACGAAACTTCTCATGAAACTGTTAAAGAGAGTAAAAATGCTTTTGAGAGA 

ACAAGCTTGCTTTTGGAATCTCAGAGCATCAAAAATGAAACAAGTATGGCTACAAATAAC 

CATGTGTGGTATCCAGTACCGATGACCCGCGAGAAGACACAAGAATTCAGCTTTTTCAGT 

AATGGAGCTGAAACAAAGAGCAGCAACAACAGATTCGTCCCTGAAACGTATCTTAACCTG 

CAAGCAAACCCTGCAGCCATGGCAAGATCTATGAACTTCAGAGAGAGCATATAA 

>G1895 Amino Acid Sequence (domain in AA coordinates: 55-110) 

MNNQSVTDNTSLKLSSNLNNESKETSENSDDQHSEITTITSEEEKTTELKKPDKILPCPR

CNSADTKFCYYNNYNVNQPRHFCRKCQRYWTAGGSMRIVPVGSGRRKNKGWVSSDQYLHI

TSEDTDNYNSSSTKILSFESSDSLVTERPKHQSNEVKINAEPVSQEPNNFQGLLPPQASP

VSPPWPYQYPPNPSFYHMPVYWGCAIPVWSTLDTSTCLGKRTRDETSHETVKESKNAFER

TSLLLESQSIKNETSMATNNHVWYPVPMTREKTQEFSFFSNGAETKSSNNRFVPETYLNL

QANPAAMARSMNFRESI*

>G1900 (1..897) 

ATGCTGGAAACTAAAGATCCTGCGATAAAGCTCTTTGGTATGAAAATTCCTTTCCCGACG 

GTTTTAGAGGTTGCTGATGAAGAAGAAGAAAAGAACGAAAACAAGACATTAACTGATCAA 

TCGGAGAAAGACAAAACCCTAAAGAAACCAACCAAGATTCTTCCATGTCCAAGATGCAAC 

AGCATGGAGACTAAGTTCTGTTACTACAACAACTACAACGTAAACCAACCTCGCCATTTT 

TGTAAAGCTTGTCAGAGATATTGGACCTCAGGTGGGACCATGAGAAGTGTTCCAATCGGA 

GCAGGACGGCGCAAGAACAAGAACAACTCACCAACTTCACATTACCACCATGTGACTATC 

TCCGAAAC^AATGGTCCGGTCCTTAGTTTCAGCCTCGGAGATGATCAAAAGGTCTCGAGT 

AATAGGTTTGGTAATCAAAAGCTAGTTGCTAGGATAGAGAACAATGACGAGCGCTCTAAT 

AACAACACTTCGAACGGTTTGAATTGTTTTCCGGGAGTTTCGTGGCCGTACA 

CCTGCGTTTTACCCGGTTTACCCTTAITGGAGCATGCCAGTGTTOTCTTCTCCGGTAAGT 

TCAAGTCCTACTTCTACTCTTGGTAAGCATTCGAGAGACGAAGACGAGACGGTGAAGCAA 

AAACAGAGGAATGGATCTGTATTGGTTCCAAAGACTTTGAGAATTGATGATCCTAATGAA
GCTGCAAAGAGTTCGATATGGACAACACTTGGGATCAAGAACGAAGTTATGTTCAATGGG
TTTGGTTCGAAGAAAGAGGTTAAGCTCAGTAACAAAGAAGAAACAGAGACCTCACTTGTT 
CTTTGTGCAAACCCTGCTGCGTTATCAAGATCAATCAATTTCCATGAGCAGATGTGA 
>G1900 Amino Acid Sequence (domain in AA coordinates: 54-106) 
MLETKDPAIKLFGMKIPFPTVLEVADEEEEKNQNKTLTDQSEKDKTLKKPTKILPCPRCN 
SMETKFCYYNNYNVNQPRHFCKACQRYWTSGGTMRSVPIGAGRRKNKNNSPTSHYHHVTI
SETNGPVLSFSLGDDQKVSSNRFGNQKLVARIENNDERSNNNTSNGLNCFPGVSWPYTWN
PAFYPVYPYWSMPVLSSPVSSSPTSTLGKHSRDEDETVKQKQRNGSVLVPKTLRIDDPNE 
AAKSSIWTTLGIKNEVMFNGFGSKKEVKLSNKEETETSLVLCANPAALSRSINFHEQM*
>G2007 (1..861) 

ATGGGAAGGCAGCCATGTTGTGACAAGCTCATGGTGAAGAAGGGGCCGTGGACGGCGGAG 
GAAGACAAGAAACTGATAAACTTTATCTTGACCAACGGCCACTGTTGCTGGAGGGCTTTG 
CCGAAGCTGGCCGGTCTCCGTCGCTGTGGGAAGAGCTGCCGTCTACGGTGGACCAATTAT 
CTCCGACCTGACTTGAAGAGAGGTCTTCTCTCCGACGCCGAGGAACAGCTTGTCATCGAC 
CTTCATGCTCTTCTCGGCAACAGATGGTCCAAGATCGCTGCAAGATTACCAGGAAGAACA 
GACAACGAAATAAAAAATCATTGGAATACTCATATCAAGAAGAAGCTCCTTAAGATGGAA 
ATCGATCCTTCGACCCATCAACCTTTAAACAAAGTATTTACCGATACAAACTTAGTCGAT 
AAATCTGAAACTTCATCGAAAGCCGACAATGTAAATGATAATAAAATCGTAGAGATCGAT 
GGGACAACGACAAATACAATAGATGATAGCATTATCACTCATCAAAATAGTTCAAATGAT 
GATTATGAATTACTTGGTGATATAATTCATAATTATGGAGATTTATTTAATATTCTATGG 
ACCAACGATGAACCTCCTCTAGTCGATGATGCATCATGGAGCAATCATAACGTTGGTATT 
GGAGGAACAGCTGCAGTTGCAGCCTCAGACAAGAACAACACTGCTGCCGAGGAAGATTTC 
CCGGAAAGATCATTTGAAAAACAGAACGGCGAAAGTTGGATGTTCTTGGATTATTGCCAA 
GAATTTGGTGTTGAAGATTTTGGGTTCGAGTGTTACCATGGTTTTGGTCAAAGCTCCATG 
AAGACGGGTCACAAGGACTAG 

>G2007 Amino Acid Sequence (domain in AA coordinates: TBD) 

MGRQPCCDKLMVKKGPWTAEEDKKLINFILTNGHCCWRALPKLAGLRRCGKSCRLRWTNY 

LRPDLKRGLLSDAEEQLVIDLHALLGNRWSKIAARLPGRTDNEIKNHWNTHIKKKLLKME

IDPSTHQPLNKVFTDTNLVDKSETSSKADNVNDNKIVEIDGTTTNTIDDSIITHQNSSND
DYELLGDIIHNYGDLFNILWTNDEPPLVDDASWSNHNVGIGGTAAVAASDKNNTAAEEDF

PERSFEKQNGESWMFLDYCQEFGVEDFGFECYHGFGQSSMKTGHKD* 
>G214 (238..2064)

TGAGATTTCTCCATTTCCGTAGCTTCTGGTCTCTTTTCTTTGTTTCATTGATCAAAAGCA 
AATCACTTCTTCTTCTTCTTCTTCTCGATTTCTTACTGTTTTCTTATCCAACGAAATCTG 
GAATTAAAAATGGAATCTTTATCGAATCCAAGCTGATTTTGTTTCTTTCATTGAATCATC 
TCTCTAAAGTGGAATTTTGTAAAGAGAAGATCTGAAGTTGTGTAGAGGAGCTTAGTGATG 
GAGACAAATTCGTCTGGAGAAGATCTGGTTATTAAGACTCGGAAGCGATATACGATAACA 
AAGCAACGTGAAAGGTGGACTGAGGAAGAACATAATAGATTCATTGAAGCTTTGAGGCTT
TATGGTAGAGCATGGCAGAAGATTGAAGAACATGTAGCAACAAAAACTGCTGTCCAGATA 
AGAAGTCACGCTCAGAAATTTTTCTCCAAGGTAGAGAAAGAGGCTGAAGCTAAAGGTGTA 
GCTATGGGTCAAGCGCTAGACATAGCTATTCCTCCTCCACGGCCTAAGCGTAAACCAAAC 
AATCCTTATCCTCGAAAGACGGGAAGTGGAACGATCCTTATGTCAAAAACGGGTGTGAAT 
GATGGAAAAGAGTCCCTTGGATCAGAAAAAGTGTCGCATCCTGAGATGGCCAATGAAGAT
CGACAACAATCAAAGCCTGAAGAGAAAACTCTGCAGGAAGACAACTGTTCAGATTGTTTC 
ACTCATCAGTATCTCTCTGCTGCATCCTCCATGAATAAAAGTTGTATAGAGACATCAAAC 
GCAAGCACTTTCCGCGAGTTCTTGCCTTCACGGGAAGAGGGAAGTCAGAATAACAGGGTA 
AGAAAGGAGTCAAACTCAGATTTGAATGCAAAATCTCTGGAAAACGGTAATGAGCAAGGA 
CCTCAGACTTATCCeATGCATATCCCTGTGCTAGTGCCATTGGGGAGCTCAATAACAAGT 
TCTCTATCACATCCTCCTTCAGAGCCAGATAGTCATCCCCACACAGTTGCAGGAGATTAT 
CAGTCGTTTCCTAATCATATAATGTCAACCCTTTTACAAACACCGGCTCTTTATACTGCC 
GCAACTTTCGCCTCATCATTTTGGCCTCCCGATTCTAGTGGTGGCTCACCTGTTCCAGGG 
AACTCACCTCCGAATCTGGCTGCCATGGCCGCAGCCACTGTTGCAGCTGCTAGTGCTTGG
TGGGCTGCCAATGGATTATTACCTTTATGTGCTCCTCTTAGTTCAGGTGGTTTCACTAGT 
CATCCTCCATCTACTTTTGGACCATCATGTGATGTAGAGTACACAAAAGCAAGCACTTTA 
CAACATGGTTCTGTGCAGAGCCGAGAGCAAGAACACTCCGAGGCATCAAAGGCTCGATCT 
TCACTGGACTCAGAGGATGTTGAAAATAAGAGTAAACCAGTTTGTCATGAGCAGCCTTCT 
GCAACACCTGAGAGTGATGCAAAGGGTTCAGATGGAGCAGGAGACAGAAAACAAGTTGAC
CGGTCCTCGTGTGGCTCAAACACTCCGTCGAGTAGTGATGATGTTGAGGCGGATGCATCA
GAAAGGCAAGAGGATGGCACCAATGGTGAGGTGAAAGAAACGAATGAAGACACTAATAAA 
CCTCAAACTTCAGAGTCCAATGCACGCCGCAGTAGAATCAGCTCCAATATAACCGATCCA 
TGGAAGTCTGTGTCTGACGAGGGTCGAATTGCCTTCCAAGCTCTCTTCTCCAGAGAGGTA 
TTGCCGCAAAGTTTTACATATCGAGAAGAACACAGAGAGGAAGAACAACAACAACAAGAA 
CAAAGATATCCAATGGCACTTGATCTTAACTTCACAGCTCAGTTAACACCAGTTGATGAT 
CAAGAGGAGAAGAGAAACACAGGATTTCTTGGAATCGGATTAGATGCTTCAAAGCTAATG
AGTAGAGGAAGAACAGGTTTTAAACCATACAAAAGATGTTCCATGGAAGCCAAAGAAAGT 
AGAATCCTCAACAACAATCCTATCATTCATGTGGAACAGAAAGATCCCAAACGGATGCGG 
TTGGAAACTCAAGCTTCCACATGAGACTCTATTTTCATCTGATCTGTTGTTTGTACTCTG 
TTTTTAAGTTTTCAAGACCACTGCTACATTTTCTTTTTCTTTTGAGGCCTTTGTATTTGT 
TTCCTTGTCCATAGTCTTCCTGTAACATTTGACTCTGTATTATTCAACAAATCATAAACT 
GTTTAATCTTTTTTTTTCCA 

>G214 Amino Acid Sequence (domain in AA coordinates: 22-71) 

METNSSGEDLVIKTRKRYTITKQRERWTEEEHNRFIEALRLYGRAWQKIEEHVATKTAVQ

IRSHAQKFFSKVEKEAEAKGVAMGQALDIAIPPPRPKRKPNNPYPRKTGSGTILMSKTGV 

NDGKESLGSEKVSHPEMANEDRQQSKPEEKTLQEDNCSDCFTHQYLSAASSMNKSCIETS 

NASTFREFLPSREEGSQNNRVRKESNSDLNAKSLENGNEQGPQTYPMHIPVLVPLGSSIT

SSLSHPPSEPDSHPHTVAGDYQSFPNHIMSTLLQTPALYTAATFASSFWPPDSSGGSPVP

GNSPPNLAAMAAATVAAASAWWAANGLLPLCAPLSSGGFTSHPPSTFGPSCDVEYTKAST

LQHGSVQSREQEHSEASKARSSLDSEDVENKSKPVCHEQPSATPESDAKGSDGAGDRKQV 

DRSSCGSNTPSSSDDVEADASERQEDGTNGEVKETNEDTNKPQTSESNARRSRISSNITD 

PWKSVSDEGRIAFQALFSREVLPQSFTYREEHREEEQQQQEQRYPMALDLNFTAQLTPVD 

DQEEKRNTGFLGIGLDASKLMSRGRTGFKPYKRCSMEAKESRILNNNPIIHVEQKDPKRM
RLETQAST* 

>G2155 (63..740)

CTCATATATACCAACCAAACCTCTCTCTGCATCTTTATTAACACAAAATTCCAAAAGATT 
AAATGTTGTCGAAGCTCCCTACACAGCGACACTTGCACCTCTCTCCCTCCTCTCCCTCCA 
TGGAAACCGTCGGGCGTCCACGTGGCAGACCTCGAGGTTCCAAAAACAAACCTAAAGCTC 
CAATCTTTGTCACCATTGACCCTCCTATGAGTCCTTACATCCTCGAAGTGCCATCCGGAA 
ACGATGTCGTTGAAGCCCTAAACCGTTTCTGCCGCGGTAAAGCCATCGGCTTTTGCGTCC 
TCAGTGGCTCAGGCTCCGTTGCTGATGTCACTTTGCGTCAGCCTTCTCCGGCAGCTCCTG 
GCTCAACCATTACTTTCCACGGAAAGTTCGATCTTCTCTCTGTCTCCGCCACTTTCCTCC 
CTCCTCTACCTCCTACCTCCTTGTCCCCTCCCGTCTCCAATTTCTTCACCGTCTCTCTCG 
CCGGACCTCAGGGGAAAGTCATCGGTGGATTCGTCGCTGGTCCTCTCGTTGCCGCCGGAA 
CTGTTTACTTCGTCGCCACTAGTTTCAAGAACCCTTCCTATCACCGGTTACCTGCTACGG 
AGGAAGAGCAAAGAAACTCGGCGGAAGGGGAAGAGGAGGGACAATCGCCGCCGGTCTCTG 
GAGGTGGTGGAGAGTCGATGTACGTGGGTGGCTCTGATGTCATTTGGGATCCCAACGCCA 
AAGCTCCATCGCCGTACTGACCACAAATCCATCTCGTTCAAACTAGGGTTTCTTCTTCTT 
TAGATCATCAAGAATCAACAAAAAGATTGCATTTTTAGATTCT^ 

ACTCACTCTTTAATCTCTCTATCACTTCTTCTTTAGCTTTTTCTGCAGTGTCAAACTTCA 
CATATTTGTAGTTTGATTTGACTATCCCCAAGTTTTGTATTOT 

CTGTCTCTAATGGTTGTTTTTTCGTTTGTATAATCTTATGCATTGTTTATTGGAGCTCCA 
GAGATTGAATGTATAATATAATGGTTTAAT 

>G2155 Amino Acid Sequence (domain in AA coordinates: 18-38)

MLSKLPTQRHLHLSPSSPSMETVGRPRGRPRGSKNKPKAPIFVTIDPPMSPYILEVPSGN 

DVVEALNRFCRGKAIGFCVLSGSGSVADVTLRQPSPAAPGSTITFHGKFDLLSVSATFLP

PLPPTSLSPPVSNFFTVSLAGPQGKVIGGFVAGPLVAAGTVYFVATSFKNPSYHRLPATE

EEQRNSAEGEEEGQSPPVSGGGGESMYVGGSDVIWDPNAKAPSPY* 

>G234 (106..1035)

CACAACATCATACCCACCAACATATATAATGTTGATCATAGAGAGATAAACAGAGGCCGC 

TATCAAGAACAAGACTAAGAACAAGACTTCACTAGGAGTACAAGTATGGGAAGAGCACCG 

TGTTGTGACAAAGCAAACGTGAAGAAAGGGCCCTGGTCTCCTGAGGAAGA 

AAATCTTACATTGAAAATAGTGG(^CCGGAGGCAATTGGATCGCTTTGCCTCAAJ^ 

GGTTTAAAGAGATGTGGAAAGAGTTGCAGGCTGAGGTGGCTTAACTATCTTAGACCAAAC 

ATCAAACATGGTGGCTTCTCTGAGGAAGAAGAAAACATCATTTGTAGCCTTTACCTTACA 

ATTGGTAGCAGGTGGTCTATAATCGCTGCTCAATTGCCGGGACGAACAGACAACGATATA 
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AAAAACTATTGGAACACGAGGCTCAAGAAGAAACTCATTAACAAACAACGCAAGGAGCTT 

CAAGAAGCTTGTATGGAGCAGCAAGAGATGATGGTGATGATGAAGAGACAACACCAACAA 

CAACAAATCCAAACTTCTTTTATGATGAGACAAGACCAAACAATGTTCACATGGCCACTA 

CATCATCATAATGTTCAAGTTCCAGCTCTTTTCAGAATCAAACCAACTCGTTTTGCGACC 

AAGAAGATGTTAAGCCAGTGCTCATCAAGAACATGGTCAAGATCGAAGATCAAGAACTGG 

AGAAAACAAACCTCATCATCATCAAGATTCAATGACAACGCTTTTGATCATCTCTCTTTC 

TCTCAACTCTTGTTAGATCCTAATCATAACCACTTAGGATCAGGAGAGGGTTTCTCCATG 

AACTCTATCTTGAGCGCCAACACAAACTCTCCATTGCTTAACACAAGTAATGATAATCAG 

TGGTTCGGGAATTTCCAGGCCGAAACCGTAAACTTGTTCTCAGGAGCCTCCACAAGTACT 

TCGGCAGATCAAAGCACTATAAGTTGGGAAGACATAAGCTCTCTTGTTTATTCTGATTCA 

AAGCAATTTTTTTAATTATAATAATATATTATTCTTAAGATGAAACGTACATCATTATTA 

TTAATTGGGGGTACGTAACGTATATATGGAATAACGATCTAGTTTGTTTAAATTTAAAA 

>G234 Amino Acid Sequence (domain in AA coordinates: 14-115) 

MGRAPCCDKANVKKGPWSPEEDAKLKSYIENSGTGGNWIALPQKIGLKRCGKSCRLRWLN

YLRPNIKHGGFSEEEENIICSLYLTIGSRWSIIAAQLPGRTDNDIKNYWNTRLKKKLINK 

QRKELQEACMEQQEMMVMMKRQHQQQQIQTSFMMRQDQTMFTWPLHHHNVQVPALFRIKP 

TRFATKKMLSQCSSRTWSRSKIKNWRKQTSSSSRFNDNAFDHLSFSQLLLDPNHNHLGSG 

EGFSMNSILSANTNSPLLNTSNDNQWFGNFQAETVNLFSGASTSTSADQSTISWEDISSL 

VYSDSKQFF* 

>G361 (54..647)

TCTGTCTCTCTCTCTCTCTTTGTAAATATACATATATAGATAAGCTCACATATATGGCGA 
CTGAAACATCTTCTTTGAAGCTCTTCGGTATAAACCTACTTGAAACGACGTCGGTTCAAA 
ACCAGTCATCGGAACCAAGACCCGGATCCGGATCAGGATCCGAGTCACGTAAGTACGAGT 
GTCAATACTGTTGTAGAGAGTTTGCTAACTCTCAAGCTCTTGGTGGTCACCAAAACGCTC 
ACAAGAAAGAGCGTCAGCTTCTTAAACGTGCACAGATGTTAGCTACTCGTGGTTTGCCAC 
GTCATCATAATTTTCACCCTCATACCAATCCGCTTCTCTCCGCCTTCGCGCCGCTGCCTC 
ACCTCCTCTCTCAGCCGCATCCTCCGCCGCATATGATGCTCTCTCCTTCTTCTTCGAGTT 
CTAAGTGGCTTTACGGTGAACACATGTCGTCACAAAACGCCGTTGGGTACTTTCATGGTG 
GAAGGGGACTTTACGGAGGTGGCATGGAGTCTATGGCCGGAGAAGTAAAGACTCATGGTG 
GTTCTTTGCCGGAGATGAGGAGGTTCGCCGGAGATAGTGATCGGAGTAGCGGAATTAAGT 
TAGAGAATGGTATTGGGCTGGACCTCCATTTAAGCCTTGGGCCATGAATGATTATAATTT 
TGGCC(^GTAAAGATCTGTAAAATACTACTAGGATTTCAT 

TCCTTAATTTCGGTTGAAATTGGTGAATATTTTTATCTCTTACTTACCAAATCTCATATT 
TCTATGTATGCGTTTGCTTTCACTTTTTTTTTTTATATAATTCTTCTTGTAAAAAATGCA 
ATGTGAGTTTTCTTCCCTATCATTCTGTCAAGCTTTGGTTCAATTATTTAGTAATCGAAT 
AATATAGGAATAGTGTTGAAAG 

>G361 Amino Acid Sequence (domain in AA coordinates: 43-63) 

MATETSSLKLFGINLLETTSVQNQSSEPRPGSGSGSESRKYECQYCCREFANSQALGGHQ 

NAHKKERQLLKRAQMLATRGLPRHHNFHPHTNPLLSAFAPLPHLLSQPHPPPHMMLSPSS

SSSKWLYGEHMSSQNAVGYFHGGRGLYGGGMESMAGEVKTHGGSLPEMRRFAGDSDRSSG 

IKLENGIGLDLHLSLGP* 

>G562 (137..1285)

ATTTGAATTTCTGGGTTTCTCTCTGTTTAAGCTTCTTCTTCTTCATCTTCTGCTTACGTT 
TCTTCTTCAAGGAGCTTTCGGATTCTTGTAGAT^AGAGTCATTGTTCTCTTGAGTGGGAAA 
CCTTGA7^CCATTCCTATGGGAAATAGCAGCGAGGAACCAAAGCCTCCTACCAAATCAGA 
TAAACCATCTTCACCCCCGGTGGATCAAACAAATGTTCATGTCTACCCTGATTGGGCAGC 
TATGCAGGCATATTATGGTCCAAGAGTAGCAATGCCTCCTTATTACAATTCAGCTATGGC 
TGCATCTGGTCATCCTCCTCCTCCTTACATGTGGAATCCTCAGCATATGATGTCACCATC 
TGGAGCACCCTATGCTGCTGTTTATCCTCATGGAGGAGGAGTTTACGCTCATCCCGGTAT 
TCCCATGGGATCACTGCCTCAAGGTCAAAAGGATCCACCTTTAACAACTCCGGGGACGCT 
TTTGAGCATCGACACTCCTACTAAATCTACAGGGAACACAGACAATGGATTGATGAAGAA 
GCTGAAAGAGTTTGATGGGCTTGCTATGTCTCTAGGAAATGGGAATCCTGAAAATGGTGC 
AGATGAACATAAACGATCACGGAACAGCTCAGAAACTGATGGTTCTACTGATGGAAGTGA 
TGGGAATACAACTGGGGCAGATGAACCGAAACTTAAAAGAAGTCGAGAGGGAACTCCAAC 
AAAAGATGGGAAACAATTGGTTCAAGCTAGCTCATTTCATTCTGTTTCTCCGTC^GTGG
TGATACCGGCGTAAAACTCATTCAAGGATCTGGAGCTATACTCTCTCCTGGTGTAAGTGC 
AAATTCCAACCCCTTCATGTCACAATCTTTAGCCATGGTTCCTCCTGAAACTTGGCTTCA
GAACGAGAGAGAACTGAAACGGGAGCGAAGGAAACAGTCTAATAGAGAATCTGCTAGAAG

GTCAAGATTAAGGAAACAGGCCGAGACAGAAGAACTTGCTAGGAAAGTGGAAGCCTTGAC

AGCCGAAAACATGGCATTAAGATCTGAACTAAACCAACTTAATGAGAAATCTGATAAACT 

AAGAGGAGCAAATGCAACCTTGTTGGACAAACTGAAATGCTCGGAACCCGAAAAGAGAGT 

CCCCGCAAATATGTTGTCTAGAGTTAAGAACTCAGGAGCTGGAGATAAGAACAAGAACCA 

AGGAGACAATGATTCTAACTCTACAAGCAAATTCCATCAACTGCTCGATACGAAGCCTCG 

AGCTAAAGCAGTAGCTGCAGGCTGAATCGATGGTAATTCATGTCGATTTCTACTTAATTT

GTCGACATAAACAAAGAAAATAAGTGCTACTAATTTCAGAAAAACTTGATAGATAGATAG 

TATAGTAGAGAGAGAGAGAGAGAGAGAGGTGTGATGATTATTGATCTATAAATTTTCGGA 

GAGAGAGAGGGAGAAAGAGAAACTTTTCCTCCAGATGAAAATTTGGTGTTATGGTTTGTT 

ACTGTTAATATAGAGAGGCTTTTCTTTTTTTATAAAATGGCTTCCTTTGTTGCA 

>G562 Amino Acid Sequence (domain in AA coordinates: 253-315) 

MGNSSEEPKPPTKSDKPSSPPVDQTNVHVYPDWAAMQAYYGPRVAMPPYYNSAMAASGHP

PPPYMWNPQHMMSPSGAPYAAVYPHGGGVYAHPGIPMGSLPQGQKDPPLTTPGTLLSIDT 

PTKSTGNTDNGLMKKLKEFDGLAMSLGNGNPENGADEHKRSRNSSETDGSTDGSDGNTTG 

ADEPKLKRSREGTPTKDGKQLVQASSFHSVSPSSGDTGVKLIQGSGAILSPGVSANSNPF 

MSQSLAMVPPETWLQNERELKRERRKQSNRESARRSRLRKQAETEELARKVEALTAENMA

LRSELNQLNEKSDKLRGANATLLDKLKCSEPEKRVPANMLSRVKNSGAGDKNKNQGDNDS

NSTSKFHQLLDTKPRAKAVAAG* 

>G591 (88..1020)

GTAAATCTCTCTTTGAAGGTTCCTAACTCGTTAATCGTAACTCACAGTGACTCGTTCGAG 
TCAAAGTCTCTGTCTTTAGCTCAAACCATGGCTAGTAACAACCCTCACGACAACCTTTCT 
GACCAAACTCCTTCTGATGATTTCTTCGAGCAAATCCTCGGCCTTCCTAACTTCTCAGCC 
TCTTCTGCCGCCGGTTTATCTGGAGTTGACGGAGGATTAGGTGGTGGAGCACCGCCTATG 
ATGCTGCAGTTGGGTTCCGGAGAAGAAGGAAGTCACATGGGTGGCTTAGGAGGAAGTGGA 
CCAACTGGGTTTCACAATCAGATGTTTCCTTTGGGGTTAAGTCTTGATCAAGGGAAAGGA 
CCTGGGTTTCTTAGACCTGAAGGAGGACATGGAAGTGGGAAAAGATTCTCAGATGATGTT 
GTTGATAATCGATGTTCTTCTATGAAACCTGTTTTCCACGGGCAGCCTATGCAACAGCCA 
CCTCCATCGGCCCCACATCAGCCTACTTCAATCCGTCCCAGGGTTCGAGCTAGGCGTGGT 
CAGGCTACTGATCCACATAGCATCGCTGAGCGGCTACGTAGAGAAAGAATAGCAGAACGG 
ATCAGGGCGCTGCAGGAACTTGTACCTACTGTGAACAAGACCGATAGAGCTGCTATGATC 
GATGAGATTGTCGATTATGTAAAGTTTCTCAGGCTCCAAGTCAAGGTTTTGAGCATGAAC 
CGACTTGGTGGAGCCGGTGCGGTTGCTCCACTTGTTACTGATATGCCTCTTTCATCATCA 
GTTGAGGATGAAACGGGTGAGGGTGGAAGGACTCCGCAACCAGCGTGGGAGAAATGGTCT 
AACGATGGGACTGAACGTCAAGTGGCTAAACTGATGGAAGAGAACGTTGGAGCCGCGATG 
CAGCTTCTTCAATCAAAGGCTCTTTGTATGATGCCAATCTCATTGGCAATGGCAATTTAC 
CATTCTCAACCTCCGGATACATCTTCAGTGGTCAAGCCTGAGAACAATCCTCCACAGTAG 
GATTTCTGCAATAAAGAGTTTGTACAGCTAATCCAACTGTCCAACATGGGTTTTTCTTCT 
GCTCTAATGACTCTGGTTTCTTCTCTCCTCTCTCACCGACTTGAAAGGTAAAAAAGTGAA 
AAAGGCTTTGTAGATGGAATCAATGTAGGATTTGCAGTAGAGGGCAAAAAAATGTCATAT 
AGCTCAATTGATCAAGTCTTAAAAAAAAAAAAAAAAAAAA 

>G591 Amino Acid Sequence (domain in AA coordinates: 143-240) 

MASNNPHDNLSDQTPSDDFFEQILGLPNFSASSAAGLSGVDGGLGGGAPPMMLQLGSGEE 

GSHMGGLGGSGPTGFHNQMFPLGLSLDQGKGPGFLRPEGGHGSGKRFSDDVVDNRCSSMK

PVFHGQPMQQPPPSAPHQPTSIRPRVRARRGQATDPHSIAERLRRERIAERIRALQELVP 

TVNKTDRAAMIDEIVDYVKFLRLQVKVLSMNRLGGAGAVAPLVTDMPLSSSVEDETGEGG

RTPQPAWEKWSNDGTERQVAKLMEENVGAAMQLLQSKALCMMPISLAMAIYHSQPPDTSS
VVKPENNPPQ*

>G8 (247..1596)

AAAAAAAAATATCCGTCTCACTCTCTCGCCGCCGGTAA(^TTTCCCGGCGACAAAACTTC 

TCTACTCT(^CC71TTCCTCCATCGTAATCTCTAJ^TTCTTCTCCATTCTCTTCTTCCTCC 

CGATCATCTCGAGCTCTTCGTGAGAGATTATGTGATTATGTAATCGTTGTTGCTGTAGAA 

GACGATCTCTAACAACTGATTCCTTCATCATCACCTTCGCTAGATTTGTAATTTO 

CTTGAGATGTTGGATCTTAACCTCAACGCTGATTCTCCCGAGTCGACTCAGTACGGTGGT 

GACTCATACTTAGATCGGCAGACATCAGACAACTCCGCCGGGAATCGAGTGGAAGAGTCC 

GGTACATCGACGTCGTCAGTTATCAATGCCGATGGAGACGAAGACTCTTGCTCTACTCGA 

GCTTTCACTCTCAGTTTCGATATTTTAAAAGTCGGAAGTAGTAGCGGCGGAGACGAAAGC
CCCGCCGCTTCAGCTTCCGTTACTAAAGAGTTTTTTCCGGTGAGTGGAGACTGTGGACAT
CTACGAGATGTTGAAGGATCATCAAGCTCTAGAAACTGGATAGATCTTTCTTTTGACCGT 
ATTGGTGACGGAGAAACGA/yVTTGGTAACTCCGGTTCCGACTCCGGCTCCGGTTCCGGCT 
CAGGTTAAAAAGAGTCGGAGAGGACCAAGGTCTAGAAGTTCACAGTATAGAGGAGTTACT 
TTTTATAGAAGAACTGGTCGATGGGAGTCACATATTTGGGATTGTGGGAAACAAGTTTAT 
TTAGGTGGTTTCGACACTGCTCATGCTGCAGCTAGAGCTTATGATCGAGCTGCTATTAAA 
TTTAGAGGTGTTGATGCTGATATCAACTTTACTCTTGGTGATTATGAGGAAGATATGAAA 
CAGGTACAAAACTTGAGTAAGGAAGAGTTTGTGCATATACTGCGTAGACAGAGCACGGGG 
TTTTCGCGGGGGAGTTCGAAGTATCGAGGGGTTACGTTACACAAATGTGGTAGATGGGAA 
GCTAGGATGGGGCAGTTTCTTGGTAAAAAGGCTTATGACAAGGCTGCAATCAACACTAAT 
GGTAGAGAAGCAGTCACGAACTTCGAGATGAGTTCATACCAAAATGAGATTAACTCTGAG 
AGCAATAACTCTGAGATTGACCTCAACTTGGGAATCTCTTTATCGACCGGTAATGCGCCA 
AAGCAAAATGGGAGGCTCTTTCACTTCCCTTCTAATACTTATGAAACTCAGCGTGGAGTT 
AGCTTGAGGATAGATAACGAATACATGGGAAAGCCGGTGAATACACCTCTTCCTTATGGA 
TCCTCGGATCATCGCCTTTACTGGAACGGAGCATGCCCGAGTTATAATAATCCCGCCGAG 
GGAAGAGCAACAGAAAAGAGAAGTGAAGCTGAAGGGATGATGAGTAACTGGGGATGGCAG 
AGACCGGGGCAAACAAGCGCCGTGAGACCGCAGCCACCGGGACCACAACCACCACCATTG 
TTCTCAGTTGCAGCAGCATCATCAGGATTCTCACATTTCCGGCCACAACCTCCCAATGAC 
AATGCAACACGTGGTTACTTTTATCCACACCCTTAACTTGTAAGGGGACATATGAGAGTT 
TTTTTACCATCTCTCTCTCTCTCAACACTCTAGTCCCCTTTCAAAAATGTCATTTGGGTT 
TTAGATTTTTCACATACAATGATCAATTTTTCC 

>G8 Amino Acid Sequence (domain in AA coordinates: 151-217, 243-296) 

MLDLNLNADSPESTQYGGDSYLDRQTSDNSAGNRVEESGTSTSSVINADGDEDSCSTRAF 

TLSFDILKVGSSSGGDESPAASASVTKEFFPVSGDCGHLRDVEGSSSSRNWIDLSFDRIG

DGETKLVTPVPTPAPVPAQVKKSRRGPRSRSSQYRGVTFYRRTGRWESHIWDCGKQVYLG 

GFDTAHAAARAYDRAAIKFRGVDADINFTLGDYEEDMKQVQNLSKEEFVHILRRQSTGFS

RGSSKYRGVTLHKCGRWEARMGQFLGKKAYDKAAINTNGREAVTNFEMSSYQNEINSESN 

NSEIDLNLGISLSTGNAPKQNGRLFHFPSNTYETQRGVSLRIDNEYMGKPVNTPLPYGSS 

DHRLYWNGACPSYNNPAEGRATEKRSEAEGMMSNWGWQRPGQTSAVRPQPPGPQPPPLFS

VAAASSGFSHFRPQPPNDNATRGYFYPHP* 

>G859 (162..752)

GATTTGTCATTTTTTGTCTAGCCAAAAAAAAAAAAAAAAAAGGAGAGAGAGAGAGAGAGA

GAGAGAGAGAGAAACGAAGAAAAA7VAAAGAAGCAAAAAACATTGTGGGTCTCCGGTGATT 

AGGATCAAATTAGGGCACCAGCCTTATCGGAGGAAGAAGCCATGGGTAGAAAAAAAGTCG 

AGATCAAGCGAATCGAGAACAAAAGTAGTCGACAAGTCACTTTCTCCAAACGACGCAATG 

GTCTCATCGAGAAAGCTCGACAACTTTCAATTCTCTGTGAATCTTCCATCGCTGTTCTCG 

TCGTCTCCGGCTCCGGAAAACTCTACAAGTCTGCCTCCGGTGACAACATGTCAAAGATCA 

TTGATCGTTACGAAATACATCATGCTGATGAACTTGAAGCCTTAGATCTTGCAGAAAAAA 

CTCGGAATTATCTGCC^CTCAAAGAGTTACTAGAAATAGTCCAAAGCAAGCTTGAAGAAT 

CAAATGTCGATAATGCAAGTGTGGATACTTTAATTTCTCTGGAGGAACAGCTCGAGACTG 

CTCTGTCCGTAACTAGAGCTAGGAAGACAGAACTAATGATGGGGGAAGTGAAGTCCCTTC 

AAAAAACGGAGAACTTGCTGAGAGAAGAGAACCAGACTTTGGCTAGCCAGGTGGGGAAGA 

AGACGTTTCTGGTTATAGAAGGTGACAGAGGAATGTCATGGGAAAATGGCTCCGGCAACA 

AAGTACGGGAGACTCTTCCGCTGCTCAAGTAATCACCATCATCAACGGCTGAGCTTTCAC 

CTTAAACTTACAGCCTGATTCAGAAGTTTTTACAAATTTGTAAATTATAAAAAGCTTCAT 

AATAATCTCAACCTTTTTATCTTCCTCGCGCCAATGTGGAAATTAAGGTTAAAAATAAAA 

TAAAACAGAAGCTCATGCGAAAGAATTGTAAAACTAAGATAAAGCTATAGTAGATCTTTA 

TTGTACCTTCGTAGACGATATAAGATTTATTCGTGTGTTTGTCTTCCCCTCNAAAAAAAA 

AAAAAAAAAAAAAAAA 

>G859 Amino Acid Sequence (domain in AA coordinates: TBD) 

MGRKKVEIKRIENKSSRQVTFSKRRNGLIEKARQLSILCESSIAVLVVSGSGKLYKSASG 

DNMSKIIDRYEIHHADELEALDLAEKTRNYLPLKELLEIVQSKLEESNVDNASVDTLISL

EEQLETALSVTRARKTELMMGEVKSLQKTENLLREENQTLASQVGKKTFLVIEGDRGMSW

ENGSGNKVRETLPLLK* 

>G878 (197..1738)

CAAAAAAAATCTCTCCCATTAAAAGACTGCCCAAAGAAATATTTTATACAAAATGAAA 
GAGAAACACGACACGAATTTTGTATAATTAAGATTACACAAAAAAAAGTGTTAGAAAGAG
AAATATCTTCTTCTTTTTTCTGTGTGAGTTGGGTTTGTTA?^AGTTTTATCCTTTTTGTTC

TCAAAATCAAGAATCGATGGCGGAGAAGGAAGAAAAAGAACCATCGAAGTTAAAATCATC 

CACCGGAGTTTCACGGCCAACGATTTCACTACCTCCTCGACCGTTTGGTGAAATGTTTTT 

TAGCGGTGGCGTTGGATTTAGTCCTGGACCAATGACTCTCGTCTCAAATTTATTCTCTGA 

TCCTGATGAGTTCAAGTCTTTCTCTCAGCTTTTAGCTGGAGCTATGGCTTCTCCGGCGGC 

AGCTGCTGTTGCCGCCGCTGCTGTGGTTGCTACTGCTCATCATCAGACACCTGTGAGCTC 

TGTCGGTGATGGCGGTGGAAGCGGTGGTGATGTTGACCCGAGGTTTAAGCAGAGTAGACC 

AACGGGATTGATGATAACTCAACCACCGGGGATGTTTACTGTACCGCCGGGGTTAAGTCC 

GGCTACTCTTTTGGATTCTCCGAGCTTCTTTGGTCTTTTTTCACCTCTTCAGGGAACATT 

TGGTATGACACATCAACAAGCTTTAGCACAAGTCACTGCACAAGCAGTTCAAGGCAATAA 

TGTTCATATGCAGCAATCACAACAATCTGAATATCCTTCTTCTACACAACAACAACAACA 

ACAACAACAACAAGCTTCATTGACTGAGATTCCATCATTTTCTTCTGCACCTAGGTCTCA 

GATTCGAGCCTCGGTTCAAGAAACATCGCAGGGTCAGAGAGAGACTTCGGAAATATCTGT 

CTTTGAGCATCGGTCACAGCCTCAAAATGCTGACAAACCAGCTGATGATGGATACAACTG 

GCGGAAATATGGGCAGAAGCAAGTGAAGGGGAGCGATTTTCCTCGGAGTTATTACAAATG 

TACGCATCCAGCTTGTCCTGTCAAGAAGAAAGTGGAGAGGTCACTCGATGGACAAGTTAC

GGAAATCATCTACAAGGGTCAACACAATCATGAGCTTCCTCAAAAGCGCGGTAACAATAA 

CGGGAGTTGTAAAAGTTCTGATATTGCAAATCAGTTTCAAACAAGTAATAGCAGTCTCAA 

CAAGAGTAAGAGGGACCAGGAAACAAGCCAAGTTACAACAACAGAGCAGATGTCTGAAGC 

AAGTGATAGCGAGGAGGTTGGGAATGCAGAGACTAGTGTGGGAGAAAGACATGAGGATGA 

GCCTGATCCCAAGCGAAGAAATACAGAAGTTCGGGTTTCAGAACCAGTTGCTTCATCGCA 

TAGAACTGTGACAGAGCCTAGGATTATTGTCCAAACGACGAGTGAAGTTGACCTCTTAGA 

TGATGGATATAGGTGGCGCAAGTATGGTCAGAAAGTAGTCAAAGGAAATCCTTATCCGAG 

GAGCTACTATAAGTGTACAACACCAGATTGCGGAGTAAGGAAACATGTAGAGAGAGCAGC 

AACTGACCCAAAAGCTGTTGTAACAACATATGAAGGTAAACATAACCATGATGTTCCAGC 

TGCTAGAACCAGCAGCCATCAGTTAAGACCAAACAATCAACACAACACCTCAACGGTTAA 

CTTCAATCATCAACAGCCTGTTGCACGTTTAAGGCTTAAAGAAGAGCAAATCACTTGACA 

GAGAAGAAGAATACGACGGCGCTTGAGCTTTTGTGAGTTTAATGAATCTTCTTTTTGGTT 

AATGAACCTGTTTTTGTTGCCTCAAAACACCACAGGTTTCTCTGGACAGAATCTCTGATA 

TTACAGTTTCAAAAGGTATGTTCTTTTATTTCATGTTGGAATCTTCTGTGTAATCTTAAG 

AAGCTTTAGGAGGTAATGTAAAAAACCAGATTCAAAGTTATGCCCTTATGTGAATTCTTT 

TGTACATGGGATAAACAAAATTTACAGGTATCCTTTTTGTTCTTGTTGTAAAAAAAAAAA 
AAAA 

>G878 Amino Acid Sequence (domain in AA coordinates: 250-305, 415-475)

MAEKEEKEPSKLKSSTGVSRPTISLPPRPFGEMFFSGGVGFSPGPMTLVSNLFSDPDEFK 

SFSQLLAGAMASPAAAAVAAAAVVATAHHQTPVSSVGDGGGSGGDVDPRFKQSRPTGLMI

TQPPGMFTVPPGLSPATLLDSPSFFGLFSPLQGTFGMTHQQALAQVTAQAVQGNNVHMQQ

SQQSEYPSSTQQQQQQQQQASLTEIPSFSSAPRSQIRASVQETSQGQRETSEISVFEHRS 

QPQNADKPADDGYNWRKYGQKQVKGSDFPRSYYKCTHPACPVKKKVERSLDGQVTEIIYK

GQHNHELPQKRGNNNGSCKSSDIANQFQTSNSSLNKSKRDQETSQVTTTEQMSEASDSEE

VGNAETSVGERHEDEPDPKRRNTEVRVSEPVASSHRTVTEPRIIVQTTSEVDLLDDGYRW 

RKYGQKVVKGNPYPRSYYKCTTPDCGVRKHVERAATDPKAVVTTYEGKHNHDVPAARTSS

HQLRPNNQHNTSTVNFNHQQPVARLRLKEEQIT* 

>G971 (131..1171)

TTTTTTTTCTTCCCTCTTTTAGAACTCTCTCTCTCTCTCGTTTTTGACACTTATCCTCTC 
TCTTTTTTCTCTCTCCCTCTCTCTCTGGCCGGAAAAAAGAACAACGTCGTTTATAGCTAA 
AGATTCGATCATGTTGGATCTTAACCTAAAGATCTTTTCTTCTTATAACGAAGATCAAGA 
TCGGAAAGTACCATTAATGATCTCAACCACCGGTGAAGAAGAATCTAACTCATCTTCCTC 
CTCCACAACAGACTCTGCAGCGAGAGATGCTTTCATCGCTTTTGGAATTCTCAAACGCGA 
CGATGACCTTGTTCCTCCTCCTCCTCCTCCTCCTCATAAAGAAACAGGAGATCTCTTTCC 
GGTGGTGGCTGATGCTCGTCGGAATATAGAATTCTCCGTGGAAGACAGTCACTGGTTGAA 
TCTTTCTTCTTTACAAAGAAATACACAGAAAATGGTGAAGAAGAGCAGAAGAGGACCAAG 
GTCTCGTAGCTCCCAATATCGTGGCGTCACTTTTTACCGTCGCACCGGTCGTTGGGAATC 
TCATATTTGGGATTGTGGAAAGCAAGTTTATTTGGGCGGGTTTGATACTGCTTACGCAGC 
AGCAAGGGCTTACGACCGAGCTGCTATCAAATTCCGTGGTCTCGATGCAGACATCAATTT 
CGTCGTGGATGATTATAGGCATGACATCGATAAGATGAAGAATTTAAATAAGGTGGAGTT 
CGTGCAAACACTTAGGCGAGAGAGTGCGAGTTTCGGAAGAGGAAGTTCCAAATACAAAGG 
CTTGGCTCTTCAAAAATGCACCCAATTCAAAACTCATGATCAGATTCATCTCTTCCAAAA
CAGGGGATGGGATGCAGCAGCAATAAAATACAATGAGTTGGGAAAGGGAGAAGGAGCCAT 
GAAGTTTGGTGCCCATATCAAAGGAAATGGTCACAATGATCTTGAACTAAGTCTCGGAAT 
TTCATCATCATCGGAAAGTATAAAGTTGACAACAGGCGATTACTATAAGGGTATCAATCG 
GTCCACGATGGGTTTATACGGTAAGCAATCATCGATATTTTTACCCATGGCAACCATGAA 
ACCTCTGAAGACAGTTGCAGCATCATCAGGATTCCCTTTTATCAGCATGACAAGTTCCTC 
TTCCTCCATGTCCAATTGTTTTGATCCATAGGATCGTTCTACACTCTCTTAACTAATATA 
TATTTTTACTCTATCTGATTATTGTATACAAGGATAAAATTTGATTCTTTCCTTAATGAG 
TGAGAAATATTGGAAGTGTTAAAAAAAAAAAAAAAAAAAAAAA 

>G971 Amino Acid Sequence (conserved domain in aa coordinates: 120-186) 

MLDLNLKIFSSYNEDQDRKVPLMISTTGEEESNSSSSSTTDSAARDAFIAFGILKRDDDL

VPPPPPPPHKETGDLFPVVADARRNIEFSVEDSHWLNLSSLQRNTQKMVKKSRRGPRSRS

SQYRGVTFYRRTGRWESHIWDCGKQVYLGGFDTAYAAARAYDRAAIKFRGLDADINFVVD

DYRHDIDKMKNLNKVEFVQTLRRESASFGRGSSKYKGLALQKCTQFKTHDQIHLFQNRGW

DAAAIKYNELGKGEGAMKFGAHIKGNGHNDLELSLGISSSSESIKLTTGDYYKGINRSTM

GLYGKQSSIFLPMATMKPLKTVAASSGFPFISMTSSSSSMSNCFDP*

>G975 (58..657)

ATTACTCATCATCAAGTTCCTACTTTCTCTCTGACAAACATCACAGAGTAAGTAAGAATG 

GTACAGACGAAGAAGTTCAGAGGTGTCAGGCAACGCCATTGGGGTTCTTGGGTCGCTGAG 

ATTCGTCATCCTCTCTTGAAACGGAGGATTTGGCTAGGGACGTTCGAGACCGCAGAGGAG 

GCAGCAAGAGCATACGACGAGGCCGCCGTTTTAATGAGCGGCCGCAACGCCAAAACCAAC 

TTTCCCCTCAACAACAACAACACCGGAGAAACTTCCGAGGGCAAAACCGATATTTCAGCT 

TCGTCCACAATGTCATCCTCAACATCATCTTCATCGCTCTCTTCCATCCTCAGCGCCAAA 

CTGAGGAAATGCTGCAAGTCTCCTTCCCCATCCCTCACCTGCCTCCGTCTTGACACAGCC 

AGCTCCCATATCGGCGTCTGGCAGAAACGGGCCGGTTCAAAGTCTGACTCCAGCTGGGTC 

ATGACGGTGGAGCTAGGTCCCGCAAGCTCCTCCCAAGAGACTACTAGTAAAGCTTCACAA 

GACGCTATTCTTGCTCCGACCACTGAAGTTGAAATTGGTGGCAGCAGAGAAGAAGTATTG 

GATGAGGAAGAAAAGGTTGCTTTGCAAATGATAGAGGAGCTTCTCAATACAAACTAAATC 

TTATTTGCTTATATATATGTACCTATTTTCATTGCTGATTTACAGCCAAAATAATCAATT 

ATACCGTGTATTTTATAGATGTTTTATATTAAAAGGTTGTTAGATATA 

>G975 Amino Acid Sequence (domain in AA coordinates: 4-71) 

MVQTKKFRGVRQRHWGSWVAEIRHPLLKRRIWLGTFETAEEAARAYDEAAVLMSGRNAKT

NFPLNNNNTGETSEGKTDISASSTMSSSTSSSSLSSILSAKLRKCCKSPSPSLTCLRLDT

ASSHIGVWQKRAGSKSDSSWVMTVELGPASSSQETTSKASQDAILAPTTEVEIGGSREEV

LDEEEKVALQMIEELLNTN* 

>G994 (180..917)

TGTATATATAGTTAGTTAGTTGAGATAAACTTGGTTACCACTTTTGTGTGGTCTTTCTTT 
TTCTTTTTCTCCATTTTCCATTTATCGACCCCTTGGGTGTAGCTAATTACTTTCGCGATT 
TTCAAATCCAATAAAGTTTTAATTTGATGAAGCTTTTTTTAAACCATATAATATAAATAA
TGGGTGGTCGTAAACCATGTTGTGATGAGGTTGGATTAAGAAAGGGTCCATGGACAGTGG 
AAGAAGATGGGAAACTAGTTGATTTCTTAAGGGCACGTGGCAACTGCGGTGGTGGTGGAG 
GAGGATGGTGCTGGAGAGACGTGCCAAAACTGGCGGGGCTAAGGAGGTGTGGCAAAAGTT 
GCCGTCTCCGGTGGACTAATTATCTCCGGCCAGATCTCAAGAGAGGTCTTTTTACTGAAG 
AAGAAATCCAACTAGTCATTGATCTTCATGCTCGCCTTGGCAATAGATGGTCGAAGATTG 
CNGTGGAGTTACCAGGAAGAACAGACAACGATATCAAAAATTATTGGAACACTCATATAA
AGAGGAAGCTTATAAGAATGGGTATTGATCCAAACACACATCGTCGATTTGACCAACAAA 
AAGTCAACGAGGAGGAAACGATATTGGTCAACGATCCAAAGCCTCTGTCTGAGACCGAGG 
TATCTGTTGCTTTGAAGAATGACACGTCAGCAGTGTTATCAGGAAATCTAAACCAATTGG
CTGACGTGGACGGTGATGATCAGCCGTGGAGCTTTCTAATGGAAAATGACGAAGGAGGAG 
GTGGCGACGCCGCCGGAGAGCTTACGATGCTATTGTCCGGTGACATTACGTCATCATGTT 
CTTCTTCGTCATCTTTGTGGATGAAGTATGGAGAATTCGGATACGAAGATTTAGAACTTG 
GATGTTTCGATGTTTAGAGATTCAAGTATGTTTAATTAGGCCGTAGGTTGATTAATCATA 
AGGTTCATTGACTTCATTCTAGAATTGTGTAGTTGGACCAGTATAAAGAATCAAAGTTAT 
GAAACATTGTAATTTGATTTCCAAATTAATCTAATGAATAAATGTGCTTTGCAAAAAAAA 
AAAAAAAAAAAAAAA 

>G994 Amino Acid Sequence (domain in AA coordinates: 14-123) 
MGGRKPCCDEVGLRKGPWTVEEDGKLVDFLRARGNCGGGGGGWCWRDVPKLAGLRRCGKS
CRLRWTNYLRPDLKRGLFTEEEIQLVIDLHARLGNRWSKIAVELPGRTDNDIKNYWNTHI
KRKLIRMGIDPNTHRRFDQQKVNEEETILVNDPKPLSETEVSVALKNDTSAVLSGNLNQL
ADVDGDDQPWSFLMENDEGGGGDAAGELTMLLSGDITSSCSSSSSLWMKYGEFGYEDLEL
GCFDV*

>G2347 (81..626)

^^? TCCTTC ^ CATTGCTTCCTAACCAGA AATCCACCATCATCTTCCCACGAATACA 
^AAAGCTTTACCAGAAAATGGAGGGTCAGAGAACAC^ 

acaaggctacagtctccaaccttgttgaagaagaaatggagaatggcatggmgS^ 

aggaggatggaggagacgaagacaaaaggaagaaggtgatggaaagagS^^ 

gcactgaccgtgttccatcgcgactgtgccaggtcgataggtg^^ 

^^gcagtattaccgcagacacagagtatgtgaagtacatcSg^^tc 



AACTGATCCAGACTCAAGAAAGAAACAGGGTAGA^ 






CATTCAAGCGACCACAGATCAGATAAACCCTCCCGCTCTCTCTCTTCTGTCATC- m '-r, 
TGCTCTATCTACACTCTTATTAGACA^^ 

TCATGGTATTAAATCCTACACGGATATATAACTATAAACCTCTAG^^ 

ss^g^ 

>G2347 Amino Acid Sequence (domain in AA coordinates: [illegible])
MEGQRTQRRGYLKDKATVSNLVEEEMENGMDGEEEDGGDEDKRKKVMERWGPSTDRip^ 

rl^rctvnlteakqyyrrhrvcevhakasaatvagvrqrfcqqcsrS 

>G2010 (1..525) 

AAGGCATCTTCTGTCTTTCTCTCAGGACTTAACCAACGCTTTTGTCAACAATC 
TTTCATGACCTCCAAGAGTTTGATGAAGCTAAGAGAAGrrGCAGGA^^ 

J°i^ TCAATGGTCAGGTGGTGATGCAGAA TCAAGAAAGATCAAGGG?^^^ 
CTTCCTATGCCAAACTCATCATTCAAGCGACCACAGATTAGATAG ' 



>G2010 Amino Acid Sequence (domain in AA coordinates: 53-127)

bCQVDRCT 
3CRRRLAG 



^ G ^i Q f? G ™^ YLVEEDME ™ T DEEEEVGRDRVRGSRGSI^GGSLRLCQVDRCT 



ADM^AKLYHRRHKVCEVHAKASSVFLSGLNQRFCQQCSRFHDLQEPDEAiS 

hnerrrkssgestygegsgrrgingqwmqnqersrvemtlpmpnssfSSir* 